### Table 2: Architectural description of different recurrent network architectures used for the tree automata problem. ... control the span of long-term dependencies, in which the output will depend on input values far in the past. For this experiment all inputs were encoded into one input neuron, with the two alphabet symbols encoded as 0 and 1 respectively. For each simulation, we randomly generated a training set and an independent testing set, each consisting of 500 strings of length T such that there were an equal number of positive and negative strings. We varied T from 10 to 30. For the accepted strings a target value of 0.8 was chosen; for the rejected strings, −0.8. All other experimental parameters were the same as in the previous experiment. Because memory order LR(1) networks were experimentally unable to learn sequences of length greater than 10, different LS networks were used. Table 2 shows all the architectures used in this experiment. The network was trained using a simple BPTT algorithm with a learning rate of 0.01 for a maximum of 200 epochs. If the simulation exceeded 200 epochs and did not correctly classify all ...

### Table 1: Toy data set results. ...timated). Transitions that are not required to parse any string in D+ are eliminated. Fitness criteria 1-3 are weighted (reflecting their relative importance to fitness evaluation). These weights can be manipulated to directly affect the structural and functional properties of automata.

1998

"... In PAGE 2: ... Different combinations of weights for the three fitness criteria were tested. Table 1 gives results for the best weight combination averaged over 10 runs differing only in random population initialisation (and results averaged over all other weight combinations in brackets). The first column indicates how many successful runs there were out of 10 for the best weight combination (and the average for all weight combinations in brackets).... ..."

Cited by 3
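The weighted combination of the three fitness criteria described in this entry can be sketched as follows; the function name, scores, and weight values are hypothetical, since the snippet gives no concrete numbers:

```python
def fitness(scores, weights):
    """Combine the three criterion scores into a single fitness value.

    The weights express the relative importance of each criterion;
    changing them steers the structural and functional properties
    of the automata the search favors.
    """
    assert len(scores) == len(weights) == 3
    return sum(w * s for w, s in zip(weights, scores))

# Hypothetical example: criterion 1 weighted twice as heavily.
print(fitness([0.5, 0.25, 0.25], [2.0, 1.0, 1.0]))  # 1.5
```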

### Table 1. Experiments

2000

"... In PAGE 13: ... The length of the channels can of course be changed. However, we have not figured out how to model and analyze the case where both the channels and the sequence numbers are unbounded. In Table 1, we show for each algorithm the domains of the variables that are infinite, the number of steps required to generate the set of reachable configurations, the size of the transducer, the maximum number of states among automata generated during analysis, and the maximum number of BDD nodes among automata generated during analysis. Note that all automata are deterministic.... ..."

Cited by 46

### Table 1 summarizes the complexity of determining the existence of stationary and history-dependent policies for UMDPs, MDPs and POMDPs. In the average-rewards case, the existence of history-dependent and stationary policies for MDPs coincides. The undecidability of UMDP and POMDP policy existence with history-dependent policies of unrestricted size was shown by Madani et al. [1999]. The result is based on the emptiness problem of probabilistic finite automata [Paz, 1971; Condon and Lipton, 1989] that is closely related to the unobservable plan existence problem. The results do not completely determine the complexity of the UMDP stationary policy existence problem, but as the stationary UMDP policies repeatedly apply one single operator, the problem does not seem to have the power of EXP. It is also not trivial to show membership in PSPACE.

2001

"... In PAGE 4: ...

| | stationary | history-dependent |
|---|---|---|
| UMDP | PSPACE-hard, in EXP (L8,9) | undecidable |
| MDP | EXP (T11) | EXP (C12) |
| POMDP | NEXP (T13) | undecidable |

Table 1: Complexity of policy existence, with references to the lemmata, theorems, and corollaries. B.... In PAGE 4: ... It is also not trivial to show membership in PSPACE. The rest of the paper formally states the results summarized in Table 1 and gives their proof outlines. Lemma 8 Existence of a policy with average reward ≥ c for UMDPs, MDPs and POMDPs with only one action is PSPACE-hard.... ..."

Cited by 2

### Table 2: Architectural description of different recurrent network architectures used for the tree automata problem. We used the hyperbolic tangent function in each neuron. ...prespecified time t. For instance, Figure 6 shows a five-state automaton, in which the class of each string is determined by the third input symbol. When that symbol is "1", the string is accepted; otherwise, it is rejected. By increasing the length of the strings to be learned, we will be able to control the span of long-term dependencies, in which the output will depend on input values far in the past. For this experiment all inputs were encoded into one input neuron, with the two alphabet symbols encoded as 0 and 1 respectively. For each simulation, we randomly generated a training set and an independent testing set, each consisting of 500 strings of length T such that there were an equal number of positive and negative strings. We varied T from 10 to 30. For the accepted strings a target value of 0.8 was chosen; for the rejected strings, −0.8. All other experimental parameters were the same as in the previous experiment. Because memory order LR(1) networks were experimentally unable to learn sequences of length greater than 10, different LS networks were used. Table 2 shows all the architectures used in this ...
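The string-generation procedure in this entry can be sketched as below. The function name and the balancing trick are my own; the third-symbol acceptance rule and the ±0.8 targets follow the experiment as described:

```python
import random

def make_dataset(n_strings, length, seed=0):
    """Generate balanced binary strings whose third symbol decides the class.

    Accepted strings (third symbol 1) get target +0.8, rejected ones -0.8,
    matching the targets used in the experiment.
    """
    rng = random.Random(seed)
    data = []
    for i in range(n_strings):
        s = [rng.randint(0, 1) for _ in range(length)]
        s[2] = i % 2  # force an equal number of positive and negative strings
        target = 0.8 if s[2] == 1 else -0.8
        data.append((s, target))
    rng.shuffle(data)
    return data

train = make_dataset(500, 10)  # 500 strings of length T = 10
```

Varying the `length` argument from 10 to 30 reproduces the span of long-term dependencies the experiment controls.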

### Table 1: Actions used in current EA model

1995

"... In PAGE 2: ... For example the rule, B A C → DESTRUCT specifies that if automaton B has A to its right, C to its left, and no others ( denotes an empty cell), then it should cease to exist at the next time-step. The actions used in the current EA model are listed in Table 1. Automata may move (both translation and rotation are included in the same action for convenience), divide into two copies (again movement is included for convenience), self-destruct, or remain inactive.... ..."

Cited by 16
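The context-sensitive rule quoted in this entry can be illustrated with a small sketch. This is not the paper's implementation: the (left, self, right) ordering of the rule tuple and the `None` empty-cell marker are assumptions, and only the DESTRUCT action is modeled:

```python
EMPTY = None  # stands in for the empty-cell symbol in the quoted rule

def step(tape, rules):
    """Apply rules of the form (left, self, right) -> action to a 1-D tape.

    Cells with no matching rule remain inactive (unchanged).
    """
    nxt = list(tape)
    for i, cell in enumerate(tape):
        left = tape[i - 1] if i > 0 else EMPTY
        right = tape[i + 1] if i < len(tape) - 1 else EMPTY
        if rules.get((left, cell, right)) == "DESTRUCT":
            nxt[i] = EMPTY  # the automaton ceases to exist at the next time-step
    return nxt

# Automaton B with C to its left and A to its right self-destructs.
rules = {("C", "B", "A"): "DESTRUCT"}
print(step(["C", "B", "A"], rules))  # ['C', None, 'A']
```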

### Table 1: The sizes of the minimal automata and the minimal Fischer automata for all essential (3, 2)-CAs.

1997

"... In PAGE 12: ... different equivalence relation; e.g., there are 84 conjugacy classes of binary CAs of width 3. Table 1 lists the sizes of the minimal automata and the minimal Fischer automata for all 72 essential binary CAs of width 3. As one can see from the table, there are 4 essential (3, 2)-CAs with maximum blow-up, corresponding to a total of 16 (3, 2)-CAs.... ..."

Cited by 11

### Table 2: Automata sizes for the three parts of speech in the Many-Automata case, with alpha = 0.0002

"... In PAGE 6: ...0002 for all parts of speech, which, for most parts of speech, produces better automata than bigrams. Table 2 lists the sizes of the automata. The differences between MDI-based and bigram-based automata are not as dramatic as in the One-Automaton case (Table 1), but the former again have consistently lower NumEdges and NumStates values, for all parts of speech, even where... ..."

### Table 3: Number of rules in the grammars built.

"... In PAGE 7: ... Since each automaton is transformed into a PCFG, the number of rules in the resulting grammar is proportional to the number of arcs in the automaton, and the number of nonterminals is proportional to the number of states. From Table 3 we see that MDI compresses information better: the sizes of the grammars produced by the MDI-based automata are an order of magnitude smaller than those produced using bigram-based automata. Moreover, the One-Automaton versions substantially reduce the size of the resulting grammars; this is obviously due to the fact that all POS share the same underlying automaton so that information does not need to be duplicated across parts... ..."
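The proportionality between automaton size and grammar size noted in this entry can be seen in a minimal sketch of an automaton-to-grammar conversion. The right-linear encoding and all names below are assumptions for illustration, not the paper's actual PCFG construction:

```python
def automaton_to_rules(arcs, finals):
    """Encode a finite automaton as right-linear grammar rules.

    One nonterminal per state and one rule per arc (plus one epsilon rule
    per final state), so grammar size tracks automaton size.
    """
    rules = [(f"Q{src}", (sym, f"Q{dst}")) for src, sym, dst in arcs]
    rules += [(f"Q{q}", ()) for q in finals]  # Q_q -> epsilon
    return rules

# Toy POS automaton with two arcs and one final state: three rules total.
print(automaton_to_rules([(0, "DT", 1), (1, "NN", 2)], finals=[2]))
```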