8 citations found.
B. Hammer, "On the learnability of recursive data," Math. Contr., Signals Syst., to be published.

This paper is cited in the following contexts:
Some Contributions to Fixed-Distribution Learning Theory - Vidyasagar, Kulkarni (2000)

....an algorithm. Because, it follows from the triangle inequality that either or In any case, it follows that If the function class F however, is recursively enumerable, then the above procedure would indeed satisfy most persons as being a true algorithm. This part of the proof is taken from [16] and was independently suggested by one of the reviewers of the original version of the paper. This result is true for every algorithm. Because the argument can be repeated for every , it follows that for every algorithm we have Now, restore the subscript on , and label the set as . Because ....

B. Hammer, "On the learnability of recursive data," Math. Contr., Signals Syst., to be published.


Neural Networks With Small Weights Implement Finite Memory.. - Hammer, Tino (2002)   Self-citation (Hammer)

....from a finite training set which size depends only on the given function class [1, 39]. Hence prior distribution independent bounds on the generalization ability of general recurrent networks are not possible. A first step towards posterior or distribution dependent bounds can be found in [13, 14], however, these guarantees are weaker than guarantees obtained via a finite VC dimension. Therefore, alternatives to recurrent networks or hidden Markov models have been investigated for which efficient training algorithm can be found. One possibility constitute networks with time window for ....

....Hence general recurrent networks do not yield valid generalization in the above sense contrarious to fixed length FMM. One can prove weaker results for recurrent networks, which yield bounds on the size of a training set such that valid generalization holds with high probability as derived in [13, 14], for example. However, these bounds are no longer independent of the underlying (unknown) distribution of the inputs. Training of general RNNs may need an exhaustive number of patterns for valid generalization and certain underlying input distributions. One particularly bad situation is ....

[Article contains additional citation context not shown here]

B. Hammer. On the learnability of recursive data. Mathematics of Control, Signals, and Systems, 12:62-79, 1999.


Limitations of Hybrid Systems - Hammer (2000)   Self-citation (Hammer)

.... 1 E (0; restricted to inputs of height at most T is limited by O(N^3 T ln(qd)) if the activation function in dec is piecewise polynomial with at most q pieces and degree at most d ≥ 2. The VC dimension is limited by O(N^4) if the activation function is the standard sigmoidal function [3, 6]. In both cases N denotes the number of neurons in dec. The lower bound 2 T 1 for the VC dimension leads to the bound N = 2 for the neurons in dec. 2 Note that it is not important how the trees are encoded. Furthermore, a more sophisticated decoding of the single binary nodes or using other ....

Hammer, B. (1999) On the learnability of recursive data. MCSS 12:62-79.


Closure Properties of Uniform Convergence of Empirical.. - Vidyasagar, Balaji..   Self-citation (Hammer)

....5 The pair (F; P) is said to have the shrinking width property if w(m, ε; P) → 0 as m → ∞ for each ε > 0. It is known (see e.g. [8], Example 6.7) that consistent PAC learnability is a stronger requirement than just PAC learnability. However, for PUAC learnability, the situation is different. In [2], Lemma 4, the following extension of [8], Theorem 8.2 is proved: Theorem 3 Given F and P, the following statements are equivalent: 1. The pair (F; P) has the shrinking width property. 2. The pair (F; P) is consistently PUAC learnable. 3. The pair (F; P) is PUAC learnable. 8 The main ....

....the following extension of [8], Theorem 8.2 is proved: Theorem 3 Given F and P, the following statements are equivalent: 1. The pair (F; P) has the shrinking width property. 2. The pair (F; P) is consistently PUAC learnable. 3. The pair (F; P) is PUAC learnable. 8 The main contribution of [2] is thus in showing that if a pair (F; P) is PUAC learnable, then in fact every consistent algorithm is PUAC. Note that actually the result in [2] is for a single fixed probability. However, the same proof carries through for a family of probabilities, as can be easily verified. Now, using ....

[Article contains additional citation context not shown here]

B. Hammer, On the learnability of recursive data, Math. Control, Signals, and Systems, 12, 1999, 62-79.


Generalization Ability of Folding Networks - Hammer   Self-citation (Hammer)

....function class which is used for learning and not on the learning algorithm itself. Unfortunately the situation turns out to be more difficult in the recursive case than for standard feed forward networks. There exists some work which estimates the VC dimension of recurrent and folding networks [14, 19], the combinatorial quantity finiteness of which characterizes distribution independent learnability. For arbitrary inputs this dimension is infinite due to the unlimited input length. i.e. the ability of dealing with inputs of arbitrary size even leads to the ability of storing arbitrary ....

....bounds on the VC or pseudodimension d_t of F|X_t can be obtained by first substituting each input tree by an equivalent input tree with a maximum number of nodes, unfolding the network for these inputs, and applying the bounds from the feed forward case to these unfolded networks. For details see [14, 15]. This leads to the following bounds, some of which can be found in [8, 14, 19]: O(W ln(th)) if σ is linear, O(W th ln d) if σ is a polynomial of degree d ≥ 2, O(WN W ln(W t)) if σ = H, k = 1; O(WNk W t ln k W ln W) if σ = H, k ≥ 2; if σ = ....

[Article contains additional citation context not shown here]

B. Hammer. On the learnability of recursive data. Mathematics of Control, Signals, and Systems, 12, 1999.


On the Approximation Capability of Recurrent Neural Networks - Hammer (1998)   Self-citation (Hammer)

....in the domain of structured, symbolic, or hybrid data [4]. Here we deal with the third setting and consider the capability of recurrent networks of approximating an unknown function in principle. This capability together with the learnability of recurrent networks established for example in [6] and the existence of learning algorithms [21, 8] justify the use of recurrent networks in practical applications. Now we proceed as follows: After defining recurrent networks formally we proof their capability of approximating any measurable mapping in probability. If only a finite set of data is ....

....Of course, the function h could be included in the recursive part. But this notation has the advantage that the number of neurons needed for the recursive computation can be made explicit. This improves bounds on the number of examples necessary to train a network correctly with high reliability [6]. One example of a recurrent network and the computation performed on a special input is depicted in Figure 1. In contrast to feedforward networks, a recurrent network can deal with input sequences of arbitrary length. One can think of the recursive part as a mechanism of encoding the input ....

B. Hammer. On the learnability of recursive data, Mathematics of Control, Signals, and Systems 12 (1999) 62--79.



CiteSeer.IST - Copyright Penn State and NEC