Results 1 - 5 of 5
Infinitary Self Reference in Learning Theory, 1994
Abstract

Cited by 20 (6 self)
Kleene's Second Recursion Theorem provides a means for transforming any program p into a program e(p) which first creates a quiescent self copy and then runs p on that self copy together with any externally given input. e(p), in effect, has complete (low level) self knowledge, and p represents how e(p) uses its self knowledge (and its knowledge of the external world). Infinite regress is not required since e(p) creates its self copy outside itself. One mechanism to achieve this creation is a self replication trick isomorphic to that employed by single-celled organisms. Another is for e(p) to look in a mirror to see which program it is. In 1974 the author published an infinitary generalization of Kleene's theorem which he called the Operator Recursion Theorem. It provides a means for obtaining an (algorithmically) growing collection of programs which, in effect, share a common (also growing) mirror from which they can obtain complete low level models of themselves and the other prog...
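The "mirror" mechanism in this abstract can be made concrete with a toy sketch (my own illustration, not the paper's construction): take a miniature programming system in which a program is a Python expression string over the names `self_copy` and `x`. The evaluator plays the role of the mirror, handing each program a copy of its own source from outside, so no infinite regress arises.

```python
def run(prog_src, x):
    # The harness is the "mirror": it supplies the program a copy of
    # its own source (self_copy), created outside the program itself,
    # together with the external input x.
    return eval(prog_src, {}, {"self_copy": prog_src, "x": x})

# p decides how the program uses its self-knowledge; this toy p simply
# reports the size of its own code alongside the external input.
p = "(len(self_copy), x)"
result = run(p, 7)   # a pair: (length of p's own source, 7)
```

This deliberately trivializes the theorem (the harness hands over the self copy rather than the program constructing it via the quine trick), but it shows the shape of e(p): full low-level self-knowledge plus external input, with the self copy produced outside the program.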
Noisy Inference and Oracles, 1996
Abstract

Cited by 7 (2 self)
A learner noisily infers a function or set if every correct item is presented infinitely often while, in addition, some incorrect data ("noise") is presented only finitely often. It is shown that learning from a noisy informant is equivalent to finite learning with a K-oracle from a usual informant. This result has several variants for learning from text and using different oracles. Furthermore, partial identification of all r.e. sets can also cope with noisy input.
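The noise model in this abstract (correct items infinitely often, noise finitely often) suggests a simple majority-vote learner. The following is a finite toy simulation of my own, not the paper's construction: each round re-presents every correct labelled item, noise is injected only once, so per-item majority vote converges to the true set.

```python
from collections import Counter

def noisy_informant(true_set, noise, universe, rounds):
    # Present (x, x in true_set) for every x in each round, so each
    # correct item recurs round after round; the wrong labels ("noise")
    # are injected only in the first round, i.e. finitely often.
    stream = []
    for t in range(rounds):
        if t == 0:
            stream.extend(noise)                  # finitely many incorrect items
        for x in universe:
            stream.append((x, x in true_set))     # correct data, every round
    return stream

def learn_in_limit(stream, universe):
    # Majority vote per item: the correct label eventually outnumbers
    # the finitely many noisy ones, so the hypothesis stabilizes on
    # the true set.
    votes = {x: Counter() for x in universe}
    for x, label in stream:
        votes[x][label] += 1
    return {x for x in universe if votes[x][True] > votes[x][False]}

stream = noisy_informant({1, 3}, [(1, False), (2, True)], range(5), rounds=5)
hypothesis = learn_in_limit(stream, range(5))    # recovers {1, 3} despite the noise
```

This only illustrates why noisy data still uniquely determines the true data; the paper's actual results concern which learning criteria (finite learning, oracle access) this noise tolerance trades against.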
Vacillatory and BC Learning on Noisy Data, 2007
Abstract

Cited by 7 (5 self)
The present work employs a model of noise introduced earlier by the third author. In this model noisy data nonetheless uniquely determines the true data: correct information occurs infinitely often while incorrect information occurs only finitely often. The present paper considers the effects of this form of noise on vacillatory and behaviorally correct learning of grammars, both from positive data alone and from informant (positive and negative data). For learning from informant, the noise, in effect, destroys negative data. Various noisy-data hierarchies are exhibited, which, in some cases, are known to collapse when there is no noise. Noisy behaviorally correct learning is shown to obey a very strong "subset principle". It is shown, in many cases, how much power is needed to overcome the effects of noise. For example, the best we can do to simulate, in the presence of noise, the noise-free, no-mind-change cases takes infinitely many mind changes. One technical result is proved by a priority argument.
NAIST-IS-DT0161006 Doctor's Thesis: Statistical Learning from Multiple Information Sources, 2004
Abstract
In intelligent information processing tasks such as pattern recognition and information retrieval (IR), probabilistic models are now widely used because they can represent ambiguities of observed data and are robust against noise. Parameters of probabilistic models are statistically estimated (learned) from given training data. However, when the training data contain an insufficient amount of information, the learned model becomes unreliable and its performance severely deteriorates. This thesis proposes two novel learning algorithms that use multiple information sources to mitigate this information scarcity problem in the following two applications. The first application is solving classification problems in which optimal class labels are automatically assigned to observations whose class labels are unknown. Among various types of classification problems, this thesis considers classification of sequences that consist of sequential observation points. As a classifier, we focus on the hidden Markov model (HMM), which has been widely used for the classification of sequences. Generally, an HMM is trained on labeled data that consist of observed feature values and class labels. However, due to the high labeling cost, the amount of labeled training data is often small. In this thesis, we propose a learning
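The HMM-based sequence classification the abstract refers to is typically a class-conditional likelihood comparison: train one HMM per class, then assign a new sequence to the class whose model gives it the highest likelihood. A minimal sketch of that decision rule, using the standard forward algorithm in log space (the toy model parameters are invented for illustration and are not from the thesis):

```python
import math

def _logsumexp(xs):
    # Numerically stable log(sum(exp(x))) for combining forward scores.
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def hmm_log_likelihood(obs, start, trans, emit):
    # Forward algorithm in log space: log P(obs | HMM), the quantity an
    # HMM classifier compares across class-conditional models.
    states = range(len(start))
    alpha = [math.log(start[s]) + math.log(emit[s][obs[0]]) for s in states]
    for o in obs[1:]:
        alpha = [
            math.log(emit[s][o])
            + _logsumexp([alpha[r] + math.log(trans[r][s]) for r in states])
            for s in states
        ]
    return _logsumexp(alpha)

def classify(obs, models):
    # Pick the class whose HMM assigns obs the highest likelihood
    # (implicitly assuming equal class priors).
    return max(models, key=lambda c: hmm_log_likelihood(obs, *models[c]))

# Two invented 2-state models over observations {0, 1}: class "A"
# mostly emits 0, class "B" mostly emits 1.
A = ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.9, 0.1], [0.8, 0.2]])
B = ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.1, 0.9], [0.2, 0.8]])
label = classify([0, 0, 0], {"A": A, "B": B})
```

The thesis's contribution (learning from multiple information sources when labeled data is scarce) sits on top of this basic machinery; the sketch only shows the classification step, not the proposed training algorithms.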
Learning from Streams
Abstract
Learning from streams is a process in which a group of learners separately obtain information about the target to be learned, but they can communicate with each other in order to learn the target. We are interested in machine models for learning from streams and study their learning power (as measured by the collection of learnable classes). We study how the power of learning from streams depends on the two parameters m and n, where n is the number of learners, each of which tracks a single stream of input, and m is the number of learners (among the n learners) which have to find, in the limit, the right description of the target. We study for which combinations m, n and m′, n′ the following inclusion holds: every class learnable from streams with parameters m, n is also learnable from streams with parameters m′, n′. For the learning of uniformly recursive classes, we get a full characterization which depends only on the ratio m/n; but for general classes the picture is more complicated. Most of the non-inclusions in team learning carry over to non-inclusions with the same parameters in the case of learning from streams; but only few inclusions are preserved and some additional non-inclusions hold. Besides this, we also relate learning from streams to various other closely related and well-studied forms of learning: iterative learning from text, learning from incomplete text and learning from noisy text.
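The m-out-of-n success criterion in this abstract can be made concrete with a deliberately trivial toy of my own (far from the recursion-theoretic model, and with a naive all-pooling stand-in for the paper's communication): a finite target set is split across n streams, one per learner, and the team succeeds if at least m of the n learners end up with a correct description.

```python
def learn_from_streams(streams, m, communicate=True):
    # Toy model: the target is the finite set whose enumeration is
    # split across the n streams, one stream per learner.
    n = len(streams)
    target = set().union(*streams)
    if communicate:
        # Learners pool their data, so every learner can reach the target.
        hypotheses = [set().union(*streams)] * n
    else:
        # Each learner only ever sees its own stream.
        hypotheses = [set(s) for s in streams]
    successes = sum(1 for h in hypotheses if h == target)
    return successes >= m        # team succeeds if at least m are right

# With communication the [2, 3]-team succeeds; without it, no learner
# sees the whole target, so the same team fails.
ok = learn_from_streams([{1, 2}, {3}, {4, 5}], m=2, communicate=True)
```

This only illustrates why communication matters and what the parameters m and n count; the paper's actual results concern which (m, n) versus (m′, n′) inclusions hold for infinite classes in the limit.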