D. Osherson, M. Stob, and S. Weinstein. Systems that Learn: An Introduction to Learning Theory for Cognitive and Computer Scientists. MIT Press, Cambridge, Mass., 1986.
.... and accuracy results in the model of learning in the limit introduced by Gold [11]. During the last three decades much has been learned about the classes of formal languages and partial recursive functions that can be successfully learned within Gold's [11] model and variations thereof (cf., e.g., [2, 5, 7, 8, 18, 24, 25, 27, 31]). We continue along these lines of research. In particular, we aim to investigate the principal learning capabilities of learners which perform incremental learning. For the purpose of motivation and discussion of our research, we introduce some notations. A positive presentation of a concept c ....
....to produce its actual guess exclusively from its previous one and the next element in the positive presentation. Iterative learning has been introduced by Wiehagen [26] who studied it in the setting of learning recursive functions. Further results concerning this learning model can be found in [7, 8, 13, 14, 18, 19, 25, 26, 29, 31]. Osherson et al. [18] also considered the variant that the learner has access to the last k elements, where k is a priori fixed. Interestingly enough, the latter approach does not increase the learning power. Alternatively, Fulk et al. [7] considered learners that are allowed to store k carefully ....
D. Osherson, M. Stob and S. Weinstein, "Systems that Learn, An Introduction to Learning Theory for Cognitive and Computer Scientists," MIT Press, Cambridge, Mass., 1986.
....a sample $S$ if $S \subseteq L(\pi)$. A (one-variable) pattern $\pi$ is called descriptive for $S$ if it is consistent with $S$ and there is no other consistent (one-variable) pattern $\tau$ such that $L(\tau) \subset L(\pi)$. Next, we define the relevant learning models. We start with inductive inference of languages from text (cf., e.g., [15, 22]). Let $L$ be a language; every infinite sequence $t = (s_j)_{j \in \mathbb{N}}$ of strings with $range(t) = \{s_j \mid j \in \mathbb{N}\} = L$ is said to be a text for $L$. $Text(L)$ denotes the set of all texts for $L$. Let $t$ be a text, and $r \in \mathbb{N}$. We set $t_r = s_0, \ldots, s_r$, and call $t_r$ the initial segment of $t$ of length $r + 1$. ....
D. Osherson, M. Stob and S. Weinstein. Systems that learn: An introduction to learning theory for cognitive and computer scientists. MIT Press, 1986
....of learning functions. To be more precise: in [9] it was shown that any unbounded monotone increasing update boundary is not by itself restrictive. It is interesting to note that a conservative (and prudent) learner that is consistent on its class is text efficient (Proposition 8.2.2A in [17]). 2 Classical Categorial Grammar and Structure Languages The classes defined in [5, 7] are based on a formalism for ($\epsilon$-free) context-free languages called classical categorial grammar (CCG). In this section the relevant concepts of CCG will be defined. I will adopt notation from [12]. In ....
D. N. Osherson, M. Stob, and S. Weinstein. Systems that Learn: An Introduction to Learning Theory for Cognitive and Computer Scientists. MIT Press, Cambridge, MA., 1986.
.... R) C 0 is recursively enumerable]}. For references on inductive inference within NUM, the set of all recursively enumerable classes and their subclasses, the reader is referred to [15, 3, 12]. For references surveying the general theory of learning recursive functions, we refer the reader to [1, 5, 10, 11, 24, 32, 19]. 2.2. Learning Refutably In this subsection we introduce learning with refutation. The idea is that the learning machine should refute functions which it does not identify. We consider three versions of refutation based on how the machine ....
D. Osherson, M. Stob, and S. Weinstein. Systems that Learn: An Introduction to Learning Theory for Cognitive and Computer Scientists. MIT Press, 1986.
....of some (computable) learning function $\varphi$ be defined as the number of computing steps it takes to learn a language, with respect to $|\sigma|$, the size of the input sequence. In (Pitt, 1989) it was first noted that requiring the function to run in a time polynomial with respect to $|\sigma|$ does not constitute a significant constraint, since one can always define a learning function $\varphi'$ that combines $\varphi$ with a clock so that its amount of computing time is bounded by a .... (Footnote 2: In fact a whole section devoted to this subject in (Osherson et al. 1986) has been completely omitted from the second edition (Jain et al. 1999).)
....for efficient learning. The consistency and conservatism requirements ensure that the update procedure really takes all input into account. It is interesting to note that a conservative (and prudent) learner that is consistent on its class is text efficient (see Proposition 8.2.2A, page 172 of (Osherson et al. 1986)). Therefore, the conservative learning functions k-valued, least-valued and least-card defined in (Kanazawa, 1998) that are consistent on their class are all text efficient. This definition was applied in (Arimura et al. 1992) to analyze the complexity of learning a subclass of context-free ....
D. N. Osherson, M. Stob, and S. Weinstein. 1986. Systems that Learn: An Introduction to Learning Theory for Cognitive and Computer Scientists. MIT Press, Cambridge, MA.
....(L, L') iff for any text T for L and any text T' for L', M Resep identifies (T, T'). (c) M Resep identifies L iff M Resep identifies all pairs (L, L') where L and L' are disjoint sets in L. (d) Resep = L L # Disjoint # (#M) M Resep identifies L] Definition 6 [5, 7, 12, 17, 19, 25] (a) M is Popperian iff for all #, # # such that content(#) # content(# # ) # and # = # # , M(#,# # ) is total. (b) M is consistent iff for all #, # # such that content(#) # content(# # ) # and # = # # , content(#) # # 1 M(#,# # ) 0) and content(# # ) # # 1 ....
....such that M i does not Resep identify (L i,f i , L # i,f i ) Let L C = L i,f i i # N # L # i,f i i # N . It follows that L C ## Reliablesep. # of Part (a) follows. Proposition 16 permits now to transfer the following noninclusions from the theory of learning functions [1, 4, 8, 17, 19, 21, 22, 24] to the theory of learning separations. Corollary 17 Recsep $\not\subseteq$ Reliablesep. Reliablesep $\not\subseteq$ Confidentsep. Confidentsep $\not\subseteq$ Reliablesep. Confidentsep $\not\subseteq$ Finitesep. Osherson, Stob and Weinstein [19, Exercise 4.4.2C] noted that a class which consists only of infinite sets has already a ....
D. Osherson, M. Stob, and S. Weinstein. Systems that Learn: An Introduction to Learning Theory for Cognitive and Computer Scientists. MIT Press, 1986.
....At this moment, we do not see any choice to this end that seems to be superior to the others. Indeed, each approach appears justified if it yields interesting results. For references surveying the theory of learning recursive functions, the reader is referred to [AS83, BB75, CS83, Fre91, KW80, OSW86, JORS99]. 2 Notation and Preliminaries Recursion theoretic concepts not explained below are treated in [Rog67]. N denotes the set of natural numbers. $*$ denotes a non-member of N and is assumed to satisfy $(\forall n)[n < * < \infty]$. Let $\in, \subseteq, \subset, \supseteq, \supset$, respectively denote the membership, subset, proper ....
....C]}. (c.1) M RConf identifies C iff M is total, and M Conf identifies C. (c.2) RConf = $\{C \mid (\exists M)[M$ RConf identifies $C]\}$. (d.1) [Ful88] M $\mathcal{T}$Conf identifies C iff M is conforming on each $f \in \mathcal{T}$, and M Conf identifies C. (d.2) $\mathcal{T}$Conf = $\{C \mid (\exists M)[M$ $\mathcal{T}$Conf identifies $C]\}$. Definition 6 [OSW86] M is confident iff for all total f, $M(f)\downarrow$. M Confident identifies C iff M is confident and M Ex identifies C. Confident = $\{C \mid (\exists M)[M$ Confident identifies $C]\}$. Definition 7 [Min76, BB75] M is reliable iff for all total f, $M(f)\downarrow \Rightarrow$ M Ex identifies f. M Reliable identifies C iff M is ....
D. Osherson, M. Stob, and S. Weinstein. Systems that Learn: An Introduction to Learning Theory for Cognitive and Computer Scientists. MIT Press, 1986.
....in the philosophy of science. However, within the last three decades it received much attention from computer scientists. Nowadays inductive inference can be considered as a form of machine learning with potential applications to artificial intelligence (cf., e.g., Angluin and Smith (1983, 1987), Osherson, Stob and Weinstein (1986)). The present paper deals with inductive inference of formal languages, a field in which many interesting and sometimes surprising results have been obtained (cf., e.g., Case and Lynes (1982), Case (1988), Fulk (1990)). Looking at potential applications it seemed reasonable to restrict ourselves ....
....(M) analogously. Finally, let LIM-TXT (LIM-INF) denote the collection of all indexed families $\mathcal{L}$ of recursive languages for which there is an IIM M such that $\mathcal{L} \subseteq$ LIM-TXT(M) ($\mathcal{L} \subseteq$ LIM-INF(M)). Definition 1 could be easily generalized to arbitrary families of recursively enumerable languages (cf. Osherson et al. (1986)). Nevertheless, we exclusively consider the restricted case defined above, since our motivating examples are all indexed families of recursive languages. Moreover, it may be well conceivable that the weakening of $\mathcal{L} = \{L(G_j) \mid j \in \mathbb{N}\}$ to $\mathcal{L} \subseteq \{L(G_j) \mid j \in \mathbb{N}\}$ may increase the ....
Osherson, D., Stob, M., and Weinstein, S. (1986), "Systems that Learn, An Introduction to Learning Theory for Cognitive and Computer Scientists," MIT-Press, Cambridge, Massachusetts.
....has to deal with the limitations of space available. Given this, a systematic study of refinements of concept learning in the limit that are considerably restricting the accessibility of the input data stream seems to be important. Our study extends previous work done by several authors (cf. [21, 9, 18, 4]) who studied different versions of incremental learning from noise free data. In contrast, we have studied incremental learning of indexable concept classes from noisy data. Our study is based on the model of noisy data developed by Stephan [20] that seems to become standard in this area of ....
D.N. Osherson, M. Stob, and S. Weinstein, "Systems that Learn, An Introduction to Learning Theory for Cognitive and Computer Scientists," MIT-Press, Cambridge, Mass., 1986.
....as a process of gathering an information about an unknown object, processing this information and obtaining description of the unknown object. Ideally, we would like to obtain a complete description of the object. There are several things to be specified if we want to make this model precise [3, 22]. These are:
• What is the class of objects that we consider?
• What data are available? How are these data represented to a learning algorithm?
• What is the form of description that the learning algorithm outputs (Boolean formula, program, etc.)?
• When does the learning algorithm ....
....identifiable by probabilistic algorithms with different probabilities of correct answer. 2 Definitions Next, we introduce the formal notation and definitions used in this paper. For more background information, see [25] for recursive function (computability) theory, [26, 18] for set theory and [3, 22] for inductive inference. A learning machine is an algorithmic device that reads values of a function $f$: $f(0), f(1), \ldots$ Having seen finitely many values of the function it can output a conjecture. A conjecture is a program in some fixed acceptable programming system [19, 25]. Only one ....
D. Osherson, M. Stob, and S. Weinstein. Systems that Learn: An Introduction to Learning Theory for Cognitive and Computer Scientists. MIT Press, 1986.
.... computer program [77, 50] and Gold's model of language learning from text (positive information) by machine has been very influential in contemporary theories of natural language and in mathematical work explicitly motivated by its possible connection to human language learning (see, for example, [76, 93, 94, 66, 68, 8, 44, 15, 69, 70, 38, 39, 53, 5]). In the present paper we consider some new criteria of success extending Gold's basic model above. Suppose that we fix an integer $n > 0$. Consider the following criterion of success (again based on (1.1) above). We say that M TxtFex n identifies L $\stackrel{\mathrm{def}}{\Leftrightarrow}$ M, on any text for L, outputs ....
....the restriction to recursive texts. Theorem 5.5 is the hardest theorem herein to prove, and the other theorems in section 5 are proved by modifications and/or simplifications of the proof of Theorem 5.5. Some of the theorems in section 5 generalize predecessors for TxtFex 0 1 identification [3, 93, 90, 38, 70, 39] but are much harder to prove. Some of the theorems in section 5 are applied in the present paper and in other papers. In section 7 we discuss briefly computable universe hypotheses, present some critical discussion as promised above, and sketch some areas for future investigation. 2. ....
D. Osherson, M. Stob, and S. Weinstein, Systems That Learn: An Introduction to Learning Theory for Cognitive and Computer Scientists, MIT Press, Cambridge, MA, 1986.
....originally in the context of inductive inference. Research on inductive inference is concerned with formalizing and analyzing the process of gradually learning concepts from successively larger sets of examples. Frequently, the concepts are taken to be formal languages (cf., e.g., Osherson et al. [24], Zeugmann and Lange [39]). In this case, a distinction is made between learning from informant and learning from text. If L is the language to be identified, every sequence $i = (s_0, b_0), (s_1, b_1), (s_2, b_2), \ldots$ with $b_j \in \{+, -\}$ for all $j \in N$ satisfying $\{s_j \mid j \in N\}$ = A , ....
D. Osherson, M. Stob and S. Weinstein, Systems that learn: An introduction to learning theory for cognitive and computer scientists (MIT Press, Cambridge, MA, 1986).
....functions, i.e. of all those functions that compute a program for themselves on input 0. Clearly, SD is EX learnable. Since Gold's [11] pioneering paper a huge variety of learning criteria have been proposed within the framework of inductive inference of recursive functions (cf., e.g., [3, 6, 8, 9, 15, 19, 21]). By comparing these inference criteria to one another, it became popular to show separation results by using function classes with self-referential properties. On the one hand, the proof techniques developed are mathematically quite elegant. On the other hand, these separating examples may be ....
S. Jain, D. Osherson, J.S. Royer and A. Sharma. Systems That Learn: An Introduction to Learning Theory. MIT-Press, Boston, MA., 1999.
....The following theorem clarifies the relation between Gold's [18] classical learning in the limit and TxtFex inference. The assertion remains true even if the learner is only allowed to vacillate between up to 2 descriptions, i.e. in the case $|D| \le 2$ (cf. Case [9, 11]). Theorem 1 (Osherson et al. [31]; Case [9, 11]) TxtEx a $\subset$ TxtFex a, for all a $\in \mathbb{N} \cup \{*\}$. Looking at the above definitions, we see that an IIM M has always access to the whole history of the learning process, i.e. in order to compute its actual guess M is fed all examples seen so far. In contrast to that, next we define ....
....that c = a h j provided that A n k truthfully answers the questions computed by Q k (i.e. the j th component of A n k (Q k (M n (T ) x n 1 ) is 1 iff the j th component of Q k (M n (T ) x n 1 ) appears in T n . (Footnote 4: Our definition is a variant of one found in Osherson, Stob and Weinstein [31] and Fulk et al. [17] which will be discussed later.) Finally, M TxtFb k Ex a H infers C iff there is computable mapping Q k as described above such that, for each c $\in$ C, M TxtFb k Ex a H identifies c. The resulting learning types TxtFb k Ex a H and TxtFb k Ex a are defined ....
D.N. Osherson, M. Stob, and S. Weinstein, "Systems that Learn, An introduction to Learning Theory for Cognitive and Computer Scientists," MIT Press, Cambridge, MA, 1986.
....universe, all data sequences, even noisy ones, are computable. One of the motivations for considering possibly non-computable data sequences is that, in the case of child language learning, the utterances the child hears (as its data) may, in part, be determined by uncomputable random processes [OSW86] perhaps external to the utterance generators (e.g., the parents). The limit-recursive functions are in between the computable and the arbitrarily uncomputable. Here is the idea. Informally, they are (by definition) the functions computed by limit-programs, programs which do not give correct ....
D. Osherson, M. Stob, and S. Weinstein. Systems that Learn: An Introduction to Learning Theory for Cognitive and Computer Scientists. MIT Press, 1986.
....and to ask whether any member of that class will eventually be identified as correct, using a Popperian algorithm in which all the theories are placed in a fixed order (usually "simplest first") and falsified one by one. A thorough study of identification algorithms and their power may be found in (Osherson et al. 1986). Unfortunately, the theory of identification in the limit does not tell us much about the predictive power of a given hypothesis; furthermore, the numbers of examples required for identification are often astronomical. B Simplicity and Kolmogorov complexity The idea of simplicity certainly ....
Osherson, D., Stob, M., & Weinstein, S. (1986). Systems that learn: An introduction to learning theory for cognitive and computer scientists. Cambridge, MA: MIT Press.
....of researchers in various fields. When dealing with learning, computer scientists are mainly interested in studying the question whether or not learning problems may be solved algorithmically. Nowadays, algorithmic learning theory is a rapidly emerging science, cf. Angluin and Smith (1983, 1987), Osherson, Stob and Weinstein (1986). Nevertheless, despite the enormous progress having been made since the pioneering papers of Solomonoff (1965) and of Gold (1965, 1967) there are still many problems that deserve special attention. The global question we shall deal with may be posed as follows: Are all data of equal importance a ....
Osherson, D., Stob, M., and Weinstein, S. (1986), "Systems that Learn, An Introduction to Learning Theory for Cognitive and Computer Scientists," MIT-Press, Cambridge, Massachusetts.
....the notion of learning in the limit. In particular he considered a machine, which reads more and more information on an r.e. set and produces in the limit a grammar to generate this set. This is called Ex style identification. From then on many variants of this concept had been considered [2, 6, 12, 18]. Barzdin [4] and Case and Smith [13] considered the notion of behaviorally correct inference which is motivated by the fact, that a recursive machine cannot check the equivalence of grammars. So the learner can learn more languages, if infinitely many guesses are allowed under the condition, that ....
....other hand, if one is missing negative information [7, 8, 9] or has suitable complexity constraints [10] then vacillatory inference increases learning power. Many real world applications of learning or inductive inference have to deal with faulty data, so it is natural to study this phenomenon [3, 14, 18]. Many of these notions of noise have the disadvantage that noisy data does not specify uniquely the object to be learned. Stephan [20] introduced a notion of noise in order to overcome this difficulty: correct information occurs infinitely often while incorrect information occurs only finitely ....
Osherson, D., Stob, M., and Weinstein, S. (1986), "Systems that Learn, An Introduction to Learning Theory for Cognitive and Computer Scientists," MIT-Press, Cambridge, Massachusetts.
....BB75] $\sigma \in$ SEQ is said to be a stabilizing sequence for M on L, iff $content(\sigma) \subseteq L$, and for all $\tau$ such that $\sigma \subseteq \tau$ and $content(\tau) \subseteq L$, $M(\sigma) = M(\tau)$. $\sigma \in$ SEQ is said to be a TxtEx$^a$-locking sequence for M on L, iff $\sigma$ is a stabilizing sequence for M on L, and $W_{M(\sigma)} =^a L$. Lemma 1 (based on [BB75, JORS99]) Suppose $a \in N \cup \{*\}$. If M TxtEx$^a$-identifies L, then there exists a stabilizing sequence for M on L, and every stabilizing sequence for M on L is a TxtEx$^a$-locking sequence for M on L. Definition 14 Suppose M is a rearrangement-independent and order-independent learning machine. Let $S \in$ ....
S. Jain, D. Osherson, J. Royer, and A. Sharma. Systems that Learn: An Introduction to Learning Theory. MIT Press, Cambridge, Mass., second edition, 1999.