Results 1 - 10 of 25
Inducing Probabilistic CCG Grammars from Logical Form with Higher-Order Unification
"... This paper addresses the problem of learning to map sentences to logical form, given training data consisting of natural language sentences paired with logical representations of their meaning. Previous approaches have been designed for particular natural languages or specific meaning representation ..."
Abstract
-
Cited by 89 (17 self)
- Add to MetaCart
This paper addresses the problem of learning to map sentences to logical form, given training data consisting of natural language sentences paired with logical representations of their meaning. Previous approaches have been designed for particular natural languages or specific meaning representations; here we present a more general method. The approach induces a probabilistic CCG grammar that represents the meaning of individual words and defines how these meanings can be combined to analyze complete sentences. We use higher-order unification to define a hypothesis space containing all grammars consistent with the training data, and develop an online learning algorithm that efficiently searches this space while simultaneously estimating the parameters of a log-linear parsing model. Experiments demonstrate high accuracy on benchmark data sets in four languages with two different meaning representations.
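To make the splitting operation concrete, the following minimal Python sketch enumerates the function-argument splits of a logical form, the step that restricted higher-order unification licenses in this setting; the nested-tuple representation and all names are illustrative assumptions, not the paper's code.

# Illustrative sketch only: logical forms as nested tuples, e.g.
# ('borders', 'texas', 'mexico'). A "split" abstracts one subterm g out of
# a form h, giving a function lam v. h[g := v] and the argument g, so that
# applying the function to the argument reproduces h.

def subterms(term, path=()):
    """Yield (path, subterm) for every subterm of a nested-tuple term."""
    yield path, term
    if isinstance(term, tuple):
        for i, child in enumerate(term):
            yield from subterms(child, path + (i,))

def abstract_at(term, path, var='v'):
    """Return term with the subterm at `path` replaced by variable `var`."""
    if not path:
        return var
    i, rest = path[0], path[1:]
    return term[:i] + (abstract_at(term[i], rest, var),) + term[i + 1:]

def splits(term):
    """Enumerate (function body, argument) pairs whose application yields term."""
    for path, sub in subterms(term):
        if path:  # skip the trivial split term = (lam v. v)(term)
            yield abstract_at(term, path), sub

for body, arg in splits(('borders', 'texas', 'mexico')):
    print('lam v.', body, ' applied to ', arg)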
A Probabilistic Model of Syntactic and Semantic Acquisition from Child-Directed Utterances and their Meanings
"... This paper presents an incremental probabilistic learner that models the acquistion of syntax and semantics from a corpus of child-directed utterances paired with possible representations of their meanings. These meaning representations approximate the contextual input available to the child; they d ..."
Abstract
-
Cited by 29 (6 self)
- Add to MetaCart
(Show Context)
This paper presents an incremental probabilistic learner that models the acquisition of syntax and semantics from a corpus of child-directed utterances paired with possible representations of their meanings. These meaning representations approximate the contextual input available to the child; they do not specify the meanings of individual words or syntactic derivations. The learner then has to infer the meanings and syntactic properties of the words in the input along with a parsing model. We use the CCG grammatical framework and train a non-parametric Bayesian model of parse structure with online variational Bayesian expectation maximization. When tested on utterances from the CHILDES corpus, our learner outperforms a state-of-the-art semantic parser. In addition, it models such aspects of child acquisition as “fast mapping,” while also countering previous criticisms of statistical syntactic learners.
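The online variational Bayesian EM loop named above follows a standard stochastic pattern; the sketch below is a generic illustration under assumed names (rule_counts, kappa), not the paper's implementation: parse one utterance, take its expected rule counts as the E-step, and blend them into running pseudo-counts with a decaying step size.

# Generic online VB-EM update; all names and the decay schedule are
# assumptions for illustration.
from collections import defaultdict

rule_counts = defaultdict(float)  # running expected pseudo-counts per rule

def online_vbem_step(expected_counts, t, corpus_size, kappa=0.6):
    """Fold one utterance's E-step statistics into the global counts.

    expected_counts: {rule: E[count]} from parsing utterance number t.
    corpus_size: number of utterances N; one utterance's statistics are
                 scaled by N, as in standard stochastic variational Bayes.
    kappa: decay exponent in (0.5, 1] for the Robbins-Monro step size.
    """
    rho = (t + 1.0) ** -kappa  # step size shrinks as more data arrives
    for rule in set(rule_counts) | set(expected_counts):
        target = corpus_size * expected_counts.get(rule, 0.0)
        rule_counts[rule] = (1.0 - rho) * rule_counts[rule] + rho * target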
A quantitative evaluation of naturalistic models of language acquisition; the efficiency of the Triggering Learning Algorithm compared to a Categorial Grammar Learner
In W. Sakas (Ed.), Proceedings of the COLING 2004 Workshop “Psychocomputational Models of Human Language Acquisition”, Geneva: COLING, 2004
"... Naturalistic theories of language acquisition assume learners to be endowed with some innate language knowledge. The purpose of this innate knowledge is to facilitate language acquisition by constraining a learner’s hypothesis space. This paper discusses a naturalistic learning system (a Categorial ..."
Abstract
-
Cited by 7 (0 self)
- Add to MetaCart
Naturalistic theories of language acquisition assume learners to be endowed with some innate language knowledge. The purpose of this innate knowledge is to facilitate language acquisition by constraining a learner’s hypothesis space. This paper discusses a naturalistic learning system (a Categorial Grammar Learner (CGL)) that differs from previous learners (such as the Triggering Learning Algorithm (TLA) (Gibson and Wexler, 1994)) by employing a dynamic definition of the hypothesis-space which is driven by the Bayesian Incremental Parameter Setting algorithm (Briscoe, 1999). We compare the efficiency of the TLA with the CGL when acquiring an independently and identically distributed English-like language in noiseless conditions. We show that when convergence to the target grammar occurs (which is not guaranteed), the expected number of steps to convergence for the TLA is shorter than that for the CGL initialized with uniform priors. However, the CGL converges more reliably than the TLA. We discuss the trade-off of efficiency against more reliable convergence to the target grammar.
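For reference, the TLA used as the baseline here is short enough to state directly; this sketch follows the usual error-driven, single-value, greedy formulation, with parses standing in for an assumed parsing oracle.

import random

def tla_step(params, sentence, parses):
    """One Triggering Learning Algorithm update on a list of 0/1 parameters."""
    if parses(params, sentence):          # error-driven: no change on success
        return params
    i = random.randrange(len(params))     # single-value constraint: flip one
    trial = params[:i] + [1 - params[i]] + params[i + 1:]
    # Greediness constraint: adopt the flip only if it parses this input.
    return trial if parses(trial, sentence) else params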
The subset principle in syntax: costs of compliance
Linguistics, 2005
"... draw attention here to some unsolved problems in the application of SP to syntax acquisition. While noting connections to formal results in computational linguistics, our focus is on how SP could be implemented in a way that is both linguistically well-grounded and psychologically feasible. We conce ..."
Abstract
-
Cited by 7 (2 self)
- Add to MetaCart
(Show Context)
We draw attention here to some unsolved problems in the application of SP to syntax acquisition. While noting connections to formal results in computational linguistics, our focus is on how SP could be implemented in a way that is both linguistically well-grounded and psychologically feasible. We concentrate on incremental learning (with no memory for past inputs), which is now widely assumed in psycholinguistics. However, in investigating its interactions with SP, we uncover the rather startling fact that incremental learning and SP are incompatible, given other standard assumptions. We set out some ideas for ways in which they might be reconciled. Some seem more promising than others, but all appear to carry severe costs in terms of computational load, learning speed or memory resources. The penalty for disobeying SP has long been understood. In future language acquisition research it will be important to address the costs of obeying SP.
Learning to Distinguish PP Arguments from Adjuncts
"... Words differ in the subcategorisation frames in which they occur, and there is a strong correlation between the semantic arguments of a given word and its subcategorisation frame, so that all its arguments should be included in its subcategorisation frame. One problem is posed by the ambiguity betwe ..."
Abstract
-
Cited by 6 (0 self)
- Add to MetaCart
Words differ in the subcategorisation frames in which they occur, and there is a strong correlation between the semantic arguments of a given word and its subcategorisation frame, so that all its arguments should be included in its subcategorisation frame. One problem is posed by the ambiguity between locative prepositional phrases that are arguments of a verb and those that are adjuncts. As the semantics for the verb is the same in both cases, it is difficult to differentiate them and to learn the appropriate subcategorisation frame. We propose an approach that uses semantically motivated preposition selection and frequency information to determine if a locative PP is an argument or an adjunct. In order to test this approach, we perform an experiment using a computational learning system that receives as input utterances annotated with logical forms. The results obtained indicate that the learner successfully distinguishes between arguments (obligatory and optional) and adjuncts.
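A toy sketch of the frequency cue described above; the counts, the 0.5 threshold, and the relative-frequency test are illustrative assumptions, not the system's actual statistics.

from collections import Counter

verb_pp = Counter()   # (verb, prep) co-occurrence counts from a corpus
verb_tot = Counter()  # total locative-PP-attached occurrences per verb

def observe(verb, prep):
    """Record one locative PP headed by `prep` attached to `verb`."""
    verb_pp[(verb, prep)] += 1
    verb_tot[verb] += 1

def is_argument(verb, prep, threshold=0.5):
    """Treat the PP as an argument if this preposition dominates the verb's
    locative-PP distribution; otherwise treat it as an adjunct."""
    if verb_tot[verb] == 0:
        return False
    return verb_pp[(verb, prep)] / verb_tot[verb] >= threshold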
The Acquisition of Word Order by a Computational Learning System
2000
"... The purpose of this work is to investigate the process of grammatical acquisition from data. We are using a computational learning system that is composed of a Universal Grammar with associated parameters, and a learning algorithm, following the Principles and Parameters Theory. The Universal Gramma ..."
Abstract
-
Cited by 4 (1 self)
- Add to MetaCart
(Show Context)
The purpose of this work is to investigate the process of grammatical acquisition from data. We are using a computational learning system that is composed of a Universal Grammar with associated parameters, and a learning algorithm, following the Principles and Parameters Theory. The Universal Grammar is implemented as a Unification-Based Generalised Categorial Grammar, embedded in a default inheritance network of lexical types. The learning algorithm receives input from a corpus annotated with logical forms and sets the parameters based on this input. This framework is used as the basis to investigate several aspects of language acquisition. In this paper we concentrate on the acquisition of word order for different learners. The results obtained show the different learners having similar performance and converging towards the target grammar given the input data available, regardless of their starting points. They also show how the amount of noise present in the input data affects t...
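One way to picture threshold-based parameter setting from such input (a hypothetical sketch, not the system described): keep evidence pseudo-counts for each binary word-order parameter and fix a value once the posterior clearly favours it. The parameter names and the 0.9 threshold are invented for illustration.

params = {'subject-initial': [1.0, 1.0],    # [support for 1, support for 0]
          'verb-before-object': [1.0, 1.0]}

def update(name, observed_value):
    """Credit the parameter value used in a successful parse of the input."""
    params[name][0 if observed_value else 1] += 1.0

def setting(name, threshold=0.9):
    """Return 1 or 0 once one value dominates, else None (still unset)."""
    a, b = params[name]
    p = a / (a + b)
    if p >= threshold:
        return 1
    if p <= 1.0 - threshold:
        return 0
    return None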
Computational Grammar Acquisition from CHILDES Data Using a Probabilistic Parsing Model
Submitted, 2009
"... In this work we propose a universal model of syntactic acquisition that assumes the learner is exposed to pairs consisting of strings of word-candidates and contextually-afforded meaning-representations. ..."
Abstract
-
Cited by 3 (2 self)
- Add to MetaCart
(Show Context)
In this work we propose a universal model of syntactic acquisition that assumes the learner is exposed to pairs consisting of strings of word-candidates and contextually-afforded meaning-representations.
The Use of Default Unification in a System of Lexical Types
2000
"... In this paper we describe the encoding of a Unication-Based Generalised Categorial Grammar for English, in terms of a default inheritance network of types, implemented with yadu, which is an order independent default unication operation on typed feature structures. We then propose to use this framew ..."
Abstract
-
Cited by 2 (1 self)
- Add to MetaCart
In this paper we describe the encoding of a Unification-Based Generalised Categorial Grammar for English, in terms of a default inheritance network of types, implemented with yadu, an order-independent default unification operation on typed feature structures. We then propose to use this framework to encode a Universal Grammar and associated parameters, following the Principles and Parameters Theory [Chomsky 1981], and describe how they are implemented. This is a clear and concise way of defining the UG, with the parameters defined straightforwardly in the type network in a way that takes advantage of the default inheritance mechanism to propagate information about parameters throughout the lexical inheritance network. The parameters are set based on exposure to a particular language, obtained here from a corpus of annotated sentences. The corpus was parsed and annotated using the implemented English grammar. The resulting formalisation is being used as the basis for studies of language acquisition.
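The effect of default unification on such a network can be illustrated with a deliberately simplified sketch: plain dicts stand in for typed feature structures (real yadu also handles types and reentrancies), and strict information always survives while defaults fill the gaps.

def default_unify(strict, default):
    """Combine a strict feature structure with inherited default values.

    strict:  {feature: value} facts that may not be overridden.
    default: {feature: value} defaults inherited from supertypes.
    """
    result = dict(default)
    for feat, val in strict.items():
        if isinstance(val, dict) and isinstance(result.get(feat), dict):
            result[feat] = default_unify(val, result[feat])  # recurse
        else:
            result[feat] = val  # strict information wins over the default
    return result

# Example: a lexical entry inherits default word order but overrides case.
supertype = {'order': 'SVO', 'case': 'nom'}
entry = {'case': 'acc'}
print(default_unify(entry, supertype))  # {'order': 'SVO', 'case': 'acc'}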
Formal Grammars of Early Language
"... Abstract. We propose to model the development of language by a series of formal grammars, accounting for the linguistic capacity of children at the very early stages of mastering language. This approach provides a testbed for evaluating theories of language acquisition, in particular with respect to ..."
Abstract
-
Cited by 2 (1 self)
- Add to MetaCart
(Show Context)
We propose to model the development of language by a series of formal grammars, accounting for the linguistic capacity of children at the very early stages of mastering language. This approach provides a testbed for evaluating theories of language acquisition, in particular with respect to the extent to which innate, language-specific mechanisms must be assumed. Specifically, we focus on a single child learning English and use the CHILDES corpus for actual performance data. We describe a set of grammars which account for this child’s utterances between the ages 1;8.02 and 2;0.30. The coverage of the grammars is carefully evaluated by extracting grammatical relations from induced structures and comparing them with manually annotated labels.
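The grammatical-relation evaluation described here amounts to set comparison over (head, relation, dependent) triples; a small sketch, with made-up example triples:

def score(predicted, gold):
    """predicted, gold: sets of (head, relation, dependent) triples."""
    tp = len(predicted & gold)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Example: two of three predicted relations match three gold relations.
pred = {('like', 'subj', 'I'), ('like', 'obj', 'it'), ('it', 'det', 'the')}
gold = {('like', 'subj', 'I'), ('like', 'obj', 'it'), ('like', 'adv', 'now')}
print(score(pred, gold))  # precision, recall, F1 of 2/3 each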