Results 1 - 10 of 14
A Rule-based Acquisition Model Adapted for Morphological Analysis
Cited by 4 (0 self)
Abstract. We adapt the cognitively-oriented morphology acquisition model proposed in (Chan 2008) to perform morphological analysis, extending its concept of base-derived relationships to allow multi-step derivations and adding features required for robustness on noisy corpora. This results in a rule-based morphological analyzer which attains an F-score of 58.48% in English and 33.61% in German in the Morpho Challenge 2009 Competition 1 evaluation. The learner’s performance shows that acquisition models can effectively be used in text-processing tasks traditionally dominated by statistical approaches.
Who’s Afraid of George Kingsley Zipf?
Cited by 1 (0 self)
We explore the implications of Zipf’s law for the understanding of linguistic productivity. Focusing on language acquisition, we show that the item/usage-based approach has not been supported by adequate statistical evidence. By contrast, the quantitative properties of a productive grammar can be precisely formulated, and are consistent with even very young children’s language. Moreover, drawing from research in computational linguistics, the statistical properties of natural language strongly suggest that the theory of grammar be composed of general principles with an overarching range of applications rather than a collection of item- and construction-specific expressions.
Revisiting frequency and storage in morphological processing
Cited by 1 (0 self)
The balance between storage and computation of complex words is a major point of departure both for theories of lexical representation (e.g., Goldberg 2006, Halle & Marantz 1993, Jackendoff 1975) and processing (e.g., Baayen et al. 1997, Butterworth 1983, Taft 2004). The atoms of lexical memory that are implicated in lexical
A Statistical Test for Grammar
Cited by 1 (0 self)
We propose a statistical test for measuring grammatical productivity. We show that very young children’s knowledge is consistent with a systematic grammar that independently combines linguistic units. To a testable extent, the usage-based approach to language and language learning, which emphasizes the role of lexically specific memorization, is inconsistent with the child language data. We also discuss the connection of this research with developments in computational and theoretical linguistics.
Investigating the Relationship Between Linguistic Representation and Computation through an Unsupervised Model of Human Morphology Learning
Cited by 1 (0 self)
We develop an unsupervised algorithm for morphological acquisition to investigate the relationship between linguistic representation, data statistics, and learning algorithms. We model the phenomenon that children acquire the morphological inflections of a language monotonically by introducing an algorithm that uses a bootstrapped, frequency-driven learning procedure to acquire rules monotonically. The algorithm learns a morphological grammar in terms of a Base and Transforms representation, a simple rule-based model of morphology. When tested on corpora of child-directed speech in English from CHILDES (MacWhinney, 2000), the algorithm learns the most salient rules of English morphology and the order of acquisition is similar to that of children as observed by Brown (1973). Investigations of statistical distributions in corpora reveal that the algorithm is able to acquire morphological grammars due to its exploitation of Zipfian distributions in morphology through type-frequency statistics. These investigations suggest that the computation and frequency-driven selection of discrete morphological rules may be important factors in children’s acquisition of basic inflectional morphological systems.
Unsupervised Syntactic Category Learning from Child-Directed Speech, 2010
Long-tail Distributions and Unsupervised Learning of Morphology
In previous work on unsupervised learning of morphology, the long-tail pattern in the rank-frequency distribution of words, as well as of morphological units, is usually considered as following Zipf’s law (power-law). We argue that these long-tail distributions can also be considered as lognormal. Since we know the conjugate prior distribution for a lognormal likelihood, we propose to generate morphology data from lognormal distributions. When the performance is evaluated by a token-based criterion, giving more weight to the results of frequent words, the proposed model performs significantly better than the other models under discussion. Moreover, we capture the statistical properties of morphological units with a Bayesian approach, rather than a rule-based approach as studied in (Chan, 2008) and (Zhao and Marcus, 2011). Given the multiplicative property of lognormal distributions, we can directly capture the long-tail distribution of word frequency, without the need for an additional generative process as studied in (Goldwater et al., 2006).
Processing General Terms
We use the Base and Transforms Model proposed by Chan [1] as the core of a morphological analyzer, extending its concept of base-derived relationships to allow multi-step derivations and adding a number of features required for robustness on larger corpora. The result is a rule-based morphological analyzer, attaining an F-score of 58.48% in English and 33.61% in German in the Morpho Challenge 2009 Competition 1 evaluation.
Evidence for a Morphological Acquisition Model from Development Data
Work in morphology learning has thus far been primarily divided into two lines of research: cognitively-motivated models of morphology learning, which attempt to model human development and competency, and engineering-oriented models, which attempt to maximize application performance. In this paper we