Structure and Distributions in Morphology Learning. Doctoral dissertation (2008)

by E Chan

Results 1 - 10 of 14

A Rule-based Acquisition Model Adapted for Morphological Analysis

by Constantine Lignos, Erwin Chan, Mitchell P. Marcus, Charles Yang
Cited by 4 (0 self)
We adapt the cognitively-oriented morphology acquisition model proposed in (Chan 2008) to perform morphological analysis, extending its concept of base-derived relationships to allow multi-step derivations and adding features required for robustness on noisy corpora. This results in a rule-based morphological analyzer which attains an F-score of 58.48% in English and 33.61% in German in the Morpho Challenge 2009 Competition 1 evaluation. The learner’s performance shows that acquisition models can effectively be used in text-processing tasks traditionally dominated by statistical approaches.

Citation Context

...nos@cis.upenn.edu, mitch@cis.upenn.edu, charles.yang@ling.upenn.edu 2 University of Arizona, echan3@u.arizona.edu Abstract. We adapt the cognitively-oriented morphology acquisition model proposed in (Chan 2008) to perform morphological analysis, extending its concept of base-derived relationships to allow multi-step derivations and adding features required for robustness on noisy corpora. This results in a...

Who’s Afraid of George Kingsley Zipf?

by Charles Yang
Cited by 1 (0 self)
We explore the implications of Zipf’s law for the understanding of linguistic productivity. Focusing on language acquisition, we show that the item/usage based approach has not been supported by adequate statistical evidence. By contrast, the quantitative properties of a productive grammar can be precisely formulated, and are consistent with even very young children’s language. Moreover, drawing from research in computational linguistics, the statistical properties of natural language strongly suggest that the theory of grammar be composed of general principles with overarching range of applications rather than a collection of item and construction specific expressions.

Revisiting frequency and storage in morphological processing

by Constantine Lignos, Kyle Gorman
Cited by 1 (0 self)
The balance between storage and computation of complex words is a major point of departure both for theories of lexical representation (e.g., Goldberg 2006, Halle & Marantz 1993, Jackendoff 1975) and processing (e.g., Baayen et al. 1997, Butterworth 1983, Taft 2004). The atoms of lexical memory that are implicated in lexical

A Statistical Test for Grammar

by Charles Yang
Cited by 1 (0 self)
We propose a statistical test for measuring grammatical productivity. We show that very young children’s knowledge is consistent with a systematic grammar that independently combines linguistic units. To a testable extent, the usage-based approach to language and language learning, which emphasizes the role of lexically specific memorization, is inconsistent with the child language data. We also discuss the connection of this research with developments in computational and theoretical linguistics.

Citation Context

...aw distributions as well (Teahan 1997, Ha et al. 2002), which contributes to the familiar sparse data problem in computational linguistics. These observations generalize to the combination of morphemes (Chan 2008) and grammatical rules. Figure 1 plots the ranks and frequencies of syntactic rules (on log-log scale) from the Penn Treebank (Marcus et al. 1993); certain rules headed by specific functional words have...

Investigating the Relationship Between Linguistic Representation and Computation through an Unsupervised Model of Human Morphology Learning

by Erwin Chan, Constantine Lignos
Cited by 1 (0 self)
We develop an unsupervised algorithm for morphological acquisition to investigate the relationship between linguistic representation, data statistics, and learning algorithms. We model the phenomenon that children acquire the morphological inflections of a language monotonically by introducing an algorithm that uses a bootstrapped, frequency-driven learning procedure to acquire rules monotonically. The algorithm learns a morphological grammar in terms of a Base and Transforms representation, a simple rule-based model of morphology. When tested on corpora of child-directed speech in English from CHILDES (MacWhinney, 2000), the algorithm learns the most salient rules of English morphology and the order of acquisition is similar to that of children as observed by Brown (1973). Investigations of statistical distributions in corpora reveal that the algorithm is able to acquire morphological grammars due to its exploitation of Zipfian distributions in morphology through type-frequency statistics. These investigations suggest that the computation and frequency-driven selection of discrete morphological rules may be important factors in children’s acquisition of basic inflectional morphological systems.

Unsupervised Syntactic Category Learning from Child-Directed Speech

by Olga N. Wichrowska, Robert C. Berwick, 2010

Citation Context

...words belonging to closed classes, such as conjunctions, prepositions, and pronouns, among others). This is drastically smaller than any of the datasets used in previous work (Redington et al., 1998; Chan, 2008; Schütze, 1995). My goal in these experiments was to see how well each clustering algorithm could do on a small but relatively structured set. 3.1.1 Removed Utterances The first 30 files in the 'Ada...

The acquisition of verbal . . .

by Dominik Rus, 2008
Abstract not found

Long-tail Distributions and Unsupervised Learning of Morphology

by unknown authors
In previous work on unsupervised learning of morphology, the long-tail pattern in the rank-frequency distribution of words, as well as of morphological units, is usually considered as following Zipf’s law (power-law). We argue that these long-tail distributions can also be considered as lognormal. Since we know the conjugate prior distribution for a lognormal likelihood, we propose to generate morphology data from lognormal distributions. When the performance is evaluated by a token-based criterion, giving more weight to the results of frequent words, the proposed model performs significantly better than the other models under discussion. Moreover, we capture the statistical properties of morphological units with a Bayesian approach, rather than a rule-based approach as studied in (Chan, 2008) and (Zhao and Marcus, 2011). Given the multiplicative property of lognormal distributions, we can directly capture the long-tail distribution of word frequency, without the need of an additional generative process as studied in (Goldwater et al., 2006).

Citation Context

... significantly better than other models in discussion. Moreover, we capture the statistical properties of morphological units with a Bayesian approach, other than a rule-based approach as studied in (Chan, 2008) and (Zhao and Marcus, 2011). Given the multiplicative property of lognormal distributions, we can directly capture the long-tail distribution of word frequency, without the need of an additional gen...

Processing General Terms

by Constantine Lignos, Erwin Chan, Mitchell P. Marcus, Charles Yang
We use the Base and Transforms Model proposed by Chan [1] as the core of a morphological analyzer, extending its concept of base-derived relationships to allow multi-step derivations and adding a number of features required for robustness on larger corpora. The result is a rule-based morphological analyzer, attaining an F-score of 58.48% in English and 33.61% in German in the Morpho Challenge 2009 Competition 1 evaluation.

Citation Context

...f Pennsylvania, ‡ University of Arizona lignos@cis.upenn.edu, echan3@email.arizona.edu, mitch@cis.upenn.edu, charles.yang@ling.upenn.edu Abstract We use the Base and Transforms Model proposed by Chan [1] as the core of a morphological analyzer, extending its concept of base-derived relationships to allow multi-step derivations and adding a number of features required for robustness on larger corpora....

Evidence for a Morphological Acquisition Model from Development Data

by Constantine Lignos, Erwin Chan, Charles Yang, Mitchell P. Marcus
Work in morphology learning has thus far been primarily divided into two lines of research: cognitively-motivated models of morphology learning, which attempt to model human development and competency, and engineering-oriented models, which attempt to maximize application performance. In this paper we

Citation Context

...ring-oriented models, which attempt to maximize application performance. In this paper we address the gap between these approaches by presenting results from applying the learning model presented in (Chan, 2008) to child-directed data and comparing its learning process to research in child language acquisition. The first prominent computational model to address the learning of morphology in a manner aligned...
