Results 1–10 of 18
A maximum entropy model of phonotactics and phonotactic learning
, 2006
"... The study of phonotactics (e.g., the ability of English speakers to distinguish possible words like blick from impossible words like *bnick) is a central topic in phonology. We propose a theory of phonotactic grammars and a learning algorithm that constructs such grammars from positive evidence. Our ..."
Abstract

Cited by 132 (15 self)
The study of phonotactics (e.g., the ability of English speakers to distinguish possible words like blick from impossible words like *bnick) is a central topic in phonology. We propose a theory of phonotactic grammars and a learning algorithm that constructs such grammars from positive evidence. Our grammars consist of constraints that are assigned numerical weights according to the principle of maximum entropy. Possible words are assessed by these grammars based on the weighted sum of their constraint violations. The learning algorithm yields grammars that can capture both categorical and gradient phonotactic patterns. The algorithm is not provided with any constraints in advance, but uses its own resources to form constraints and weight them. A baseline model, in which Universal Grammar is reduced to a feature set and an SPE-style constraint format, suffices to learn many phonotactic phenomena. In order to learn nonlocal phenomena such as stress and vowel harmony, it is necessary to augment the model with autosegmental tiers and metrical grids. Our results thus offer novel, learning-theoretic support for such representations. We apply the model to English syllable onsets, Shona vowel harmony, quantity-insensitive stress typology, and the full phonotactics of Wargamay, showing that the learned grammars capture the distributional generalizations of these languages and accurately predict the findings of a phonotactic experiment.
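The scoring scheme this abstract describes (a maxent grammar assessing words by the weighted sum of their constraint violations) can be sketched in a few lines. The constraint set, violation vectors, and weights below are invented for illustration and are not taken from the paper:

```python
import math

def harmony_score(violations, weights):
    """Weighted sum of constraint violations; higher means worse."""
    return sum(w * v for w, v in zip(weights, violations))

def maxent_score(violations, weights):
    """Unnormalized maxent probability of a form: exp(-penalty)."""
    return math.exp(-harmony_score(violations, weights))

# Hypothetical constraints: [*#bn (no word-initial bn), *ComplexOnset]
blick = [0, 1]          # violates only the general complex-onset constraint
bnick = [1, 1]          # also violates the specific *#bn constraint
weights = [5.0, 0.5]    # assumed weights, chosen for illustration

assert maxent_score(blick, weights) > maxent_score(bnick, weights)
```

Normalizing these scores over the space of possible words would yield the actual maxent probabilities; learning then amounts to fitting the weights to the observed forms.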
Optimal interleaving: serial phonology-morphology interaction in a constraint-based model
, 2008
"... ..."
Convergence properties of a gradual learning algorithm for Harmonic Grammar. Rutgers Optimality Archive 970
, 2008
"... Abstract. This paper investigates a gradual online learning algorithm for Harmonic Grammar. By adapting existing convergence proofs for perceptrons, we show that for any nonvarying target language, HarmonicGrammar learners are guaranteed to converge to an appropriate grammar, if they receive compl ..."
Abstract

Cited by 39 (14 self)
Abstract. This paper investigates a gradual online learning algorithm for Harmonic Grammar. By adapting existing convergence proofs for perceptrons, we show that for any nonvarying target language, Harmonic Grammar learners are guaranteed to converge to an appropriate grammar, if they receive complete information about the structure of the learning data. We also prove convergence when the learner incorporates evaluation noise, as in Stochastic Optimality Theory. Computational tests of the algorithm show that it converges quickly. When learners receive incomplete information (e.g. some structure remains hidden), tests indicate that the algorithm is more likely to converge than two comparable Optimality-Theoretic learning algorithms.
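The gradual learner the paper analyzes is perceptron-style: on each error, weights move by the difference between the loser's and winner's violation profiles. A toy sketch of such an update loop, with invented violation vectors and learning rate:

```python
def harmony(weights, violations):
    """HG Harmony: negated weighted sum of violation counts."""
    return -sum(w * v for w, v in zip(weights, violations))

def hg_gla_update(weights, winner, loser, rate=0.1):
    """Perceptron-style update on an error: promote constraints the
    loser violates more, demote those the winner violates more."""
    return [w + rate * (lv - wv) for w, wv, lv in zip(weights, winner, loser)]

# Invented violation profiles: the observed form (winner) and the
# learner's current erroneous output (loser), on two constraints.
winner, loser = [0, 1], [1, 0]
weights = [0.0, 1.0]  # initial weights favor the loser

for _ in range(100):
    if harmony(weights, winner) <= harmony(weights, loser):
        weights = hg_gla_update(weights, winner, loser)

assert harmony(weights, winner) > harmony(weights, loser)
```

On this toy data the loop converges after a handful of updates; the paper's contribution is the proof that such convergence is guaranteed for any nonvarying target language given complete learning data.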
Natural and Unnatural Constraints in Hungarian Vowel Harmony
 TO APPEAR IN LANGUAGE
, 2009
"... Phonological constraints can, in principle, be classified according to whether they are natural (founded in principles of Universal Grammar (UG)) or unnatural (arbitrary, learned inductively from the language data). Recent work has used this distinction as the basis for arguments about the role of ..."
Abstract

Cited by 18 (1 self)
Phonological constraints can, in principle, be classified according to whether they are natural (founded in principles of Universal Grammar (UG)) or unnatural (arbitrary, learned inductively from the language data). Recent work has used this distinction as the basis for arguments about the role of UG in learning. Some languages have phonological patterns that arguably reflect unnatural constraints. With experimental testing, one can assess whether such patterns are actually learned by native speakers. Becker, Ketrez, and Nevins (2007), testing speakers of Turkish, suggest that they do indeed go unlearned. They interpret this result with a strong UG position: humans are unable to learn data patterns not backed by UG principles. This article pursues the same research line, locating similarly unnatural data patterns in the vowel harmony system of Hungarian, such as the tendency (among certain stem types) for a final bilabial stop to favor front harmony. Our own test leads to the opposite conclusion to Becker et al.: Hungarians evidently do learn the unnatural patterns. To conclude, we consider a bias account: speakers are able to learn unnatural environments, but devalue them relative to natural ones. We outline a method for testing the strength of constraints as learned by speakers against the strength of the corresponding patterns in the lexicon, and show that it offers tentative support for the hypothesis that unnatural constraints are disfavored by language learners.
GRAMMATICALITY AND UNGRAMMATICALITY IN PHONOLOGY
, 2008
"... In this paper, I make two theoretical claims: (i) For some form to be grammatical in language L, it is not necessary that the form satisfy all constraints that are active in L, i.e. even grammatical forms can violate constraints. (ii) There are degrees of ungrammaticality, i.e. not all ungrammatic ..."
Abstract

Cited by 16 (1 self)
In this paper, I make two theoretical claims: (i) For some form to be grammatical in language L, it is not necessary that the form satisfy all constraints that are active in L, i.e. even grammatical forms can violate constraints. (ii) There are degrees of ungrammaticality, i.e. not all ungrammatical forms are equally ungrammatical. I first show that these claims follow straightforwardly from the basic architecture of an Optimality Theoretic grammar. I then show that the surface sound patterns used most widely in formal phonology cannot be used to test the truth of these two claims, but argue that results from speech processing experiments can. Finally, I discuss three experiments on the processing of nonwords of the form [stVt], [skVk] and [spVp] in English that were designed to test these claims, and show that both claims are confirmed by the results of the experiments.
A maximum entropy model of phonotactics and phonotactic learning
, 2008
"... The study of phonotactics is a central topic in phonology. We propose a theory of phonotactic grammars and a learning algorithm that constructs such grammars from positive evidence. Our grammars consist of constraints that are assigned numerical weights according to the principle of maximum entropy. ..."
Abstract

Cited by 12 (0 self)
The study of phonotactics is a central topic in phonology. We propose a theory of phonotactic grammars and a learning algorithm that constructs such grammars from positive evidence. Our grammars consist of constraints that are assigned numerical weights according to the principle of maximum entropy. The grammars assess possible words on the basis of the weighted sum of their constraint violations. The learning algorithm yields grammars that can capture both categorical and gradient phonotactic patterns. The algorithm is not provided with constraints in advance, but uses its own resources to form constraints and weight them. A baseline model, in which Universal Grammar is reduced to a feature set and an SPE-style constraint format, suffices to learn many phonotactic phenomena. In order for the model to learn nonlocal phenomena such as stress and vowel harmony, it must be augmented with autosegmental tiers and metrical grids. Our results thus offer novel, learning-theoretic support for such representations. We apply the model in a variety of learning simulations, showing that the learned grammars capture the distributional generalizations of these languages and accurately predict the findings of a phonotactic experiment.
Modeling doubly marked lags with a split additive model
 Enkeleida Kapia (eds.) BUCLD 32: Proceedings of the 32nd Annual Boston University Conference on Language Development, 36–47
, 2008
"... In child phonology marked structures are frequently acquired first in relatively unmarked contexts. Four examples of this can be observed in the acquisition of Dutch syllable structure as reported by Levelt et al. (2000), a representative example of which is given in (1): onsetless syllables must in ..."
Abstract

Cited by 7 (0 self)
In child phonology, marked structures are frequently acquired first in relatively unmarked contexts. Four examples of this can be observed in the acquisition of Dutch syllable structure as reported by Levelt et al. (2000), a representative example of which is given in (1): onsetless syllables must initially be open (stage ...
The VC dimension of constraintbased grammars
, 2009
"... We analyze the complexity of Harmonic Grammar (HG), a linguistic model in which licit underlyingtosurfaceform mappings are determined by optimization over weighted constraints. We show that the VapnikChervonenkis Dimension of HG grammars with k constraints is k − 1. This establishes a fundamental ..."
Abstract

Cited by 3 (1 self)
We analyze the complexity of Harmonic Grammar (HG), a linguistic model in which licit underlying-to-surface-form mappings are determined by optimization over weighted constraints. We show that the Vapnik–Chervonenkis dimension of HG grammars with k constraints is k − 1. This establishes a fundamental bound on the complexity of HG in terms of its capacity to classify sets of linguistic data, with significant ramifications for learnability. The VC dimension of HG is the same as that of Optimality Theory (OT), which is similar to HG but uses ranked rather than weighted constraints in optimization. This equality is somewhat surprising because OT defines finite classes of grammars (there are at most k! ways to rank k constraints), while HG can define infinite classes of grammars because the weights associated with constraints are real-valued. It is also surprising because HG permits groups of constraints that interact through so-called 'gang effects' to generate languages that cannot be generated in OT. The fact that the VC dimension grows linearly with the number of constraints in both models means that, even in the worst case, the number of randomly chosen training samples needed to weight/rank a known set of constraints is a linear function of k. We conclude that though there may be factors that favor one model or the other, the complexity of learning weightings/rankings is not one of them.
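The linear-threshold structure behind this result can be made concrete: an HG grammar with k weighted constraints prefers candidate a over candidate b exactly when a linear function of the k-dimensional violation-difference vector is negative, which is why perceptron- and VC-style analyses apply. The weights and violation vectors below are illustrative assumptions:

```python
def hg_prefers(weights, cand_a, cand_b):
    """True iff the grammar assigns candidate a less weighted penalty
    than candidate b, i.e. the linear form w . (a - b) is negative."""
    diff = [va - vb for va, vb in zip(cand_a, cand_b)]
    return sum(w * d for w, d in zip(weights, diff)) < 0

# With k = 2 constraints, the preference flips as the weighting changes:
assert hg_prefers([2.0, 1.0], [0, 1], [1, 0])      # prefers candidate a
assert not hg_prefers([1.0, 2.0], [0, 1], [1, 0])  # prefers candidate b
```

Each attested preference is thus one linear inequality over the weight vector, and the set of such linear classifiers over k dimensions is what the VC-dimension argument measures.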
Harmonic Grammar, Gradual Learning, and Phonological Gradience
, 2007
"... In proposing constraint ranking Prince and Smolensky (1993/2004) depart from OT’s predecessor Harmonic Grammar (HG; Smolensky and Legendre 2006). In HG optimality is defined numerically as maximal Harmony, which is calculated as the sum of a candidate’s weighted constraint scores. OT’s use of rankin ..."
Abstract

Cited by 1 (0 self)
In proposing constraint ranking, Prince and Smolensky (1993/2004) depart from OT's predecessor Harmonic Grammar (HG; Smolensky and Legendre 2006). In HG, optimality is defined numerically as maximal Harmony, which is calculated as the sum of a candidate's weighted constraint scores. OT's use of ranking rather than weighting is sometimes justified in terms of its restrictiveness: weighted interaction is claimed to be too powerful for HG to function as a realistic model of human language (Prince and Smolensky 1993/2004: 236; 1997: 1608, Legendre et al. 2006b). In this talk, I present results of several ongoing collaborative research projects on HG. I will discuss the following points:
(1) i. HG is (perhaps surprisingly) restrictive, due to inherent limitations on the types of languages that can be generated by an optimization system (Bhatt et al. 2007; Pater et al. 2007).
ii. HG is compatible with a simple, correctly convergent gradual learning algorithm, the Perceptron algorithm of Rosenblatt (1958) (Boersma and Pater 2007; Pater 2007a; see Jäger 2006, Soderstrom et al. 2006 for precedents).
iii. To deal with variation, HG can be implemented with noise, as in stochastic OT (Boersma 1998; Boersma and Hayes 2001). Testing shows that noisy HG+Perceptron is robust, unlike stochastic OT+GLA (Boersma and Pater 2007).
iv. Gradual learning yields Harmony values that reflect frequency distributions. A problem for the HG account of gradient well-formedness (Keller 2006; Legendre et al. 2006a) raised by Boersma (2004) can be resolved with a revised HG acceptability metric (Coetzee and Pater 2007).
1 Weighted optimization
Given a linguistic representation R, its scores on a set of constraints ({C1(R), C2(R), C3(R), ... Cn(R)}), and a set of coefficients, or weights, for the constraints ({W1, W2, W3, ... Wn}), HG's Harmony function returns the sum of the weighted constraint scores.
(2) H(R) = (C1(R)*W1) + (C2(R)*W2) + (C3(R)*W3) + ... + (Cn(R)*Wn)
As Prince and Smolensky (1993/2004: 236) point out, optimality in an OT-like theory of generative grammar ...
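A minimal executable rendering of the Harmony function in (2), with optimality as maximization over a candidate set. The candidate names and scores below are invented; scores are negative violation counts, as is standard in HG:

```python
def harmony(scores, weights):
    """H(R) = (C1(R)*W1) + (C2(R)*W2) + ... + (Cn(R)*Wn)."""
    return sum(c * w for c, w in zip(scores, weights))

def optimal(candidates, weights):
    """The optimal candidate is the one with maximal Harmony."""
    return max(candidates, key=lambda cand: harmony(cand[1], weights))

# Hypothetical tableau: each candidate paired with its constraint scores.
cands = [("cand-a", [-1, 0]), ("cand-b", [0, -2])]
weights = [1.0, 1.0]
assert optimal(cands, weights)[0] == "cand-a"  # H = -1 beats H = -2
```

Under ranking instead of weighting, cand-b would win whenever the first constraint dominates; the weighted sum is what allows the gang effects and gradience discussed above.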
ASSIMILATION AS ATTRACTION: COMPUTING DISTANCE, SIMILARITY, AND LOCALITY IN PHONOLOGY
, 2009
"... This dissertation explores similarity effects in assimilation, proposing an Attraction Framework to analyze cases of parasitic harmony where a triggertarget pair only results in harmony if the trigger and target agree on other features. Attraction provides a natural model of these effects by relati ..."
Abstract

Cited by 1 (0 self)
This dissertation explores similarity effects in assimilation, proposing an Attraction Framework to analyze cases of parasitic harmony, where a trigger-target pair only results in harmony if the trigger and target agree on other features. Attraction provides a natural model of these effects by relating the pressure for assimilation to the representational distance between segments: the more similar a trigger-target pair, the stronger the attraction force between them. Attraction grammars in Optimality Theory (OT; Prince & Smolensky, 2004) are rigorously compared to those of Harmonic Grammar (HG; Legendre, Miyata, and Smolensky, 1990). A condition for equality of attraction in OT and HG converges with empirical considerations by prohibiting unattested patterns of disjunctive parasitic harmony. Another goal of this work is to investigate how similarity preconditions interact with the locality effects common to harmony. Long-distance consonant harmony, blocking and transparency in vowel harmony, and strictly local assimilation receive a unified explanation in the Attraction Framework by hypothesizing that, like features, locality can contribute to a general notion of similarity. A positional similarity ...
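One toy way to operationalize the similarity that drives attraction, assuming a binary-feature encoding of segments. The feature dictionaries and the linear attraction function here are illustrative assumptions, not the dissertation's actual metric:

```python
def similarity(seg_a, seg_b):
    """Fraction of features on which two segments agree."""
    shared = sum(1 for f in seg_a if seg_a[f] == seg_b.get(f))
    return shared / len(seg_a)

def attraction(trigger, target, base=1.0):
    """Attraction grows with similarity: more-similar trigger-target
    pairs exert stronger pressure to assimilate (a toy monotone map)."""
    return base * similarity(trigger, target)

# Hypothetical vowel feature specifications
i = {"high": True,  "back": False, "round": False}
y = {"high": True,  "back": False, "round": True}   # front rounded vowel
a = {"high": False, "back": True,  "round": False}

assert attraction(i, y) > attraction(i, a)  # i is closer to y than to a
```

Parasitic harmony then falls out naturally: a trigger-target pair that already agrees on other features scores higher similarity and so exerts more assimilatory pressure.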