  Ecole de Traduction et Interprétation,

by Alexander Clark
http://www.cogs.susx.ac.uk/users/alexc/phmm-draft.ps

Abstract:

In this paper I present a novel machine learning technique based on Pair Hidden Markov Models, a statistical model used in bioinformatics. This technique can be used to learn finite-state string-to-string transductions. I present a model of the acquisition of the English past tense. The same model can also learn, without modification, the Arabic broken plural, a much more complex morphological system. I also show how this model can be used for unsupervised learning of morphology, and that it can in fact learn morphology from sets of words automatically induced from unlabelled corpora. I then discuss various other applications and extensions of this technique.
