

Unsupervised word segmentation for Sesotho using Adaptor Grammars

Abstract

This paper describes a variety of nonparametric Bayesian models of word segmentation based on Adaptor Grammars that model different aspects of the input and incorporate different kinds of prior knowledge, and applies them to the Bantu language Sesotho. While we find overall word segmentation accuracies lower than these models achieve on English, we also find some interesting differences in which factors contribute to better word segmentation. Specifically, we found little improvement to word segmentation accuracy when we modeled contextual dependencies, while modeling morphological structure did improve segmentation accuracy.
