A survey of statistical machine translation (2007)

by Adam Lopez
Citations: 30 - 3 self

BibTeX

@TECHREPORT{Lopez07asurvey,
    author = {Adam Lopez},
    title = {A survey of statistical machine translation},
    institution = {},
    year = {2007}
}

Abstract

Statistical machine translation (SMT) treats the translation of natural language as a machine learning problem. By examining many samples of human-produced translation, SMT algorithms automatically learn how to translate. SMT has made tremendous strides in less than two decades, and many popular techniques have only emerged within the last few years. This survey presents a tutorial overview of state-of-the-art SMT at the beginning of 2007. We begin with the context of the current research, and then move to a formal problem description and an overview of the four main subproblems: translational equivalence modeling, mathematical modeling, parameter estimation, and decoding. Along the way, we present a taxonomy of some different approaches within these areas. We conclude with an overview of evaluation and notes on future directions.

Citations

6231 Maximum likelihood from incomplete data via the EM algorithm - Dempster, Laird, et al. - 1977
3376 Introduction to Automata Theory, Languages, and Computation. 2nd edition. Addison-Wesley Publishing Company, 2000 - Hopcroft, Ullman - 2004
2959 Artificial Intelligence: a Modern Approach - Russell, Norvig - 1995
1654 Building a large annotated corpus of English: The Penn Treebank - Marcus, Marcinkiewicz, et al. - 1993
992 BLEU: A Method for Automatic Evaluation of Machine Translation - Papineni, Roukos, et al. - 2002
892 Mathematics of Statistical Machine Translation: Parameter Estimation - Brown, Pietra, et al. - 1993
874 Error bounds for convolutional codes and an asymptotically optimum decoding algorithm - Viterbi - 1967
846 A Maximum Entropy Approach to Natural Language Processing - Berger, Pietra, et al. - 1996
805 A systematic comparison of various statistical alignment models - Och, Ney - 2003
612 Statistical Methods for Speech Recognition - Jelinek - 1997
549 Tree-adjoining grammars - Joshi, Schabes - 1997
540 Class-based n-gram models of natural language - Brown, Pietra, et al. - 1992
502 A statistical approach to machine translation - Brown, Cocke, et al. - 1990
434 Moses: Open source toolkit for statistical machine translation - Koehn, Hoang, et al. - 2007
422 An Inequality and Associated Maximization Technique in Statistical Estimation of a Markov Process - Baum - 1972
417 Statistical phrase-based translation - Koehn, Och, et al. - 2003
355 Generalized Iterative Scaling for Log-Linear Models - Darroch, Ratcliff - 1972
349 A program for aligning sentences in bilingual corpora - Gale, Church - 1993
343 Stochastic inversion transduction grammars and bilingual parsing of parallel corpora - Wu - 1997
341 Improved statistical alignment models - Och, Ney - 2000
316 The estimation of stochastic context-free grammars using the inside-outside algorithm. Computer Speech and Language - Lari, Young - 1990
282 Lattice-based minimum error rate training for statistical machine translation - Macherey, Och, et al. - 2008
257 A hierarchical phrase-based model for statistical machine translation - Chiang - 2005
256 Discriminative training and maximum entropy models for statistical machine translation - Och, Ney
216 Automatic Evaluation of Machine Translation Quality using N-gram Co-occurrence Statistics - Doddington - 2010
212 Tagging English text with a probabilistic model - Merialdo - 1994
209 Hierarchical phrase-based translation - Chiang - 2007
205 Improved alignment models for statistical machine translation - Och, Tillmann, et al. - 1999
204 A Study of Translation Edit Rate with Targeted Human Annotation - Snover, Dorr, et al. - 2006
202 Speech and Language Processing – An Introduction to - Jurafsky, Martin - 2009
202 A syntax-based statistical translation model - Yamada, Knight - 2001
167 Maximum Entropy Models for natural language ambiguity resolution - Ratnaparkhi - 1993
162 What’s in a translation rule? - Galley, Hopkins, et al. - 2004
158 Europarl: A parallel corpus for statistical machine translation - Koehn - 2005
140 Identifying word correspondences in parallel texts - Gale, Church - 1991
136 Scalable inference and training of context-rich syntactic translation models - Galley, Graehl, et al. - 2006
135 A phrase-based, joint probability model for statistical machine translation - Marcu, Wong - 2002
122 The convergence of mildly context-sensitive grammatical formalisms - Joshi, Vijay-Shanker, et al. - 1991
121 Models of translational equivalence among words - Melamed
119 Two decades of statistical language modeling: Where do we go from here - Rosenfeld
106 Exploiting syntactic structure for language modeling - Chelba, Jelinek - 1998
103 Better k-best parsing - Huang, Chiang - 2005
102 Statistical Significance Tests for Machine Translation Evaluation - Koehn - 2004
102 Dependency treelet translation: Syntactically informed phrasal smt - Quirk, Menezes, et al. - 2005
101 The web as a parallel corpus - Resnik, Smith - 2003
100 A Statistical Parser for Czech - Collins, Ramshaw, et al. - 1999
97 Paraphrasing with bilingual parallel corpora - Bannard, Callison-Burch - 2005
95 Alignment by Agreement - Liang, Taskar, et al. - 2006
94 Foundations of Statistical Natural Language Processing - Manning, Schütze - 1999
93 A smorgasbord of features for statistical machine translation - Och, Gildea, et al. - 2004