  A word-to-word model of translational equivalence (1997) [63 citations — 6 self]

by I. Dan Melamed
ftp://ftp.cis.upenn.edu/pub/melamed/papers/transmods.ps.gz

Abstract:

Parallel texts (bitexts) have properties that distinguish them from other kinds of parallel data. First, most words translate to only one other word. Second, bitext correspondence is noisy. This article presents methods for biasing statistical translation models to reflect these properties. Analysis of the expected behavior of these biases in the presence of sparse data predicts that they will result in more accurate models. The prediction is confirmed by evaluation with respect to a gold standard: translation models that are biased in this fashion are significantly more accurate than a baseline knowledge-poor model. This article also shows how a statistical translation model can take advantage of various kinds of pre-existing knowledge that might be available about particular language pairs. Even the simplest kinds of language-specific knowledge, such as the distinction between content words and function words, are shown to reliably boost translation model performance on some tasks. Statistical translation models that are informed by pre-existing knowledge about the model domain combine the best of both the rationalist and empiricist traditions.
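
The first property named in the abstract (most words translate to only one other word) can be illustrated with a small sketch. This is not the model from the article: the function names, the use of raw co-occurrence counts as scores, and the greedy linking order are all assumptions made here for illustration only.

```python
from collections import Counter
from itertools import product


def cooccurrence_counts(bitext):
    """Count how often each (source, target) word pair shares a sentence pair.

    Raw counts are an illustrative stand-in for a real association score.
    """
    counts = Counter()
    for src_sent, tgt_sent in bitext:
        for pair in product(set(src_sent), set(tgt_sent)):
            counts[pair] += 1
    return counts


def one_to_one_links(src_sent, tgt_sent, scores):
    """Greedily link words so that each word takes part in at most one link."""
    candidates = sorted(
        ((scores.get((s, t), 0), s, t)
         for s in set(src_sent) for t in set(tgt_sent)),
        key=lambda c: (-c[0], c[1], c[2]),  # best score first, ties alphabetical
    )
    linked_src, linked_tgt, links = set(), set(), []
    for score, s, t in candidates:
        # The one-to-one bias: skip any pair whose word is already linked.
        if score > 0 and s not in linked_src and t not in linked_tgt:
            links.append((s, t))
            linked_src.add(s)
            linked_tgt.add(t)
    return links


# Hypothetical toy bitext, invented for this example.
bitext = [
    (["the", "house"], ["la", "maison"]),
    (["the", "car"], ["la", "voiture"]),
    (["house"], ["maison"]),
]
scores = cooccurrence_counts(bitext)
links = one_to_one_links(["the", "house"], ["la", "maison"], scores)
```

On this toy input the lower-scoring pairs ("house", "la") and ("the", "maison") are excluded once their words are already linked, which is the effect a one-to-one bias is meant to have. A realistic model would also need to handle the noise the abstract mentions, which this sketch does not attempt.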
