Stochastic inversion transduction grammars and bilingual parsing of parallel corpora. (1997)

by D. Wu
Venue: Computational Linguistics
Results 1 - 10 of 571

Statistical phrase-based translation

by Philipp Koehn, Franz Josef Och, Daniel Marcu, 2003
"... We propose a new phrase-based translation model and decoding algorithm that enables us to evaluate and compare several, previously proposed phrase-based translation models. Within our framework, we carry out a large number of experiments to understand better and explain why phrase-based models outpe ..."
Abstract - Cited by 944 (11 self) - Add to MetaCart
We propose a new phrase-based translation model and decoding algorithm that enables us to evaluate and compare several, previously proposed phrase-based translation models. Within our framework, we carry out a large number of experiments to understand better and explain why phrase-based models outperform word-based models. Our empirical results, which hold for all examined language pairs, suggest that the highest levels of performance can be obtained through relatively simple means: heuristic learning of phrase translations from word-based alignments and lexical weighting of phrase translations. Surprisingly, learning phrases longer than three words and learning phrases from high-accuracy word-level alignment models does not have a strong impact on performance. Learning only syntactically motivated phrases degrades the performance of our systems.
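As a rough illustration of the heuristic phrase learning the abstract refers to, the sketch below enumerates phrase pairs consistent with a word alignment. It is a simplified version of the standard extraction heuristic (it does not expand over unaligned boundary words), and the function name, length limit, and toy data are ours, not the paper's.

```python
# A simplified sketch of consistent phrase-pair extraction from a word
# alignment. Spans are half-open [start, end) over token positions; a pair
# of spans is kept only if every alignment link touching the source span
# lands inside the target span and vice versa, and the spans share at
# least one link.

def extract_phrase_pairs(src_len, alignment, max_len=3):
    pairs = []
    for s1 in range(src_len):
        for s2 in range(s1 + 1, min(src_len, s1 + max_len) + 1):
            # target positions linked to any source word in [s1, s2)
            linked = [j for (i, j) in alignment if s1 <= i < s2]
            if not linked:
                continue
            t1, t2 = min(linked), max(linked) + 1
            if t2 - t1 > max_len:
                continue
            # reject if a word inside [t1, t2) is aligned outside [s1, s2)
            if any(t1 <= j < t2 and not (s1 <= i < s2) for (i, j) in alignment):
                continue
            pairs.append(((s1, s2), (t1, t2)))
    return pairs

# toy example: source "maison bleue", target "blue house", links 0-1 and 1-0
print(extract_phrase_pairs(2, {(0, 1), (1, 0)}))
```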

Citation Context

...her motivation to evaluate the performance of a phrase translation model that contains only syntactic phrases comes from recent efforts to build syntactic translation models [Yamada and Knight, 2001; Wu, 1997]. In these models, reordering of words is restricted to reordering of constituents in well-formed syntactic parse trees. When augmenting such models with phrase translations, typically only translati...

Hierarchical phrase-based translation

by David Chiang - Computational Linguistics , 2007
"... We present a statistical machine translation model that uses hierarchical phrases—phrases that contain subphrases. The model is formally a synchronous context-free grammar but is learned from a parallel text without any syntactic annotations. Thus it can be seen as combining fundamental ideas from b ..."
Abstract - Cited by 597 (9 self) - Add to MetaCart
We present a statistical machine translation model that uses hierarchical phrases—phrases that contain subphrases. The model is formally a synchronous context-free grammar but is learned from a parallel text without any syntactic annotations. Thus it can be seen as combining fundamental ideas from both syntax-based translation and phrase-based translation. We describe our system’s training and decoding methods in detail, and evaluate it for translation speed and translation accuracy. Using BLEU as a metric of translation accuracy, we find that our system performs significantly better than the Alignment Template System, a state-of-the-art phrase-based system.
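To make the "phrases that contain subphrases" idea concrete, here is a minimal sketch of applying one synchronous CFG rule: the source and target sides share numbered nonterminal slots, and producing output amounts to recursively filling those slots. The rule, slot fillers, and data layout below are invented for illustration; they are not Chiang's extracted grammar or decoder.

```python
# A toy synchronous CFG rule: integer slots are nonterminals shared by the
# source and target sides, so reordering falls out of how the slots are
# arranged on the target side. Example rule: X -> <X1 de X2, X2 of X1>.
from dataclasses import dataclass

@dataclass
class SCFGRule:
    src: tuple  # mix of terminal strings and integer slot indices
    tgt: tuple  # the same slots, possibly reordered, with target terminals

RULE = SCFGRule(src=(1, "de", 2), tgt=(2, "of", 1))

def apply_rule(rule, fillers):
    """Substitute already-translated subphrases (slot index -> string)
    into the target side of a rule."""
    return " ".join(fillers[t] if isinstance(t, int) else t for t in rule.tgt)

# if slot 1 covers "Zhongguo" -> "China" and slot 2 covers "zhengce" -> "the policy",
# the rule produces the reordered output:
print(apply_rule(RULE, {1: "China", 2: "the policy"}))  # "the policy of China"
```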

Citation Context

...el text without syntactic annotation. Because our system uses a synchronous CFG, it could be thought of as an example of syntax-based statistical machine translation (MT), joining a line of research (Wu 1997; Alshawi, Bangalore, and Douglas 2000; Yamada and Knight 2001) that has been fruitful but has not previously produced systems that can compete with phrase-based systems in large-scale translation task...

A hierarchical phrase-based model for statistical machine translation

by David Chiang - IN ACL , 2005
"... We present a statistical phrase-based translation model that uses hierarchical phrases— phrases that contain subphrases. The model is formally a synchronous context-free grammar but is learned from a bitext without any syntactic information. Thus it can be seen as a shift to the formal machinery of ..."
Abstract - Cited by 491 (12 self) - Add to MetaCart
We present a statistical phrase-based translation model that uses hierarchical phrases—phrases that contain subphrases. The model is formally a synchronous context-free grammar but is learned from a bitext without any syntactic information. Thus it can be seen as a shift to the formal machinery of syntax-based translation systems without any linguistic commitment. In our experiments using BLEU as a metric, the hierarchical phrase-based model achieves a relative improvement of 7.5% over Pharaoh, a state-of-the-art phrase-based system.

Citation Context

...l text without relying on any linguistic annotations or assumptions; the result sometimes resembles a syntactician’s grammar but often does not. In this respect it resembles Wu’s bilingual bracketer (Wu, 1997), but ours uses a different extraction method that allows more than one lexical item in a rule, in keeping with the phrase-based philosophy. Our extraction method is basically the same as that of Bloc...

What's in a Translation Rule?

by Michel Galley, Mark Hopkins, Kevin Knight, Daniel Marcu
"... We propose a theory that gives formal semantics to word-level alignments defined over parallel corpora. We use our theory to introduce a linear algorithm that can be used to derive from word-aligned, parallel corpora the minimal set of syntactically motivated transformation rules that explain human ..."
Abstract - Cited by 297 (40 self) - Add to MetaCart
We propose a theory that gives formal semantics to word-level alignments defined over parallel corpora. We use our theory to introduce a linear algorithm that can be used to derive from word-aligned, parallel corpora the minimal set of syntactically motivated transformation rules that explain human translation data.

Scalable inference and training of context-rich syntactic translation models

by Michel Galley, Jonathan Graehl, Kevin Knight, Daniel Marcu, Steve Deneefe, Wei Wang, Ignacio Thayer - In ACL , 2006
"... Statistical MT has made great progress in the last few years, but current translation models are weak on re-ordering and target language fluency. Syntactic approaches seek to remedy these problems. In this paper, we take the framework for acquiring multi-level syntactic translation rules of (Galley ..."
Abstract - Cited by 280 (21 self) - Add to MetaCart
Statistical MT has made great progress in the last few years, but current translation models are weak on re-ordering and target language fluency. Syntactic approaches seek to remedy these problems. In this paper, we take the framework for acquiring multi-level syntactic translation rules of (Galley et al., 2004) from aligned tree-string pairs, and present two main extensions of their approach: first, instead of merely computing a single derivation that minimally explains a sentence pair, we construct a large number of derivations that include contextually richer rules, and account for multiple interpretations of unaligned words. Second, we propose probability estimates and a training procedure for weighting these rules. We contrast different approaches on real examples, show that our estimates based on multiple derivations favor phrasal re-orderings that are linguistically better motivated, and establish that our larger rules provide a 3.63 BLEU point increase over minimal rules.

Citation Context

...important area of future work will be the improvement of our decoder, conjointly with analyses of the impact in terms of BLEU of contextually richer rules. Similarly to (Poutsma, 2000; Wu, 1997; Yamada and Knight, 2001; Chiang, 2005), the rules discussed in this paper are equivalent to productions of synchronous tree substitution grammars. We believe that our tree-to-string model has severa...

Tree-to-String Alignment Template for Statistical Machine Translation

by Yang Liu, et al. , 2006
"... We present a novel translation model based on tree-to-string alignment template (TAT) which describes the alignment between a source parse tree and a target string. A TAT is capable of generating both terminals and non-terminals and performing reordering at both low and high levels. The model is lin ..."
Abstract - Cited by 173 (32 self) - Add to MetaCart
We present a novel translation model based on tree-to-string alignment template (TAT) which describes the alignment between a source parse tree and a target string. A TAT is capable of generating both terminals and non-terminals and performing reordering at both low and high levels. The model is linguistically syntax-based because TATs are extracted automatically from word-aligned, source side parsed parallel texts. To translate a source sentence, we first employ a parser to produce a source parse tree and then apply TATs to transform the tree into a target string. Our experiments show that the TAT-based model significantly outperforms Pharaoh, a state-of-the-art decoder for phrase-based models.

Training Tree Transducers

by Jonathan Graehl, Kevin Knight - IN HLT-NAACL , 2004
"... Many probabilistic models for natural language are now written in terms of hierarchical tree structure. Tree-based modeling still lacks many of the standard tools taken for granted in (finite-state) string-based modeling. The theory of tree transducer automata provides a possible framework to ..."
Abstract - Cited by 132 (12 self) - Add to MetaCart
Many probabilistic models for natural language are now written in terms of hierarchical tree structure. Tree-based modeling still lacks many of the standard tools taken for granted in (finite-state) string-based modeling. The theory of tree transducer automata provides a possible framework to draw on, as it has been worked out in an extensive literature. We motivate the use of tree transducers for natural language and address the training problem for probabilistic tree-to-tree and tree-to-string transducers.

Bootstrapping Parsers via Syntactic Projection across Parallel Texts

by Rebecca Hwa, Philip Resnik, Amy Weinberg, Clara Cabezas, Okan Kolak - Natural Language Engineering , 2005
"... Broad coverage, high quality parsers are available for only a handful of languages. A prerequisite for developing broad coverage parsers for more languages is the annotation of text with the desired linguistic representations (also known as “treebanking”). However, syntactic annotation is a labor in ..."
Abstract - Cited by 129 (3 self) - Add to MetaCart
Broad coverage, high quality parsers are available for only a handful of languages. A prerequisite for developing broad coverage parsers for more languages is the annotation of text with the desired linguistic representations (also known as “treebanking”). However, syntactic annotation is a labor intensive and time-consuming process, and it is difficult to find linguistically annotated text in sufficient quantities. In this article, we explore using parallel text to help solve the problem of creating syntactic annotation in more languages. The central idea is to annotate the English side of a parallel corpus, project the analysis to the second language, and then train a stochastic analyzer on the resulting noisy annotations. We discuss our background assumptions, describe an initial study on the “projectability” of syntactic relations, and then present two experiments in which stochastic parsers are developed with minimal human intervention via projection from English.
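A minimal sketch of the projection step the abstract describes, assuming dependency edges on the English side and one-to-one word alignment links: each English head-dependent pair whose words are both aligned is copied over as an edge between the corresponding foreign words. The names and toy sentence pair are ours; the paper's actual procedure also deals with unaligned and many-to-one words before training a stochastic parser on the projected annotations.

```python
# Project English dependency edges onto the foreign sentence through a
# word alignment (Direct Correspondence Assumption, simplified to
# one-to-one links). Indices are token positions.

def project_dependencies(en_deps, alignment):
    """en_deps: set of (head_idx, dep_idx) pairs over English tokens.
    alignment: dict mapping an English index to one foreign index."""
    projected = set()
    for head, dep in en_deps:
        if head in alignment and dep in alignment:
            projected.add((alignment[head], alignment[dep]))
    return projected

# English "the(0) blue(1) house(2)" with edges house->the and house->blue;
# foreign "la(0) maison(1) bleue(2)", aligned the-la, blue-bleue, house-maison
print(project_dependencies({(2, 0), (2, 1)}, {0: 0, 1: 2, 2: 1}))  # {(1, 0), (1, 2)}
```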

Citation Context

...o the syntactic graph of F. Whether or not stated explicitly, the DCA is actually an underlying assumption in most formal attempts to model cross-language correspondences in syntactic relationships (Wu 1995; Alshawi et al. 2000; Yamada and Knight 2001; Eisner 2003; Gildea 2003; Melamed et al. 2004; Smith and Smith 2004). Table 1 illustrates the principle with the following English-Basque sentence pair a...

Statistical syntax-directed translation with extended domain of locality

by Liang Huang - In Proc. AMTA 2006 , 2006
"... A syntax-directed translator first parses the source-language input into a parsetree, and then recursively converts the tree into a string in the target-language. We model this conversion by an extended treeto-string transducer that have multi-level trees on the source-side, which gives our system m ..."
Abstract - Cited by 121 (14 self) - Add to MetaCart
A syntax-directed translator first parses the source-language input into a parse tree, and then recursively converts the tree into a string in the target language. We model this conversion by an extended tree-to-string transducer that has multi-level trees on the source side, which gives our system more expressive power and flexibility. We also define a direct probability model and use a linear-time dynamic programming algorithm to search for the best derivation. The model is then extended to the general log-linear framework in order to rescore with other features like n-gram language models. We devise a simple-yet-effective algorithm to generate non-duplicate k-best translations for n-gram rescoring. Initial experimental results on English-to-Chinese translation are presented.
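The sketch below mimics the recursive tree-to-string conversion described in the abstract in its simplest form: each rule matches a one-level pattern at a node and emits a target template whose slots are filled by recursively translating the matched children, with an in-order default when no rule matches. The grammar, lexicon, and tree are invented toy data; the paper's transducer matches multi-level source patterns and scores competing derivations, which this sketch omits.

```python
# Toy recursive tree-to-string conversion. Nodes are (label, children) for
# internal nodes and (label, word) for preterminals. Rules are keyed by the
# node label plus its one-level sequence of child labels and give a target
# template over child indices and target terminals.

LEXICON = {"John": "John", "eats": "taberu", "apples": "ringo-o"}

RULES = {
    ("VP", ("VB", "NP")): [1, 0],   # verb-object source order -> object-verb output
    ("S", ("NP", "VP")): [0, 1],    # subject stays first
}

def translate(node):
    label, rest = node
    if isinstance(rest, str):                        # preterminal over a word
        return [LEXICON.get(rest, rest)]
    template = RULES.get((label, tuple(c[0] for c in rest)),
                         list(range(len(rest))))     # default: in-order traversal
    out = []
    for item in template:
        out.extend(translate(rest[item]) if isinstance(item, int) else [item])
    return out

tree = ("S", [("NP", "John"),
              ("VP", [("VB", "eats"), ("NP", "apples")])])
print(" ".join(translate(tree)))   # "John ringo-o taberu"
```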

11,001 new features for statistical machine translation

by David Chiang, Kevin Knight, Wei Wang - In North American Chapter of the Association for Computational Linguistics - Human Language Technologies (NAACL-HLT , 2009
"... We use the Margin Infused Relaxed Algorithm of Crammer et al. to add a large number of new features to two machine translation systems: the Hiero hierarchical phrasebased translation system and our syntax-based translation system. On a large-scale Chinese-English translation task, we obtain statisti ..."
Abstract - Cited by 117 (2 self) - Add to MetaCart
We use the Margin Infused Relaxed Algorithm of Crammer et al. to add a large number of new features to two machine translation systems: the Hiero hierarchical phrase-based translation system and our syntax-based translation system. On a large-scale Chinese-English translation task, we obtain statistically significant improvements of +1.5 Bleu and +1.1 Bleu, respectively. We analyze the impact of the new features and the performance of the learning algorithm.
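For readers unfamiliar with the learning algorithm named in the abstract, here is a hedged sketch of a single 1-best MIRA update: the weights move just far enough that the better hypothesis outscores the model's current prediction by its loss, with the step size capped by an aggressiveness constant C. This is the textbook single-constraint case with invented feature vectors; the paper's actual training works over k-best hypothesis lists and many thousands of features.

```python
# A sketch of one 1-best MIRA (passive-aggressive style) weight update.
# Feature vectors are sparse dicts; `loss` is the cost of the model
# prediction relative to the oracle hypothesis (e.g. a BLEU-based loss).
# Toy setup, not the paper's training loop.

def mira_update(weights, oracle_feats, pred_feats, loss, C=0.01):
    delta = {k: oracle_feats.get(k, 0.0) - pred_feats.get(k, 0.0)
             for k in set(oracle_feats) | set(pred_feats)}
    norm_sq = sum(v * v for v in delta.values())
    if norm_sq == 0.0:
        return weights                                   # identical features: no update
    margin = sum(weights.get(k, 0.0) * v for k, v in delta.items())
    tau = max(0.0, min(C, (loss - margin) / norm_sq))    # closed-form step size
    new_w = dict(weights)
    for k, v in delta.items():
        new_w[k] = new_w.get(k, 0.0) + tau * v
    return new_w

w = mira_update({"lm": 0.5}, {"lm": 1.0, "new_feat": 1.0}, {"lm": 2.0}, loss=1.0)
print(w)  # weights shift toward the oracle's features and away from the prediction's
```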

Citation Context

...ed parallel text, are synchronous CFG productions, for example: X → X1 de X2, X2 of X1. As the number of nonterminals is limited to two, the grammar is equivalent to an inversion transduction grammar (Wu, 1997). The baseline model includes 12 features whose weights are optimized using MERT. Two of the features are n-gram language models, which require intersecting the synchronous CFG with finite-state auto...
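To unpack the inversion transduction grammar equivalence mentioned above: with at most two nonterminals per rule, every derivation can be viewed as a binary tree in which each node either keeps ("straight") or swaps ("inverted") the order of its children on the target side, so a single tree yields both sentences. The tiny sketch below reads both yields off such a tree; the tree encoding and example are ours, not Wu's notation.

```python
# Read the source and target yields off a toy ITG derivation tree. Leaves
# are (src_word, tgt_word) pairs; internal nodes are
# (orientation, left, right) with orientation "straight" or "inverted".
# Inverted nodes swap their children on the target side only.

def yield_of(node, side):
    if len(node) == 2 and all(isinstance(x, str) for x in node):   # leaf word pair
        return [node[0] if side == "src" else node[1]]
    orientation, left, right = node
    l, r = yield_of(left, side), yield_of(right, side)
    if side == "tgt" and orientation == "inverted":
        l, r = r, l
    return l + r

tree = ("straight", ("la", "the"),
        ("inverted", ("maison", "house"), ("bleue", "blue")))
print(yield_of(tree, "src"))  # ['la', 'maison', 'bleue']
print(yield_of(tree, "tgt"))  # ['the', 'blue', 'house']
```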
