Findings of the 2012 workshop on statistical machine translation.
In Proc. of WMT, 2012
Cited by 46 (7 self)
Abstract: This paper presents the results of the WMT12 shared tasks, which included a translation task, a task for machine translation evaluation metrics, and a task for run-time estimation of machine translation quality. We conducted a large-scale manual evaluation of 103 machine translation systems submitted by 34 teams. We used the ranking of these systems to measure how strongly automatic metrics correlate with human judgments of translation quality for 12 evaluation metrics. We introduced a new quality estimation task this year, and evaluated submissions from 11 teams.
Modelling and Optimizing on Syntactic N-Grams for Statistical Machine Translation.
Transactions of the Association for Computational Linguistics, 2015
Cited by 2 (1 self)
Abstract: The role of language models in SMT is to promote fluent translation output, but traditional n-gram language models are unable to capture fluency phenomena between distant words, such as some morphological agreement phenomena, subcategorisation, and syntactic collocations with string-level gaps. Syntactic language models have the potential to fill this modelling gap. We propose a language model for dependency structures that is relational rather than configurational and thus particularly suited for languages with a (relatively) free word order. It is trainable with neural networks, and not only improves over standard n-gram language models, but also outperforms related syntactic language models. We empirically demonstrate its effectiveness in terms of perplexity and as a feature function in string-to-tree SMT from English to German and Russian. We also show that using a syntactic evaluation metric to tune the log-linear parameters of an SMT system further increases translation quality when coupled with a syntactic language model.
Towards a Predicate-Argument Evaluation for MT
In Proceedings of the Sixth Workshop on Syntax, Semantics and Structure in Statistical Translation, 2012
"... Abstract HMEANT ..."
GRAFIX: Automated Rule-Based Post Editing System to Improve English-Persian SMT Output
Cited by 1 (0 self)
This paper describes the latest developments in the PeEn-SMT system, specifically covering experiments with Grafix, an APE component developed for PeEn-SMT. The success of well-designed SMT systems has made this approach one of the most popular MT approaches. However, MT output is often seriously grammatically incorrect. This is more prevalent in SMT since this approach is not language-specific. This system works with Persian, a morphologically rich language, so post-editing output is an important step in maintaining translation fluency. Grafix performs a range of corrections on sentences, from lexical transformation to complex syntactical rearrangement. It analyzes the target sentence (the SMT output in Persian) and attempts to correct it by applying a number of rules which enforce consistency with Persian grammar. We show that the proposed system is able to improve the quality of state-of-the-art English-Persian SMT systems, yielding promising results from both automatic and manual evaluation techniques.
Czech Machine Translation in the project CzechMATE
The Prague Bulletin of Mathematical Linguistics
Abstract: We present various achievements in statistical machine translation from English, German, Spanish and French into Czech. We discuss specific properties of the individual source languages and describe techniques that exploit these properties and address language-specific errors. Besides the translation proper, we also present our contribution to error analysis.
A Three-Layer Architecture for Automatic Post-Editing System Using Rule-Based Paradigm
This paper proposes a post-editing model built around Grafix, our three-level rule-based automatic post-editing engine, which refines the output of machine translation systems. The corrections it applies to sentences range from lexical transformation to complex syntactical rearrangement. Experimental results from both manual and automatic evaluations show that the proposed system is able to improve the quality of our state-of-the-art English-Persian SMT system.
Lost and Found in Translation: Cross-Lingual Question Answering with Result Translation
2012
Using cross-lingual question answering (CLQA), users can find information in languages that they do not know. In this thesis, we consider the broader problem of CLQA with result translation, where answers retrieved by a CLQA system must be translated back to the user’s language by a machine translation (MT) system. This task is challenging because answers must be both relevant to the question and adequately translated in order to be correct. In this work, we show that integrating the MT closely with cross-lingual retrieval can improve result relevance and we further demonstrate that automatically correcting errors in the MT output can improve the adequacy of translated results. To understand the task better, we undertake detailed error analyses examining the impact of MT errors on CLQA with result translation. We identify which MT errors are most detrimental to the task and how different cross-lingual information retrieval (CLIR) systems respond to different kinds of MT errors. We describe two main types of CLQA errors caused by MT errors: lost in retrieval errors, where relevant results are not returned, and lost in translation errors, where relevant results are perceived irrelevant due to inadequate MT. To address the lost in retrieval errors, we introduce two novel models for cross-lingual informa-
Machine Translation within One Language as a Paraphrasing Technique
Abstract: We present a method for improving machine translation (MT) evaluation by targeted paraphrasing of reference sentences. For this purpose, we employ MT systems themselves and adapt them for translating within a single language. We describe this attempt on two types of MT systems – phrase-based and rule-based. Initially, we experiment with the freely available SMT system Moses. We create translation models from two available sources of Czech paraphrases – Czech WordNet and the Meteor Paraphrase tables. We extended Moses by a new feature that makes the translation targeted. However, the results of this method are inconclusive. In view of the errors appearing in the new paraphrased sentences, we propose another solution – targeted paraphrasing using parts of a rule-based translation system included in the NLP framework Treex.