Source-Language Entailment Modeling for Translating Unknown Terms
Cited by 21 (3 self)
This paper addresses the task of handling unknown terms in SMT. We propose using source-language monolingual models and resources to paraphrase the source text prior to translation. We further present a conceptual extension to prior work by allowing translations of entailed texts rather than paraphrases only. A method for performing this process efficiently is presented and applied to some 2500 sentences with unknown terms. Our experiments show that the proposed approach substantially increases the number of properly translated texts.
Improvements in analogical learning: application to translating multi-terms of the medical domain
- In Proceedings of the 12th Conference of the European Chapter of the ACL (EACL), 2009
Cited by 8 (0 self)
Handling terminology is an important matter in a translation workflow. However, current Machine Translation (MT) systems do not yet propose anything proactive upon tools which assist in managing terminological databases. In this work, we investigate several enhancements to analogical learning and test our implementation on translating medical terms. We show that the analogical engine works equally well when translating from and into a morphologically rich language, or when dealing with language pairs written in different scripts. Combining it with a phrase-based statistical engine leads to significant improvements.
Scaling up analogical learning
- In Proceedings of the 22nd International Conference on Computational Linguistics (COLING), 2008
Cited by 5 (1 self)
Recent years have witnessed a growing interest in analogical learning for NLP applications. If the principle of analogical learning is quite simple, it does involve complex steps that seriously limit its applicability, the most computationally demanding one being the identification of analogies in the input space. In this study, we investigate different strategies for efficiently solving this problem and study their scalability.
Formal models of analogical proportions
, 2006
Cited by 3 (0 self)
Natural Language Processing (NLP) applications rely, in an increasing number of operational contexts, on machine learning mechanisms which are able to extract, in an entirely automated manner, linguistic regularities from annotated corpora. Among these, analogical learning is characterized by the systematic exploitation, in a symbolic machine learning apparatus, of formal proportionality relationships that exist between training instances. In this paper, we propose a general definition of these proportionality relationships, based on a generic algebraic framework. This definition is specialized to handle a number of representations that are commonly encountered in NLP applications, such as words over a finite alphabet, feature structures, labeled trees, etc. In each of these cases, we provide and discuss algorithms for answering the two main computational challenges posed by proportionality relationships: the validation of a proportion and the computation of the fourth term of a proportion.
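As a rough illustration of the two tasks named in this abstract (validating a proportion and computing its fourth term), the following Python sketch handles only the narrow case of suffix substitution on plain strings. It is not the algebraic framework the paper defines, and the function names are hypothetical.

```python
from typing import Optional


def common_prefix_len(x: str, y: str) -> int:
    """Length of the longest common prefix of x and y."""
    n = 0
    for cx, cy in zip(x, y):
        if cx != cy:
            break
        n += 1
    return n


def solve_suffix_analogy(a: str, b: str, c: str) -> Optional[str]:
    """Compute d such that a : b :: c : d, assuming a and b differ
    only by a suffix swap (a simplifying assumption of this sketch).
    Returns None when c does not end with the suffix removed from a."""
    k = common_prefix_len(a, b)
    old_suffix, new_suffix = a[k:], b[k:]
    if not c.endswith(old_suffix):
        return None
    stem = c[: len(c) - len(old_suffix)]
    return stem + new_suffix


def is_suffix_proportion(a: str, b: str, c: str, d: str) -> bool:
    """Validate a : b :: c : d under the same suffix-only assumption."""
    return solve_suffix_analogy(a, b, c) == d
```

For example, `solve_suffix_analogy("reader", "readable", "doer")` yields `"doable"`, while a third term that does not end in `er` has no solution under this restriction. Proportions involving prefixes, infixes, or interleaved material need the more general treatment the abstract describes.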
Unsupervised and semi-supervised morphological analysis for Information Retrieval in the biomedical ...
unknown title
Acquisition of morphological families and derivational series from a machine readable dictionary
Using character overlap to improve language transformation
Language transformation can be defined as translating between diachronically distinct language variants. We investigate the transformation of Middle Dutch into Modern Dutch by means of machine translation. We demonstrate that by using character overlap the performance of the machine translation process can be improved for this task.
unknown title
A corpus study on the number of true proportional analogies between chunks in two typologically different languages
Using a Random Forest Classifier to recognise translations of biomedical terms across languages
We present a novel method to recognise semantic equivalents of biomedical terms in language pairs. We hypothesise that biomedical terms are formed by semantically similar textual units across languages. Based on this hypothesis, we employ a Random Forest (RF) classifier that is able to automatically mine higher order associations between textual units of the source and target language when trained on a corpus of both positive and negative examples. We apply our method to two language pairs: one that uses the same character set and another with a different script, English-French and English-Chinese, respectively. We show that English-French pairs of terms are highly transliterated in contrast to the English-Chinese pairs. Nonetheless, our method performs robustly in both cases. We evaluate RF against a state-of-the-art alignment method, GIZA++, and we report a statistically significant improvement. Finally, we compare RF against Support Vector Machines and analyse our results.