Results 1 - 3 of 3
Generative Lexicon Principles for Machine Translation: A Case for Meta-Lexical Structure
- Machine Translation, 1995
Abstract - Cited by 6 (3 self)
This paper addresses two types of mismatch in the translation of reported speech between German and English. The first mismatch is between the repeated use of the reported speech construction in English and the use of the subjunctive in German to indicate continued attribution. The second mismatch concerns the difference in the use of metonymic extensions in the subject position of reported speech.
Determiners and Number in English Contrasted With Japanese, as exemplified . . .
- 2001
Abstract - Cited by 6 (3 self)
The fact that concepts are grammaticalized differently in different languages is a major problem for translation, especially for machine translation. Two major examples of this are syntactic number, and the use of (in)definite articles (a, some, the). In languages such as English, nouns are marked for number and the choice of article (or of no article) must be made for every noun phrase. In contrast, for languages such as Japanese, number distinctions are not normally made, and there are no articles. This means that whenever a noun phrase is translated from Japanese to English, even if the denotation is perfectly understood and a good translation equivalent found, generating the noun phrase still requires two difficult choices: should the head noun be singular or plural, and which article, if any, should be generated. This thesis proposes a semantic representation and a series of three heuristic algorithms that make possible the appropriate generation of articles and number when translating from Japanese to English. The semantic representation provides a tractable set of features to represent (1) the referential use of a noun phrase, as either referential, generic, ascriptive or idiomatic; (2) the interpretation of the noun phrase's referent as either a countable individual or a mass, with seven detailed subtypes; (3) the definiteness of the noun phrase, as either definite, indefinite, definite and extensive, or possessed. The three algorithms automatically acquire values for these features from the analysis of the Japanese text and the lexical properties of the English translation equivalents, and then use them to generate English. The first algorithm determines the referential use of Japanese noun phrases, based on a defeasible hierarchy of pragmatic rules that are applie...
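The three feature dimensions named in this abstract can be pictured as a small record type. The following is only an illustrative sketch, not the thesis's actual representation: the class and field names (`NounPhraseFeatures`, `ReferentialUse`, `Interpretation`, `Definiteness`, `describe`) are hypothetical, and the seven countability subtypes are omitted because the abstract does not list them.

```python
from dataclasses import dataclass
from enum import Enum


class ReferentialUse(Enum):
    # Feature (1): the four referential uses named in the abstract.
    REFERENTIAL = "referential"
    GENERIC = "generic"
    ASCRIPTIVE = "ascriptive"
    IDIOMATIC = "idiomatic"


class Interpretation(Enum):
    # Feature (2): top-level split only; the seven subtypes are not
    # enumerated in the abstract, so they are left out here.
    COUNTABLE_INDIVIDUAL = "countable individual"
    MASS = "mass"


class Definiteness(Enum):
    # Feature (3): the four definiteness values named in the abstract.
    DEFINITE = "definite"
    INDEFINITE = "indefinite"
    DEFINITE_AND_EXTENSIVE = "definite and extensive"
    POSSESSED = "possessed"


@dataclass(frozen=True)
class NounPhraseFeatures:
    """Hypothetical container for the three features of one noun phrase."""
    referential_use: ReferentialUse
    interpretation: Interpretation
    definiteness: Definiteness


def describe(features: NounPhraseFeatures) -> str:
    """Return a short human-readable summary of the feature values."""
    return ", ".join(
        [
            features.referential_use.value,
            features.interpretation.value,
            features.definiteness.value,
        ]
    )


if __name__ == "__main__":
    example = NounPhraseFeatures(
        ReferentialUse.GENERIC, Interpretation.MASS, Definiteness.INDEFINITE
    )
    print(describe(example))
```

A generation component would presumably read such a record to choose number and article, but how the thesis maps feature values to surface forms is not stated in the excerpt above.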
Corpus-based acquisition of head noun countability features
- Master’s thesis, 2002
Abstract - Cited by 5 (0 self)
In recent years, significant advances have been made in the use of corpora as tools in language processing. Lexical acquisition techniques have been somewhat successful in learning verb subcategorization information. Yet much of the other information available from corpora has not been harnessed. The countability property of nouns is one property that would be useful to acquire. Such information could help in word sense disambiguation, in determining appropriate determiners during generation (especially in the case of machine translation), and as a lexicographic resource during dictionary construction. Existing lexical resources which include countability features of nouns have been created largely by hand. Manual tagging of noun countability is expensive in terms of time and labor. It is difficult to extend such resources as new terminology emerges. This thesis presents a method of automatically acquiring countability properties of head nouns. This information is gathered from a part-of-speech tagged corpus, specifically the British National Corpus (BNC). Basic noun phrase chunking is performed on the corpus to obtain head nouns and their accompanying determiner, if any. High-reliability
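The general idea of reading countability off determiner and head noun pairs can be sketched as follows. This is a toy heuristic under stated assumptions, not the method of the thesis: the cue lists, the `min_evidence` threshold, and the function names `tally_cues` and `guess_countability` are all hypothetical, and the input is assumed to be (determiner, head noun) pairs already produced by a chunker.

```python
from collections import Counter, defaultdict

# Assumed cue lists, chosen for illustration only. Real determiners are
# more ambiguous (for example, "little" can also be an adjective).
COUNTABLE_CUES = {"a", "an", "many", "few", "several", "each", "every"}
UNCOUNTABLE_CUES = {"much", "little"}


def tally_cues(pairs):
    """Count countable and uncountable determiner cues per head noun.

    `pairs` is an iterable of (determiner, head_noun); the determiner may
    be None when the noun phrase has none. Uninformative determiners such
    as "the" are ignored.
    """
    tallies = defaultdict(Counter)
    for determiner, noun in pairs:
        det = determiner.lower() if determiner else None
        if det in COUNTABLE_CUES:
            tallies[noun.lower()]["countable"] += 1
        elif det in UNCOUNTABLE_CUES:
            tallies[noun.lower()]["uncountable"] += 1
    return tallies


def guess_countability(tally, min_evidence=2):
    """Turn one noun's cue counts into a coarse label."""
    countable = tally["countable"]
    uncountable = tally["uncountable"]
    if countable + uncountable < min_evidence:
        return "unknown"
    if countable and uncountable:
        return "both"
    return "countable" if countable else "uncountable"


if __name__ == "__main__":
    observed = [
        ("a", "dog"), ("many", "dog"),
        ("much", "water"), ("little", "water"),
        ("the", "idea"),
    ]
    tallies = tally_cues(observed)
    for noun in ("dog", "water", "idea"):
        print(noun, guess_countability(tallies[noun]))
```

Requiring a minimum number of cues before assigning a label is one simple way to favor reliability over coverage; the actual thresholds and cues used in the thesis are not given in the excerpt.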

