12 citations found.
Melamed, I. D. (1997). "Automatic discovery of noncompositional compounds in parallel data".


This paper is cited in the following contexts:
A Simple Hybrid Aligner for Generating Lexical.. - Ahrenberg, Andersson, .. (1998)   (3 citations)  (Correct)

....with 99% precision and 46% recall when trained on 13 million words of the Hansard corpus, where recall was measured as the fraction of words from the bitext that were assigned some translation. Using the same model but less data, a French-English software manual of 400,000 words, Resnik and Melamed (1997) reported 94% precision with 30% recall. While these figures are indeed impressive, more telling figures can only be obtained by measuring the effect of the alignment system on some specific task. Dagan and Church (1994) reports that their Termight system helped double the speed at which terminology lists could be compiled at ....

Melamed, I. Dan. (1997b) "Automatic Discovery of NonCompositional Compounds in Parallel Data." Paper presented at the 2nd Conference on Empirical Methods in Natural Language Processing, Providence.


Parallel Strands: A Preliminary Investigation into Mining the Web .. - Resnik (1998)   (3 citations)  (Correct)

....be accurate enough to apply without human intervention. 1 Introduction In recent years large parallel corpora have taken on an important role as resources in machine translation and multilingual natural language processing, for such purposes as lexical acquisition (e.g. Gale and Church, 1991a; Melamed, 1997), statistical translation models (e.g. Brown et al. 1990; Melamed 1998) and cross-language information retrieval (e.g. Davis and Dunning, 1995; Landauer and Littman, 1990; also see Oard, 1997). However, for all but relatively few language pairs, parallel corpora are available only in relatively ....

.... also see Oard, 1997). However, for all but relatively few language pairs, parallel corpora are available only in relatively specialized forms such as United Nations proceedings (LDC, 1996), religious texts (Resnik, Olsen, and Diab, 1998), and localized versions of software manuals (Resnik and Melamed, 1997). Even for the top dozen or so majority languages, the available parallel corpora tend to be unbalanced, representing primarily governmental and newswire-style texts. In addition, like other language resources, parallel corpora are often encumbered by fees or licensing restrictions. For all these ....

Melamed, I. D. (1997). Automatic discovery of non-compositional compounds in parallel data. In Proceedings of the 2nd Conference on Empirical Methods in Natural Language Processing (EMNLP-97), Brown University.


Rapid Prototyping of Domain-Specific Machine Translation.. - Palmer, Rambow, Nasr (1998)   (Correct)

....The MLB files are ordered so that in case of multiple occurrence of a key, the different entries for that key are ranked. Finally, the MLB files are automatically processed to generate a fast loadable version of the transfer rules. SABLE is a tool for analyzing bilingual corpora (or bitexts) (Melamed, July 1997). SABLE can induce domain-specific bilingual transfer lexicons (Resnik and Melamed, 1997) using a fast algorithm for estimating a partial translation model. A translation model is a set of transfer pairs, consisting of one word from each language which are (in some context in the bitext) a ....

....for that key are ranked. Finally, the MLB files are automatically processed to generate a fast loadable version of the transfer rules. SABLE is a tool for analyzing bilingual corpora (or bitexts) (Melamed, July 1997). SABLE can induce domain-specific bilingual transfer lexicons (Resnik and Melamed, 1997) using a fast algorithm for estimating a partial translation model. A translation model is a set of transfer pairs, consisting of one word from each language which are (in some context in the bitext) a translation of one another. The model's accuracy/coverage trade-off can be directly controlled ....

Melamed, I. D. (July, 1997). Automatic Discovery of Non-Compositional Compounds in Parallel Data. In Proceedings of the ACL-97, Madrid, Spain.


Multilingual Domain Modeling in Twenty-One - Automatic Creation.. - Hiemstra (1998)   (2 citations)  (Correct)

....Pietra, V.J. Della Pietra, and R.L. Mercer 1993) fertility parameters could be added for both languages instead of adding them just for one language. Another interesting approach would be to include a method for correcting the initial tokenisation of the parallel corpus as proposed by Melamed (I.D. Melamed 1997b) in order to extract the multi-word expressions found by the algorithm explicitly. Finally we hope to evaluate the performance of the algorithms on bigger and possibly more noisy corpora than the Agenda 21 corpus. Acknowledgements I would like to thank the following people: Franciska de ....

I.D. Melamed (1997b), Automatic discovery of non-compositional compounds in parallel data. In Second Conference on Empirical Methods in Natural Language Processing (EMNLP'97).


Creating a Parallel Corpus from the "Book of 2000 Tongues" - Resnik, Olsen, Diab   (Correct)

....only a simple 3-level hierarchy of text elements (book, chapter, verse). In our initial pass through the annotation process (see below) we are labeling elements as b (book) c 9 Special issues arise in automatically creating translation lexicons that include non-compositional pairs. See, e.g. (Melamed, 1997) and Section 4.2. 10 hhttp: www rali.iro.umontreal.ca TransSearch TS simple uen.cgii (chapter) and v (verse) producing an intermediate representation that captures the major structural levels without conforming to any particular DTD. The following examples show a single verse, Matthew 1:7, ....

....a multi-way parallel corpus with representation from every language family, with the content carefully translated and nearly sentence-level alignment included. Although it is not the largest of corpora, parallel corpora of significantly smaller size have yielded useful results, e.g. (Resnik and Melamed, 1997), and although its content is more specialized than, say, contemporary newspaper text, it does cover a very wide range of linguistic phenomena and domains of world knowledge; for example, see the range of conceptual categories in the Louw-Nida thesaurus for the New Testament (Louw and Nida, ....

[Article contains additional citation context not shown here]

Melamed, I. Dan. 1997. Automatic discovery of non-compositional compounds in parallel data.


Statistical Machine Translation - Al-Onaizan, Curin, Jahr, Knight.. (1999)   Self-citation (Melamed)   (Correct)

....the vocabulary size and improve statistics. • Phrases. It is frequently possible to locate word sequences that translates as a whole (perhaps noncompositionally). It would be useful for translation models to uncover these and for decoding to take these into account (see [Och et al. 1999; Melamed, 1997]). • Better initial alignments. Models 3, 4, and 5 collect their counts over a small subset of alignments, so it is critical that the simple models used to bootstrap them are accurate, particularly with small data sets (see [Melamed, 1998a]). 11 Acknowledgements Thanks to staff of IFAL, Charles ....

Melamed, I. Dan. 1997. Automatic discovery of non-compositional compounds. In Proceedings of the Second Conference on Empirical Methods in Natural Language Processing.


Empirical Methods for MT Lexicon Development - Melamed (1998)   (1 citation)  Self-citation (Melamed)   (Correct)

No context found.

I. D. Melamed. (1997c) "Automatic Discovery of Non-Compositional Compounds," Proceedings of the Second Conference on Empirical Methods in Natural Language Processing. Providence, RI.


Parallel Text Collections at Linguistic Data Consortium - Ma (1999)   (Correct)

No context found.

Melamed, I. D. (1997). "Automatic discovery of noncompositional compounds in parallel data".


Combining Evidence in Cognate Identification - Kondrak (2004)   (1 citation)  (Correct)

No context found.

I. Dan Melamed. Automatic discovery of non-compositional compounds in parallel data. In Proceedings of the Second Conference on Empirical Methods in Natural Language Processing, pages 97-108, 1997.


Decoding Complexity in Word-Replacement Translation Models - Knight (1999)   (4 citations)  (Correct)

No context found.

I. Dan Melamed. 1997. Automatic discovery of non-compositional compounds. In Proceedings of the Second Conference on Empirical Methods in Natural Language Processing.


Extracting Phrasal Terms using Bitext - Tiedemann   (Correct)

No context found.

I. Dan Melamed. 1997. Automatic Discovery of Non-Compositional Compounds in Parallel Data.


Evaluating Word Alignment Systems - Merkel, Ahrenberg   (Correct)

No context found.

Melamed, I. D. (1997b) Automatic Discovery of Non-Compositional Compounds in Parallel Data. In Proceedings of the 2nd Conference on Empirical Methods in Natural Language Processing (EMNLP'97), Providence, RI.
