S. Chen. Aligning sentences in bilingual corpora using lexical information. In ACL-93, pages 9--16, Columbus, Ohio, 1993.

This paper is cited in the following contexts:

A Geometric Approach to Mapping Bitext Correspondence - Melamed (1996)   (11 citations)  (Correct)

....al. 1991, Gale & Church 1991). However, these algorithms can fumble in bitext sections that contain many sentences of very similar length, like this vote record (English / French): Mr. McInnis / M. McInnis; Yes. / Oui.; Mr. Saunders / M. Saunders; No. / Non.; Mr. Cossitt / M. Cossitt; Yes. / Oui. Source: (Chen 1993). The only way to ensure a correct alignment in such regions is to look at the words. For this reason, Chen (1993) adds a statistical translation model to the Brown et al. alignment algorithm, and Wu (1994) adds a translation lexicon to the Gale & Church alignment algorithm. A set of points of ....

....of very similar length, like this vote record (English / French): Mr. McInnis / M. McInnis; Yes. / Oui.; Mr. Saunders / M. Saunders; No. / Non.; Mr. Cossitt / M. Cossitt; Yes. / Oui. Source: (Chen 1993). The only way to ensure a correct alignment in such regions is to look at the words. For this reason, Chen (1993) adds a statistical translation model to the Brown et al. alignment algorithm, and Wu (1994) adds a translation lexicon to the Gale & Church alignment algorithm. A set of points of correspondence leads to alignment more directly than a translation model or a translation lexicon, because points ....

[Article contains additional citation context not shown here]

S. Chen, "Aligning Sentences in Bilingual Corpora Using Lexical Information," Proceedings of the 31st Annual Meeting of the Association for Computational Linguistics, Columbus, OH, 1993.


Deriving Transfer Rules from Dominance-Preserving.. - Meyers, Yangarber.. (1997)   (1 citation)  (Correct)

....mappings from source substructures to target substructures. 1 Introduction Automatic acquisition of translation rules from parallel sentence-aligned text has taken a variety of forms. Some machine translation (MT) systems treat aligned sentences as unstructured word sequences (see, for example, [4], [2], [3]). Other systems, including our own (cf. [8] and [14]), obtain a syntactic analysis of the sentences (parse) before acquiring transfer rules (cf. [12], [9], [10], [13] and [11]). This approach has the advantage of acquiring structural as well as lexical correspondences. A large, ....

S. Chen. Aligning Sentences in Bilingual Corpora using lexical information. In Proceedings of the 31st Annual Meeting of the Association for Computational Linguistics, 1993.


A Multilingual Procedure for Dictionary-based Sentence.. - Meyers, Kosaka, Grishman (1998)   (Correct)

....encountered any of these cases in our data. Our approach does not handle these, but others can (e.g. [11]). Previously reported alignment procedures score how well sets of source and target sentences match based on sentence length ([2], [7]), automatically acquired lexical information ([13], [4]) and other sources. [24] and [11] combine automatically acquired lexical information with information from an on-line dictionary. Still other work aligns texts at the word level ([18], [19], [14]). In this paper, we propose an efficient and accurate alignment procedure based primarily on on-line ....

S. Chen. Aligning Sentences in Bilingual Corpora using lexical information. In ACL93, pages 9--16, 1993.


Translation Knowledge Acquisition from Cross-Lingually Relevant.. - Utsuro   (Correct)

....within parallel text, and, ii) acquisition of translation knowledge from sentence-aligned parallel text. 2.1.1 Aligning Sentences Phrases Words within Parallel Text Some of earlier works on parallel text alignment studied techniques for aligning parallel text at sentence level. For instance, [Gale93, Chen93, Utsuro94] studied sentence alignment techniques based on dynamic programming, using sentence length and lexical mapping information across languages. [Kay93, Haruno96b] applied iterative refinement algorithms to sentence-level alignment tasks. Or, assuming parallel text being aligned at sentence level, ....

Chen, S. F.: Aligning Sentences in Bilingual Corpora Using Lexical Information, Proc. 31st ACL, pp. 9-16 (1993).


Machine Learning of Language Translation Rules - Tenni, Lehtola, Bounsaythip..   (Correct)

....The optimisation finds the optimal sentence correspondences from the matrix of matching probabilities. Several methods have been applied to the problem, like the Gale and Church method (length-based and alignment-type probability alignment) [7], cognates-based alignment [14] and statistical alignment [4]. For the CL approach, we have developed a new dictionary-based method for sentence alignment. With the dictionary, we onwards mean a bilingual lexicon providing translation equivalencies of words. Sentence segmentation heuristics of Webtran, is based on dot neighbourhood as the meaning of dot is ....

S. F. Chen, "Aligning Sentences in Bilingual Corpora Using Lexical Information", Proceedings of the 31st Annual Meeting of the Association for Computational Linguistics, Columbus, Ohio, 1993, pp. 9-16.


Sentence Alignment in Parallel Corpora: The Asahi Corpus.. - Collier, K.Takahashi (1995)   (Correct)

....Scientific American articles for English and German. They found that with sentences over 150 words their alignment method was almost 100 percent successful. With sentences of fewer words this dropped to about 50 percent. Methods based on the statistical bilingual distribution of words (e.g. [Chen, 1993]) encounter a different set of problems. The problem of data sparseness is significant, even in very large corpora such as the Hansard (90 million words) because the size of the lexicon is not bounded (closed) with the number of sentences. This means that there may never be enough evidence about ....

....same word many times in a text. Similarly, a good translator does not always translate a word the same way every time it occurs. We can see that in order to aid statistic based alignment approaches, particularly at the word and phrase level, we need to constrain the text to a sublanguage. Chen [Chen, 1993] overcomes the data sparseness problem for sentence alignment by computing possible word alignments, expressed as probabilities. These probabilities are then thresholded to remove spurious ones. For each word pair in a source and target sentence the probability of their lexical correspondence is ....

Chen, S. (1993). Aligning sentences in bilingual corpora using lexical information. 31st Annual Meeting of the Association for Computational Linguistics, Ohio, USA.


Bilingual Text Matching Using Bilingual Dictionary.. - Utsuro, Ikeda.. (1994)   (7 citations)  (Correct)

....as machine translation. One of the major approaches to analyzing bilingual texts is the statistical approach. The statistical approach involves the following: alignment of bilingual texts at the sentence level using statistical techniques (e.g. Brown, Lai and Mercer (1991), Gale and Church (1993), Chen (1993), and Kay and Roscheisen (1993)), statistical machine translation models (e.g. Brown, Cocke, Pietra, Pietra et al. 1990), finding character-level, word-level and phrase-level correspondences from bilingual texts (e.g. Gale and Church (1991), Church (1993) and Kupiec (1993)), and word sense ....

....bilingual dictionaries, and depends solely upon statistics. For example, sentence alignment of bilingual texts is performed just by measuring sentence lengths in words or in characters (Brown et al. 1991; Gale and Church, 1993) or by statistically estimating word-level correspondences (Chen, 1993; Kay and Roscheisen, 1993). The statistical approach analyzes unstructured sentences in bilingual texts, and it is claimed that the results are useful enough in real applications such as machine translation and word sense disambiguation. However, structured bilingual sentences are undoubtedly ....

[Article contains additional citation context not shown here]

Chen, S. F. (1993). Aligning sentences in bilingual corpora using lexical information, Proceedings of the 31st Annual Meeting of ACL, pp. 9--16.


A Geometric Approach to Mapping Bitext Correspondence - Melamed (1996)   (11 citations)  (Correct)

....[G&C91]. However, these algorithms can fumble in bitext sections that contain many sentences of very similar length, like this vote record (English / French): Mr. McInnis / M. McInnis; Yes. / Oui.; Mr. Saunders / M. Saunders; No. / Non.; Mr. Cossitt / M. Cossitt; Yes. / Oui. Source: [Che93]. The only way to ensure a correct alignment in such regions is to look at the words. For this reason, Chen [Che93] adds a statistical translation model to the Brown et al. alignment algorithm, and Wu [Wu94] adds a translation lexicon to the Gale & Church alignment algorithm. A set of points of ....

....like this vote record (English / French): Mr. McInnis / M. McInnis; Yes. / Oui.; Mr. Saunders / M. Saunders; No. / Non.; Mr. Cossitt / M. Cossitt; Yes. / Oui. Source: [Che93]. The only way to ensure a correct alignment in such regions is to look at the words. For this reason, Chen [Che93] adds a statistical translation model to the Brown et al. alignment algorithm, and Wu [Wu94] adds a translation lexicon to the Gale & Church alignment algorithm. A set of points of correspondence leads to alignment more directly than a translation model or a translation lexicon, because points of ....

[Article contains additional citation context not shown here]

S. Chen, "Aligning Sentences in Bilingual Corpora Using Lexical Information," Proceedings of the 31st Annual Meeting of the Association for Computational Linguistics, Columbus, OH, 1993.


Grammarless Extraction of Phrasal Translation Examples from.. - Wu (1995)   (10 citations)  (Correct)

....translation flexibility is strongly restricted if the examples are only at the sentential level. It can now be assumed that a parallel bilingual corpus may be aligned to the sentence level with reasonable accuracy (Kay & Roscheisen 1988; Catizone et al. 1989; Gale & Church 1991; Brown et al. 1991; Chen 1993), even for languages as disparate as Chinese and English (Wu 1994). Algorithms for subsentential alignment have been developed as well at granularities of the character (Church 1993), word (Dagan et al. 1993; Fung & Church 1994; Fung & McKeown 1994), collocation (Smadja 1992) and ....

CHEN, STANLEY F. 1993. Aligning sentences in bilingual corpora using lexical information. In Proceedings of the 31st Annual Conference of the Association for Computational Linguistics, 9--16, Columbus, OH.


Iterative Alignment of Syntactic Structures for a Bilingual Corpus - Grishman (1994)   (5 citations)  (Correct)

....valuable information about the source and target languages and about bilingual correspondences. This alignment can be done at several levels. There is already a considerable literature on performing sentence-level alignment and identifying word-level correspondences (for example, [Church 93], [Chen 93] and works cited therein). Our own work starts with a corpus which has been aligned at the sentence level, and considers the problem of alignment at the level of regularized syntactic structure, a level corresponding approximately to deep structure or the F structure of lexical-functional ....

....of role correspondences is then used in the next pass in aligning the text. Analogous iterative algorithms have been described for sentence alignment, in which an initial alignment is used to estimate lexical correspondence probabilities, and these are then used to obtain an improved alignment [Chen 93]. Through a series of such iterations, the coverage of the bilingual dictionary and table of role correspondences is gradually increased until a limiting state is reached. This is reflected in gradually improving scores on the parsing metric, as shown in Table 2. We began by using only ....

S. Chen, Aligning sentences in bilingual corpora using lexical information. Proc. 31st Annl. Meeting Assn. Computational Linguistics, Columbus, Ohio, June 1993, 9-16.


Bilingual Sentence Alignment: Balancing Robustness And Accuracy - Michel Simard (1996)   (4 citations)  (Correct)

....techniques: dictionaries, grammars, semantic networks, stochastic language models, common sense reasoning, intelligent agents you name it. So far, the most promising avenues in dealing with this problem make use of stochastic translation models. For example, to compute sentence alignments, Chen [4] replaces the simple length based models of earlier methods by a more elaborate model that takes into account the words of the text. Dagan et al. [6] use a similar model to obtain word level mappings. To this day, most research on the BCP has focussed on either one of these two problems (robustness ....

....it is not in line with its neighbors (smoothing) A B C D E F A B C D E F Second step: Final Sentence Alignment From this point on, any sentence alignment program that is capable of working within such a restricted search space can be used to finish up the job. Following the ideas of Chen [4] and Dagan et al. [6], we have developed a method that could probably be referred to as heavy artillery in this context: it is based on a statistical lexical translation model, namely Brown et al.'s Model 1 [1]. Essentially, the model consists in a set of parameters , that estimate the ....

[Article contains additional citation context not shown here]

Chen, Stanley F. (1993), "Aligning Sentences in Bilingual Corpora Using Lexical Information", in Proceedings of ACL-93, Columbus OH.


A Statistical View on Bilingual Lexicon Extraction: From Parallel.. - Fung (1998)   (5 citations)  (Correct)

....the same sentences on both sides. Once the corpus is aligned sentence by sentence, it is possible to learn the mapping between the bilingual words in these sentences. Sometimes lexicon extraction is a by-product of alignment algorithms aimed at constructing a statistical translation model [2, 4, 12, 23, 32]. Others such as [6, 7] use an EM-based model to align words in sentence pairs in order to obtain a technical lexicon. Some other algorithms use sentence-aligned parallel texts to further compile a bilingual lexicon of technical words or terms using similarity measures on bilingual lexical pairs ....

Stanley Chen. Aligning sentences in bilingual corpora using lexical information. In Proceedings of the 31st Annual Conference of the Association for Computational Linguistics, pages 9--16, Columbus, Ohio, June 1993.


Learning An English-Chinese Lexicon From A Parallel Corpus - Wu (1994)   (13 citations)  (Correct)

....are relatively scarce compared with monolingual corpora, they have generated more interesting results. Significant progress has been made on problems including automatic sentence alignment (Kay & Roscheisen 1988; Catizone et al. 1989; Gale & Church 1991; Brown et al. 1991; Chen 1993), coarse alignment (Church 1993; Fung & Church 1994), statistical machine translation (Brown et al. 1990; Brown et al. 1993), word alignment (Dagan et al. 1993), word sense disambiguation (Gale et al. 1993) and collocation learning (Smadja & McKeown 1994), all exploiting parallel corpora. To ....

Chen, Stanley F. 1993. Aligning sentences in bilingual corpora using lexical information. In Proceedings of the 31st Annual Conference of the Association for Computational Linguistics, 9--16, Columbus, OH.


Aligning Noisy Parallel Corpora Across Language Groups: Word.. - Fung, McKeown (1994)   (15 citations)  (Correct)

....between word pairs. This algorithm produces a small bilingual lexicon which provides anchor points for alignment. 1 Introduction While much work has already been done on the automatic alignment of parallel corpora (Brown et al. 1991; Kay & Roscheisen 1993; Gale & Church 1993; Church 1993; Chen 1993; Wu 1994), there are several problems which have not been fully addressed by many of these alignment algorithms. First, many corpora are noisy; segments from the source language can be totally missing from the target language or can be substituted with a target language segment which is not a ....

....ignore a chunk of text in either source or target. Second, most previous alignment programs have been developed and tuned for aligning European language pairs. The particular corpora used contain sentence boundaries and these form the anchor points for aligning text (e.g. (Brown et al. 1991) and (Chen 1993) use sentence boundaries to align French and English Hansard, while (Kay & Roscheisen 1993) aligns English and German Scientific American with sentence boundaries at the sentence level). While there has been some work on aligning English with Asian languages (Wu 1994), it also relies on sentence ....

CHEN, STANLEY. 1993. Aligning sentences in bilingual corpora using lexical information. In Proceedings of the 31st Annual Conference of the Association for Computational Linguistics, 9--16, Columbus, Ohio.


A Pattern Matching Method for Finding Noun and Proper Noun.. - Fung   (6 citations)  (Correct)

....Kumano & Hirakawa 1994; Dagan et al. 1993; Wu & Xia 1994) all attempt to extract pairs of words or compounds that are translations of each other from previously sentence-aligned, parallel texts. However, sentence alignment (Brown et al. 1991; Kay & Roscheisen 1993; Gale & Church 1993; Church 1993; Chen 1993; Wu 1994) is not always practical when corpora have unclear sentence boundaries or with noisy text segments present in only one language. Our proposed algorithm for bilingual lexicon acquisition bootstraps off of corpus alignment procedures we developed earlier (Fung & Church 1994; Fung & McKeown ....

Chen, Stanley. 1993. Aligning sentences in bilingual corpora using lexical information. In Proceedings of the 31st Annual Conference of the Association for Computational Linguistics, 9--16, Columbus, Ohio.


Semi-Automatic Acquisition of Domain-Specific Translation.. - Resnik, Melamed (1997)   (5 citations)  (Correct)

....bitext maps a few points at a time, by interleaving a point generation phase and a point selection phase. SIMR is equipped with several plug-in matching heuristic modules which are based on cognates (Davis et al. 1995; Simard et al. 1992; Melamed, 1995) and/or seed translation lexicons (Chen, 1993). Correspondence points are generated using a subset of these matching heuristics; the particular subset depends on the language pair and the available resources. The matching heuristics all work at the word level, which is a happy medium between larger text units like sentences and smaller text ....

S. Chen. 1993. "Aligning Sentences in Bilingual Corpora Using Lexical Information". Proceedings of the 31st Annual Meeting of the Association for Computational Linguistics, Columbus, OH.


Parallel Web Text Mining for Cross-Language IR - Chen, Nie (2000)   (1 citation)  Self-citation (Chen)   (Correct)

....text because of the great difference between the syntactic structures and writing systems of the two languages. A number of alignment techniques have been proposed, varying from statistical methods (Brown, Lai, and Mercer, 1991; Gale and Church, 1991) to lexical methods (Kay and Röscheisen, 1993; Chen, 1993). What we adopted is the method of Simard et al. (1992). By considering both length similarity and cognateness as alignment criteria, the method is more robust dealing with noises than pure length-based methods. Cognates are identical sequences of characters in corresponding words in two ....

Chen, S. F. 1993. Aligning sentences in bilingual corpora using lexical information. In Proceedings of the 31st Annual Meeting of the Association for Computational Linguistics, pages 9-16, Columbus, Ohio.


Aligning Sentences In Bilingual Corpora Using Lexical Information - Chen (1993)   (26 citations)  Self-citation (Chen)   (Correct)

....do not align one to one. Sometimes sentences align many to one, and often there are deletions in one of the supposedly parallel corpora of a bilingual corpus. These deletions can be substantial; in the Canadian Hansard corpus, there are many deletions of several thousand sentences and one deletion of over 90,000 sentences. [Footnote: An abridged version of this paper appears in (Chen, 1993).] Previous work includes (Brown et al. 1991b) and (Gale and Church, 1991). In Brown, alignment is based solely on the number of words in each sentence; the actual identities of words are ignored. The general idea ....

Stanley F. Chen. Aligning sentences in bilingual corpora using lexical information. In Proceedings of the 31st Annual Meeting of the ACL, 1993. To appear.


Building Probabilistic Models for Natural Language - Chen (1996)   (18 citations)  Self-citation (Chen)   (Correct)

....Probabilistic Models for Natural Language A thesis presented by Stanley F. Chen to The Division of Applied Sciences in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the subject of Computer Science Harvard University Cambridge, Massachusetts May ©1996 by Stanley F. Chen All rights reserved. ii Abstract Building models of language is ....

....Probabilistic Models for Natural Language A thesis presented by Stanley F. Chen to The Division of Applied Sciences in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the subject of Computer Science Harvard University Cambridge, Massachusetts May ©1996 by Stanley F. Chen All rights reserved. ii Abstract Building models of language is a central task in natural language processing. Traditionally, language has been modeled with manually constructed grammars that describe which strings are grammatical and which are not; however, with the recent availability of ....

Stanley F. Chen. 1993. Aligning sentences in bilingual corpora using lexical information.


Clause Alignment for Hong Kong Legal Texts: A.. - Kit, Webster, Sin, Pan, ..   (Correct)

No context found.

S. Chen. Aligning sentences in bilingual corpora using lexical information. In ACL-93, pages 9--16, Columbus, Ohio, 1993.


A Robust Cross-Style Bilingual Sentences Alignment Model - Tz-Liang Kueng Keh-Yih   (Correct)

No context found.

Stanley F. Chen, (1993). "Aligning Sentences in Bilingual Corpora Using Lexical Information", Proceedings of the 31st Annual Meeting, pp. 9-16, 22-26 June 1993, Ohio State University, Columbus, Ohio, USA.


An Alignment Method for Noisy Parallel Corpora based on Image.. - Chang, Chen (1997)   (1 citation)  (Correct)

No context found.

Chen, Stanley F., (1993). Aligning Sentences in Bilingual Corpora Using Lexical Information, In Proceedings of the 31st Annual Meeting of the Association for Computational Linguistics (ACL-93), 9-16, Ohio, USA.


PENS: A Machine-aided English Writing System for Chinese.. - Liu, Zhou, Gao, Xun, Huang   (1 citation)  (Correct)

No context found.

Chen, Stanley F. (1993). Aligning sentences in bilingual corpora using lexical information. In Proceedings of the 31st Annual Conference of the Association for Computational Linguistics, 9-16, Columbus, OH.


Finding Terminology Translations From Non-Parallel Corpora - Fung, McKeown (1997)   (3 citations)  (Correct)

No context found.

Stanley Chen. 1993. Aligning sentences in bilingual corpora using lexical information. In Proceedings of the 31st Annual Conference of the Association for Computational Linguistics, pages 9--16, Columbus, Ohio, June.


A Portable Algorithm for Mapping Bitext Correspondence - Melamed (1997)   (4 citations)  (Correct)

No context found.

S. Chen, "Aligning Sentences in Bilingual Corpora Using Lexical Information," Proceedings of the 31st Annual Meeting of the Association for Computational Linguistics, Columbus, OH, 1993.


CiteSeer.IST - Copyright Penn State and NEC