65 citations found.
Brown, Peter, Stephen Della Pietra, Vincent Della Pietra, and Robert Mercer (1991). "Word-Sense Disambiguation Using Statistical Methods", in Proceedings of ACL-91. Berkeley, CA.

This paper is cited in the following contexts:

An Algorithm For Finding Noun Phrase Correspondences In Bilingual .. - Kupiec (1993)   (25 citations)  (Correct)

....Improvements to the basic algorithm are described, which enable context to be accounted for when constructing the noun phrase mappings. INTRODUCTION Areas of investigation using bilingual corpora have included the following: Automatic sentence alignment [Kay and Röscheisen, 1988, Brown et al. 1991a, Gale and Church, 1991b] Word sense disambiguation [Dagan et al. 1991, Brown et al. 1991b, Church and Gale, 1991] Extracting word correspondences [Gale and Church, 1991a] Finding bilingual collocations [Smadja, 1992] Estimating parameters for statistically based machine translation ....

....for when constructing the noun phrase mappings. INTRODUCTION Areas of investigation using bilingual corpora have included the following: Automatic sentence alignment [Kay and Röscheisen, 1988, Brown et al. 1991a, Gale and Church, 1991b] Word sense disambiguation [Dagan et al. 1991, Brown et al. 1991b, Church and Gale, 1991] Extracting word correspondences [Gale and Church, 1991a] Finding bilingual collocations [Smadja, 1992] Estimating parameters for statistically based machine translation [Brown et al. 1992] The work described here makes use of the aligned Canadian Hansards [Gale ....

P. F. Brown, S. A. Della Pietra, V. J. Della Pietra, and R. L. Mercer. Word sense disambiguation using statistical methods. In Proceedings of the 29th Annual Meeting of the Association of Computational Linguistics, pages 264-270, Berkeley, CA, June 1991.


Word-Sense Disambiguation Using Decomposable Models - Bruce, Wiebe (1994)   (43 citations)  (Correct)

....have sufficient statistics that are lower order marginal distributions. In the future, we will investigate other goodness of fit tests ([18] [1] [22]) that are perhaps more appropriate for sparse data. The Experiment Unlike several previous approaches to word sense disambiguation ([29] [5], [7] [10]) nothing in this approach limits the selection of sense tags to a particular number or type of meaning distinctions. In this study, our goal was to address a non trivial case of ambiguity, but one that would allow some comparison of results with previous work. As a result of these ....

Brown, P., Della Pietra, S., Della Pietra, V., and Mercer, R. (1991). Word Sense Disambiguation Using Statistical Methods. Proceedings of the 29th Annual Meeting of the Association for Computational Linguistics (ACL-91), pp. 264-304.


Statistical Models for Deep-structure Disambiguation - Chiang, Su (1996)   (Correct)

....achieves disambiguation by using a parameterized model, in which the parameters are estimated and tuned from a training corpus. In such a way, the system can be easily scaled up and well trained based on the well established theories. However, statistical approaches reported in the literature [4,5,6,7] usually use only surface level information, e.g. collocations and word associations, without taking structure information, such as syntax and thematic role, into consideration. In general, the structure features that 113 characterize long distance dependency, can provide more relevant ....

Peter F. Brown, Stephen A. Della Pietra, Vincent J. Della Pietra, and Robert L. Mercer. "Wordsense disambiguation using statistical methods." In Proceedings of the 29th Annual Meeting of the Association for Computational Linguistics, pages 264-270, 1992.


Aligning Sentences In Bilingual Corpora Using Lexical Information - Chen (1993)   (26 citations)  (Correct)

....Harvard University Cambridge, MA 02138 Internet: sfc calliope.harvard.edu Abstract In this paper, we describe a fast algorithm for aligning sentences with their translations in a bilingual corpus. Existing efficient algorithms ignore word identities and only consider sentence length (Brown et al. 1991b; Gale and Church, 1991) Our algorithm constructs a simple statistical word to word translation model on the fly during alignment. We find the alignment that maximizes the probability of generating the corpus with this translation model. We have achieved an error rate of approximately 0.4 on ....

....independent. 1 Introduction In this paper, we describe an algorithm for aligning sentences with their translations in a bilingual corpus. Aligned bilingual corpora have proved useful in many tasks, including machine translation (Brown et al. 1990; Sadler, 1989) sense disambiguation (Brown et al. 1991a; Dagan et al. 1991; Gale et al. 1992) and bilingual lexicography (Klavans and Tzoukermann, 1990; Warwick and Russell, 1990) The task is difficult because sentences frequently do not align one to one. Sometimes sentences align many to one, and often there are deletions in The author wishes ....

[Article contains additional citation context not shown here]

Peter F. Brown, Stephen A. DellaPietra, Vincent J. DellaPietra, and Robert L. Mercer. Word sense disambiguation using statistical methods. In Proceedings 29th Annual Meeting of the ACL, pages 265-270, Berkeley, CA, June 1991.


Information Retrieval Based on Word Senses - Schütze, Pedersen (1995)   (1 citation)  (Correct)

.... Kelly and Stone [22] consider hand constructed disambiguation rules, Lesk [27] Krovetz and Croft [24] and Guthrie et al. [15] use online dictionaries, Hirst [20] constructs knowledge bases, Cottrell [4] uses syntactic and semantic structure encoded in a connectionist net, Brown et al. [1] and Church and Gale [3] exploit bilingual corpora, Dagan et al. [7] use a bilingual dictionary, Hearst [18] and Leacock et al. [26] exploit a hand labeled training set, and Yarowsky [41] performs a computation based on Roget's thesaurus. McRoy [28] investigates how multiple knowledge sources ....

Peter F. Brown, Stephen A. Della Pietra, Vincent J. Della Pietra, and Robert L. Mercer. Word-sense disambiguation using statistical methods. In Proceedings of ACL 29, 1991.


A Probabilistic Approach to Lexical Semantic Knowledge Acquisition.. - Li (1998)   (Correct)

....fashion. This algorithm has the advantage of being able to handle a large set of features, and at the same time not ordinarily be affected by features that are irrelevant to the disambiguation decision. See (Golding and Roth, 1996) For word sense disambiguation methods, see also (Black, 1988; Brown et al. 1991; Guthrie et al. 1991; Gale, Church, and Yarowsky, 1992; McRoy, 1992; Leacock, Towell, and Voorhees, 1993; Yarowsky, 1993; Bruce and Wiebe, 1994; Niwa and Nitta, 1994; Voorhees, Leacock, and Towell, 1995; Yarowsky, 1995; Golding and Schabes, 1996; Ng and Lee, 1996; Fujii et al. 1996; Schutze, ....

Brown, Peter, Stephen Della Pietra, Vincent Della Pietra, and Robert Mercer. 1991. Word sense disambiguation using statistical methods. Proceedings of the 29th Annual Meeting of the Association for Computational Linguistics, pages 264--270.


Word-Sense Disambiguation Using Decomposable Models - Bruce, Wiebe (1994)   (43 citations)  (Correct)

....have sufficient statistics that are lower order marginal distributions. In the future, we will investigate other goodness of fit tests ([18] [1] [22]) that are perhaps more appropriate for sparse data. The Experiment Unlike several previous approaches to word sense disambiguation ([29] [5], [7] [10]) nothing in this approach limits the selection of sense tags to a particular number or type of meaning distinctions. In this study, our goal was to address a non trivial case of ambiguity, but one that would allow some comparison of results with previous work. As a result of these ....

....Previous Work Many researches have avoided characterizing the interactions among multiple contextual features by considering only one feature in determining the sense of an ambiguous word. Techniques for identifying the optimum feature to use in disambiguating a word are presented in [7] [30] and [5]. Other works consider multiple contextual features in performing disambiguation without formally characterizing the relationships among the features. The majority of these efforts ([13] [31]) weight each feature in predicting the sense of an ambiguous word in accordance with frequency ....

Brown, P., Della Pietra, S., Della Pietra, V., and Mercer, R. (1991). Word Sense Disambiguation Using Statistical Methods. Proceedings of the 29th Annual Meeting of the Association for Computational Linguistics (ACL-91), pp. 264-304.


Experiments in Multilingual Information Retrieval - Hull, Grefenstette (1996)   (10 citations)  (Correct)

....same collection is being used for both disambiguation and retrieval, so domain relevance of the filtering process is guaranteed. In this light, the translation disambiguation problem bears a strong resemblance to term disambiguation in a monolingual setting. In fact, a number of researchers [8, 3] have used cross language relationships to help with disambiguation. Given the limited success of term disambiguation as a tool for IR [23] there is some question about whether we can hope to gain any benefits out of translation filtering. Translation disambiguation may well work better as an ....

P.F. Brown, S.A. Della Pietra, V.J. Della Pietra, and R.L. Mercer. Word-sense disambiguation using statistical methods. In Proc. of the Association for Computational Linguistics (ACL), pages 169--176, 1991.


Collocations - McKeown, Radev (2000)   (Correct)

....idea that the presence of certain words near the ambiguous one will be a good indicator of its most likely sense. The second type of constraint can be obtained when pairs of translations of the word in an aligned bilingual corpus are considered. Research performed at IBM in the early 1990s [7] applied a statistical method using as parameters the context in which the ambiguous word appeared. Seven factors were considered: the words immediately to the left or to the right to the ambiguous word, the first noun and the first verb both to the left and to the right, as well as the tense of ....

Peter F. Brown, Stephen A. Della Pietra, Vincent J. Della Pietra, and Robert L. Mercer. Word-sense disambiguation using statistical methods. In Proceedings of the 29th Annual Meeting of the Association for Computational Linguistics, pages 264--270, Berkeley, California, 1991.


A Syntax-based Statistical Translation Model - Yamada, Knight (2001)   (17 citations)  (Correct)

.... used for statistical machine translation (Berger et al. 1996) word alignment of a translation corpus (Melamed, 2000) multilingual document retrieval (Franz et al. 1999) automatic dictionary construction (Resnik and Melamed, 1997) and data preparation for word sense disambiguation programs (Brown et al. 1991). Developing a better TM is a fundamental issue for those applications. Researchers at IBM first described such a statistical TM in (Brown et al. 1988) Their models are based on a string to string noisy channel model. The channel converts a sequence of words in one language (such as English) ....

P. Brown, J. Cocke, S. Della Pietra, F. Jelinek, R. Mercer, and P. Roossin. 1991. Word-sense disambiguation using statistical methods. In ACL-91.


Word Sense Disambiguation Using Statistical Techniques - Chin (1999)   (Correct)

....recently reawakened and is found to be effective. This type of approach overcomes the limitation of syntactic knowledge mentioned above while avoiding the complexity of modelling and compiling semantic and pragmatic knowledge. (Yarowsky, 1992) (Dagan and Itai, 1994) (Justeson and Katz, 1995) and (Brown et al. 1991) have shown that this approach is feasible and gives reasonable results even in unconstrained input text from a broad domain. The definition of word sense ambiguities will be discussed in section 2. Section 3 will outline different aspects of the statistical approach to WSD. The methods proposed by ....

....and gives reasonable results even in unconstrained input text from a broad domain. The definition of word sense ambiguities will be discussed in section 2. Section 3 will outline different aspects of the statistical approach to WSD. The methods proposed by (Dagan and Itai, 1994) (Yarowsky, 1992) (Brown et al. 1991) will be briefly described in sections 4 6. This is followed by a comparative analysis of all these methods in section 7. 2 Word Sense Ambiguities Words in natural language (or lexical in linguistic terminology) can be polysemous. Consequently, Natural Language Understanding and Information ....

[Article contains additional citation context not shown here]

Brown, P., Della Pietra, S., Della Pietra, V., and Mercer, R. (1991). Word sense disambiguation using statistical methods. Proceedings, Annual Meeting of the Association for Computational Linguistics, pages 264-270.


A New Approach to Word Sense Disambiguation - Bruce, Wiebe (1994)   (3 citations)  (Correct)

....interactions among variables. To date, much of the work in statistical NLP has focused on parameter estimation ([11] [13] [12] [4]) Of the research directed toward identifying the optimum form of model, most has been concerned with the selection of individually informative features ([2], [5]) with relatively little attention directed toward the identification of an optimum approximation to the joint distribution of the values of the contextual features and object classes. Most previous efforts to formulate a probabilistic classifier for word sense disambiguation did not attempt ....

....classifier for word sense disambiguation did not attempt to systematically identify the interdependencies among contextual features that can be used to classify the meaning of an ambiguous word. Many researchers have performed disambiguation on the basis of only a single feature ([6] [15] [2]) while others who do consider multiple contextual features assume that all contextual features are either conditionally independent given the sense of the word ([8] [14]) or fully independent ([10] [16]) In earlier work, we describe a method for identifying an appropriate model for use in ....

[Article contains additional citation context not shown here]

Brown, P.; Della Pietra, S.; Della Pietra, V.; and Mercer, R. (1991). Word Sense Disambiguation Using Statistical Methods. Proceedings of the 29th Annual Meeting of the Association for Computational Linguistics (ACL-91), pp. 264-304.


Disambiguation by Association as a Practical Method.. - Sutcliffe, Slater (1995)   (1 citation)  (Correct)

.... than inferred from keyword co occurrence (Sussna, 1993, Voorhees, 1993, Richardson, 1994) These three authors all used the WordNet lexical database (Beckwith, Fellbaum, Gross and Miller, 1992) In recent years there has been much interest in statistical word sense disambiguation using corpora (Brown, Della Pietra, Della Pietra and Mercer, 1991; Dagan, Itai and Schwall, 1991; Zernik, 1991; Yarowsky, 1992; Gale, Church and Yarowsky, 1993; Schutze, 1993) In general, these methods gather statistical co occurrence data for each word by examining a large number of contexts obtained from one or more text corpora. As a result it is possible ....

Brown, P. F., Della Pietra, S. A., Della Pietra, V. J., & Mercer, R. L. (1991). Word-Sense Disambiguation Using Statistical Methods. Proceedings of the 29th Annual Meeting of the Association for Computational Linguistics, Berkeley, California, 18-21 June, 1991, 264-270.


Machine Learning and Natural Language Processing - Marquez (2000)   (1 citation)  (Correct)

.... of the dependent variable in the future) Their application to NLP is also noticeable, and we find tree based solutions to address natural language ambiguity problems at several levels: Speech recognition [8, 9] PoS tagging [200, 132, 140, 141, 164, 136, 138, 137] word sense disambiguation [26], parsing [132, 92] text categorization [117, 69, 230] text summarization [134] dialogue act tagging [192] co reference resolution [5, 143] cue phrase identification [122] and machine translation (verb classification) [221, 209] In Magerman's approach [132] decision trees are used for a ....

.... NB NNs LSM EC Clause Boudaries [95] Shallow Parsing [18] 128, 129] 130] 223] Parsing [115, 43] 93] PP attachment disambiguation [20] 52] 125, 218] 107] 2] Table 2: References corresponding to syntactic analysis and structural ambiguity NLP problems DLs DTs NB TBL EM WSD [240, 150] [26, 150] [86, 150, 112] 67] 203, 166] Text categorization and filtering [117, 69, 230] 117, 190, 119, 142, 196] 162, 163] Dialogue act tagging [192] 193, 192] Co reference and anaphora resolution [5, 143] Cue phrase identification [122] IBL NNs EC SVM Clust WSD [159, 157, 84, 73] ....

P. F. Brown, S. Della Pietra, V. Della Pietra, and R. L. Mercer. Word Sense Disambiguation using Statistical Methods. In Proceedings of the 29th Annual Meeting of the Association for Computational Linguistics, ACL, pages 264--270, 1991.


A Stochastic Model Of Intonation For Text-To-Speech.. - Veronis, Di Cristo.. (1998)   (3 citations)  (Correct)

.... can be used to develop tools capable of carrying out a number of linguistic engineering tasks like speech recognition (Baker, 1975; Jelinek, 1976; Rabiner, 1995) grammatical labelling (Bahl and Mercer, 1976; Debili, 1977; Leech et al. 1983) lexical disambiguation (Choueka and Lusignan 1985; Brown et al. 1991; Gale et al. 1993) lexical or terminological extraction (Choueka et al. 1983; Church and Hanks, 1990; Daille et al. 1994) and even albeit arguably less successfully automatic translation (Brown et al. 1990) Most probabilistic models can be stated in information theory terms ....

Brown P., Della Pietra S., Della Pietra V., Mercer R. (1991), Word sense disambiguation using statistical methods. Proceedings of the 29th Annual Meeting of Association for Computational Linguistics, Berkeley, California, 264-270.


Statistical Machine Translation - Al-Onaizan, Curin, Jahr, Knight.. (1999)   (Correct)

....better translation models, and possibly traveling further up into the realms of syntax and semantics. We will not anticipate them here. However, we know of certain enhancements that would improve the immediate usefulness of Egypt, so we list these: • Context dependent translation probabilities. [Brown et al. 1991; Berger et al. 1996; Melamed, 1998a] describe techniques for assigning senses to words and/or translating them differently based on context. This would relieve substantial amounts of pressure currently on the language model to disambiguate words in the input text. • Faster training. We have ....

Brown, P., S. Della Pietra, V. Della Pietra, and R. Mercer. 1991. Word-sense disambiguation using statistical methods. In Proc. ACL.


Estimating Word Translation Probabilities from Unrelated.. - Koehn, Knight (2000)   (3 citations)  (Correct)

....bottleneck. 3. Translation Probabilities We describe an approach that uses a monolingual corpus in the target language to estimate word translation probabilities. These take the form p_w(f|e), the overall probability that the English word e will be translated as f, regardless of context (footnote 4: We follow here the usual notation of translating a foreign word f to an English word e). [Brown et al. 1991] show how to estimate p_w(f|e) parameters from a bilingual corpus. Since the translation probabilities cannot be observed directly in non parallel corpora, one simple idea is to use the frequencies of the ....

Brown, P. F., Della-Pietra, S., Della-Pietra, V., and Mercer, R. (1991). Word-sense disambiguation using statistical methods. In Proceedings of ACL 29.


Corpus-Based Method for Unsupervised Word Sense Disambiguation - Levinson (1999)   (1 citation)  (Correct)

....Moreover, these methods are suitable only for morphological analysis. There are a number of corpus based approaches to the problem of disambiguation within the same part of speech. However, most of them use additional language specific data, such as thesauri (Yarowsky, 1992) bilingual corpora (Brown et al. 1991; Gale et al. 1993) monolingual (Lesk, 1986) and bilingual dictionaries (Dagan and Itai, 1994) In order to apply these methods to some language, the required information for that language has to be obtained before. The method we present is unsupervised, and the only information required is a ....

Brown, P. F., Della Pietra, S., Della Pietra, V. J., and Mercer, R. L. (1991). Word sense disambiguation using statistical methods. In Proceedings of the 29th Annual Meeting of the ACL, pages 264--270.


Automatic Word Sense Discrimination - Schütze (1998)   (13 citations)  (Correct)

....of the task, an outside source of knowledge is necessary to define the senses. Regardless of whether it takes the form of dictionaries (Lesk, 1986; Guthrie et al. 1991; Dagan, Itai, and Schwall, 1991; Karov and Edelman, 1996) thesauri (Yarowsky, 1992; Walker and Amsler, 1986) bilingual corpora (Brown et al. 1991; Church and Gale, 1991) or hand labeled training sets (Hearst, 1991; Leacock, Towell, and Voorhees, 1993a; Niwa and Nitta, 1994; Bruce and Wiebe, 1994) providing information for sense definitions can be a considerable burden. What makes our approach unique is that since we narrow the problem to ....

.... (1975) consider hand constructed disambiguation rules, Lesk (1986) Krovetz and Croft (1989) Guthrie et al. 1991) and Karov and Edelman (1996) use online dictionaries, Hirst (1987) constructs knowledge bases, Cottrell (1989) uses syntactic and semantic structure encoded in a connectionist net, Brown et al. 1991) and Church and Gale (1991) exploit bilingual corpora, Dagan et al. 1991) use a bilingual dictionary, Hearst (1991) Leacock et al. 1993a) Niwa and Nitta (1994) and Bruce and Wiebe (1994) exploit a hand labeled training set, and Yarowsky (1992) and Walker and Amsler (1986) perform computations ....

Brown, Peter F., Stephen A. Della Pietra, Vincent J. Della Pietra, and Robert L. Mercer. 1991. Word-sense disambiguation using statistical methods. In Proceedings of ACL 29, pages 264--270, Berkeley CA.


Context Filters for Document-Based Information Filtering - Murthy, Keerthi (1999)   (Correct)

....processing there is a significant interest in word sense disambiguation. This interest has led to the development of several word sense disambiguation systems. These systems are either knowledgebase based [Hirst, Small, Wilks] or lexicon based [Li, Sussna, Voorhees] or corpus (statistics) based [Brown, Pedersen1, Pedersen2, Yarowsky] methods. In a later section we explain how the scheme proposed in this paper is different from these methods and why it is more suitable for information filtering than these methods. In this paper we describe, in detail, a few factors of uncertainty in information filtering. Then we propose a ....

Brown, P.F., Della Pietra, S.A., Della Pietra, V.J., Mercer, R.L., Word-Sense Disambiguation using Statistical Methods, In Proc. ACL meeting, Berkeley 1991, 264-270.


Word-Sense Disambiguation Using Decomposable Models - Bruce, Wiebe (1994)   (43 citations)  (Correct)

....have sufficient statistics that are lower order marginal distributions. In the future, we will investigate other goodness of fit tests ([18] [1] [22]) that are perhaps more appropriate for sparse data. The Experiment Unlike several previous approaches to word sense disambiguation ([29] [5], [7] [10]) nothing in this approach limits the selection of sense tags to a particular number or type of meaning distinctions. In this study, our goal was to address a non trivial case of ambiguity, but one that would allow some comparison of results with previous work. As a result of these ....

....Work Many researches have avoided characterizing the interactions among multiple contextual features by considering only one feature in determining the sense of an ambiguous word. Techniques for identifying the optimum feature to use in disambiguating a word are presented in [7] [30] and [5]. Other works consider multiple contextual features in performing disambiguation without formally characterizing the relationships among the features. The majority of these efforts ([13] [31]) weight each feature in predicting the sense of an ambiguous word in accordance with frequency ....

Brown, P., Della Pietra, S., Della Pietra, V., and Mercer, R. (1991). Word Sense Disambiguation Using Statistical Methods. Proceedings of the 29th Annual Meeting of the Association for Computational Linguistics (ACL-91), pp. 264-304.


Context Filters for Document-Based Information Filtering - Murthy, Keerthi (1999)   (Correct)

....to some uncertainty in the filtering process. In NLP there is a significant interest in word sense disambiguation. This interest has led to the development of several word sense disambiguation systems. These are either knowledge base based [2,9] or lexicon based [3,8] or corpus (statistics) based [4,7] methods. Section 4 explains how the scheme proposed in this paper is better suited than the existing ones for information filtering. In this paper we first describe a few factors of uncertainty in information filtering. Then we propose a methodology called context filters to reduce the ....

Brown, P.F., Della Pietra, S.A., Della Pietra, V.J., Mercer, R.L., Word-Sense Disambiguation using Statistical Methods, In Proc. ACL meeting, Berkeley, 264-270, 1991.


A Corpus-Based Approach to Language Learning - Brill (1993)   (86 citations)  (Correct)

....and the learning process is only weakly nonsymbolic. 3.4 Other Areas In this chapter, we have only touched upon a few of the many research programs based on extracting various sorts of linguistic information from corpora. Other areas include machine translation [27] word sense disambiguation [28], word clustering [22, 15, 29, 89] and pronoun resolution [30] Chapter 4 Transformation Based Error Driven Learning Applied to Natural Language 4.1 Introduction In this section, we describe a framework for learning which has been effectively applied to a number of language learning problems. ....

P. Brown, J. Lai, and R. Mercer. Word-sense disambiguation using statistical methods. In Proceedings of the 29th Annual Meeting of the Association for Computational Linguistics, Berkeley, Ca., 1991.


Learning And Generalization In The Creation Of Information.. - Chai (1998)   (Correct)

....for each of the senses. This representation can be used in new instances. The set of features is important in this approach. Different researchers have made use of different sets of features. For example, local collocations such as first and second words to the left and right are used in [14]. A more common feature set which takes all the words in a window of words around the ambiguous words is used in [39] treating the context as an unordered bag of words. The heart of the problem, however, as is often the case in the corpus based approach, is obtaining sufficient training data. ....

P. Brown, S. Della Pietra, V. Della Pietra, and R. Mercer. Word sense disambiguation using statistical methods. In Proceedings of the 29th Meeting of the Association for Computational Linguistics, 1991.


Senses and Texts - Wilks (1997)   (2 citations)  (Correct)

....we need in real situations. This suggestion is rather different from Kilgarriff's conclusion: which is also an empirical one. He proposes that the real basis of sense distinction be established by usage clustering techniques applied to corpora. This is an excellent idea and recent work at IBM (Brown et al. 1991) has produced striking non seeded clusters of corpus usages, many of them displaying a similarity close to an intuitive notion of sense. But there are serious problems in moving any kind of lexicography, traditional or computational, onto any such basis. Hanks (1994) has claimed that a dictionary ....

Brown, P.F., Della Pietra, S.A., Della Pietra, V.J. and Mercer, R.L. (1991) Word sense disambiguation using statistical methods, Proc. ACL-91.


Aligning Sentences In Bilingual Corpora Using Lexical Information - Chen (1993)   (26 citations)  (Correct)

....Sciences Harvard University Cambridge, MA 02138 Internet: sfc calliope.harvard.edu Abstract In this paper, we describe a fast algorithm for aligning sentences with their translations in a bilingual corpus. Existing efficient algorithms ignore word identities and only consider sentence length (Brown et al. 1991b; Gale and Church, 1991) Our algorithm constructs a simple statistical word to word translation model on the fly during alignment. We find the alignment that maximizes the probability of generating the corpus with this translation model. We have achieved an error rate of approximately 0.4 on ....

....is language independent. 1 Introduction In this paper, we describe an algorithm for aligning sentences with their translations in a bilingual corpus. Aligned bilingual corpora have proved useful in many tasks, including machine translation (Brown et al. 1990; Sadler, 1989) sense disambiguation (Brown et al. 1991a; Dagan et al. 1991; Gale et al. 1992) and bilingual lexicography (Klavans and Tzoukermann, 1990; Warwick and Russell, 1990) The task is difficult because sentences frequently do not align one to one. Sometimes sentences align many to one, and often there are deletions in one of the ....

[Article contains additional citation context not shown here]

Peter F. Brown, Stephen A. DellaPietra, Vincent J. DellaPietra, and Robert L. Mercer. Word sense disambiguation using statistical methods. In Proceedings 29th Annual Meeting of the ACL, pages 265--270, Berkeley, CA, June 1991.


Combining Independent Knowledge Sources for Word Sense.. - Wilks, Stevenson (1997)   (4 citations)  (Correct)

....WSD using information gained from training on some corpus. This approach can be further subclassified: a) Tagged corpora Information is gathered from corpora which have already been semantically disambiguated. This tagging need not be an explicit sense tag, the bilingual corpus used by Brown (Brown et al. 1991) to train a statistical WSD algorithm were disambiguated for the purposes of that approach. Training text tagged with explicit senses from a lexicon has been used by (Bruce and Wiebe, 1994) and (Ng and Lee, 1996) b) Untagged corpora Information is gathered from raw corpora which has not been ....

Brown, P., S. Della Pietra, V. Della Pietra, and R. Mercer. 1991. Word sense disambiguation using statistical methods. In Proceedings of the 29th Meeting of the Association for Computational Linguistics, pages 264--270, Berkeley, CA.


A WordNet-based Algorithm for Word Sense Disambiguation - Li, Szpakowicz, Matwin (1995)   (15 citations)  (Correct)

....is essential in natural language processing. Early symbolic methods [Hirst, 1987; Small and Rieger, 1982; Wilks, 1975] heavily rely on large amounts of hand crafted knowledge. As a result, they can only work in a specific domain. To overcome this weakness, a variety of statistical WSD methods [Brown et al. 1991; Gale et al. 1992; Resnik, 1992; Schutze, 1992; Charniak, 1993; Lehman, 1994] have been put forward. They scale up easily and this makes them useful for large, unrestricted corpora. One of the most important steps in statistical WSD methods, however, is statistically motivated extraction of ....

Brown, P. F., S. A. Della Pietra, V. J. Della Pietra and R. L. Mercer, "Word-Sense Disambiguation Using Statistical Methods", In Proc ACL Meeting, Berkeley 1991, 264-270.


Methods of Category Classification Applied to Word-Sense.. - Wiebe, Bruce (1996)   (Correct)

....Other Work Many researches have avoided characterizing the interactions among multiple contextual features by considering only one feature in determining the sense of an ambiguous word. Techniques for identifying the optimum feature to use in disambiguating a word are presented in [22] [73] and [10]. Other works consider multiple contextual features in performing disambiguation without formally characterizing the relationships among the features. The majority of these efforts [34] [76] weight each feature in predicting the sense of an ambiguous word in accordance with frequency information, ....

Brown, P., Della Pietra, S., Della Pietra, V., and Mercer, R. (1991). Word Sense Disambiguation Using Statistical Methods. Proceedings of the 29th Annual Meeting of the Association for Computational Linguistics (ACL-91), pp. 264-270.


Aligning Sentences In Bilingual Corpora Using Lexical Information - Chen (1993)   (26 citations)  (Correct)

....Sciences Harvard University Cambridge, MA 02138 Internet: sfc calliope.harvard.edu Abstract: In this paper, we describe a fast algorithm for aligning sentences with their translations in a bilingual corpus. Existing efficient algorithms ignore word identities and only consider sentence length (Brown et al. 1991b; Gale and Church, 1991). Our algorithm constructs a simple statistical word-to-word translation model on the fly during alignment. We find the alignment that maximizes the probability of generating the corpus with this translation model. We have achieved an error rate of approximately 0.4 on ....

....is language-independent. 1 Introduction: In this paper, we describe an algorithm for aligning sentences with their translations in a bilingual corpus. Aligned bilingual corpora have proved useful in many tasks, including machine translation (Brown et al. 1990; Sadler, 1989), sense disambiguation (Brown et al. 1991a; Dagan et al. 1991; Gale et al. 1992), and bilingual lexicography (Klavans and Tzoukermann, 1990; Warwick and Russell, 1990). The task is difficult because sentences frequently do not align one-to-one. Sometimes sentences align many-to-one, and often there are deletions in one of the ....

[Article contains additional citation context not shown here]

Peter F. Brown, Stephen A. Della Pietra, Vincent J. Della Pietra, and Robert L. Mercer. Word sense disambiguation using statistical methods. In Proceedings 29th Annual Meeting of the ACL, pages 264--270, Berkeley, CA, June 1991.


Senses and Texts - Wilks (1997)   (2 citations)  (Correct)

....we need in real situations. This suggestion is rather different from Kilgarriff's conclusion: which is also an empirical one. He proposes that the real basis of sense distinction be established by usage clustering techniques applied to corpora. This is an excellent idea and recent work at IBM [2] has produced striking non-seeded clusters of corpus usages, many of them displaying a similarity close to an intuitive notion of sense. But there are serious problems in moving any kind of lexicography, traditional or computational, onto any such basis. Hanks [10] has claimed that a dictionary ....

P. F. Brown, S. A. Della Pietra, V. J. Della Pietra, and R. L. Mercer. Word sense disambiguation using statistical methods. In Proc. ACL-91, 1991.


Computational Tools and Resources for Linguistic Studies - Hsu, Chang, Su   (Correct)

....Figure 11: The set of left and right neighboring characters of four strings and their corresponding entropies [Tung 1994]. 2.4 Automatic Comparison of Parallel Texts. Parallel corpora, such as the Hansards corpus [Brown 1991a, Gale 1991a], are very useful knowledge sources for automatic acquisition of bilingual (and monolingual) knowledge. In the field of computational linguistics, a variety of researches have investigated the use of bilingual corpora, including sentence alignment [Wu 1994], word correspondence [Dagan 1993] ....

....bilingual (and monolingual) knowledge. In the field of computational linguistics, a variety of researches have investigated the use of bilingual corpora, including sentence alignment [Wu 1994], word correspondence [Dagan 1993], collocation correspondence [Smadja 1996], word sense disambiguation [Brown 1991b, Dagan 1991, Gale 1992], and machine translation [Brown 1990, Su 1995, Wu 1995]. Bilingual material is also valuable for linguists who are interested in comparative studies of different languages, or in doing bilingual lexicography for translation. Here, we introduce three kinds of techniques ....

Brown, P. et al., "Word-Sense Disambiguation Using Statistical Methods," Proceedings of ACL-29, pp. 264-270, California, USA, June 18-21, 1991.


Unsupervised Word Sense Disambiguation Rivaling Supervised Methods - Yarowsky (1995)   (106 citations)  (Correct)

....al. 1993), Bruce and Wiebe (1994) and Lehman (1994), as it does not require costly hand-tagged training sets. It thrives on raw, unannotated monolingual corpora, the more the merrier. Although there is some hope from using aligned bilingual corpora as training data for supervised algorithms (Brown et al. 1991), this approach suffers from both the limited availability of such corpora, and the frequent failure of bilingual translation differences to model monolingual sense differences. The use of dictionary definitions as an optional seed for the unsupervised algorithm stems from a long history of ....

Brown, Peter, Stephen Della Pietra, Vincent Della Pietra, and Robert Mercer, "Word Sense Disambiguation using Statistical Methods," Proceedings of the 29th Annual Meeting of the Association for Computational Linguistics, pp 264-270, 1991.


An Ontological-Semantic Framework for Text Analysis - Onyshkevych (1997)   (1 citation)  (Correct)

No context found.

Brown, Peter, Stephen Della Pietra, Vincent Della Pietra, and Robert Mercer (1991). "Word-Sense Disambiguation Using Statistical Methods", in Proceedings of ACL-91. Berkeley, CA.


Analysis of Statistical Question Classification for.. - Metzler, Croft (2004)   (Correct)

No context found.

Brown, P. F., S. A. Della Pietra, V. J. Della Pietra, and R. L. Mercer: 1991, `Word-Sense Disambiguation Using Statistical Methods'. In: Meeting of the Association for Computational Linguistics. pp. 264--270.


An Automatic Approach to Create a Sense Tagged Corpus for .. - Disambiguation In Machine (2005)   (Correct)

No context found.

P. F. Brown, S. A. Della Pietra, V. J. Della Pietra, R. L. Mercer. 1991. Word Sense Disambiguation Using Statistical Methods. In Proceedings of the 29th Annual Meeting of ACL, pages 264-270.


Exploiting Parallel Texts to Produce a.. - Specia.. (2005)   (Correct)

No context found.

P.F. Brown, S.A. Della Pietra, V.J. Della Pietra, R.L. Mercer. Word Sense Disambiguation Using Statistical Methods. In Proceedings of the 29th Annual Meeting of ACL, pp. 264-270, 1991.


A Hybrid Model for Word Sense Disambiguation in.. - Machine Translation Lucia (2005)   (Correct)

No context found.

Brown P., Della Pietra S., Della Pietra V. and Mercer R. (1991) Word sense disambiguation using statistical methods. In Proceedings of the 29th Annual Meeting of ACL, Berkeley, pp. 264-270.


Comparative Study of Statistical Word Sense - Discrimination Techniques..   (Correct)

No context found.

Brown, P., Della Pietra, S., Della Pietra, V., and Mercer, R. (1991). Word-sense disambiguation using statistical methods. ACL, 29, 264--270.


Disambiguating Proteins, Genes, and RNA in Text: A.. - Hatzivassiloglou.. (2001)   (Correct)

No context found.

Brown, P. F., S. A. Della Pietra, V. J. Della Pietra, and R. L. Mercer (1991). Word-sense disambiguation using statistical methods. In Proc. 29th ACL, Berkeley, California, pp. 264--270.


Discovering Entailment Relations Using "textual Entailment.. - Zanzotto, al. (2005)   (Correct)

No context found.

Peter F. Brown, Stephen A. Della Pietra, Vincent J. Della Pietra, and Robert L. Mercer. 1991. Word-sense disambiguation using statistical methods. In Proceedings of the 29th Annual Meeting of the Association for Computational Linguistics (ACL), Berkeley, CA.


Predicate Preserving Parsing - Parikh, Khot, Dave, Bhattacharyya (2004)   (Correct)

No context found.

Brown, Peter F., Della Pietra, Stephen A., Della Pietra, Vincent J., and Mercer, Robert L., 1991. Word-Sense Disambiguation Using Statistical Methods. In Proceedings of the 29th Conference of the Association for Computational Linguistics, pp. 264-270, Berkeley, CA.


Binary Feature Classification for Word - Disambiguation In Statistical   (Correct)

No context found.

Brown, P.F., Della Pietra, S., Della Pietra, V. & Mercer, R.L., 1991. Word Sense Disambiguation Using Statistical Methods. Proc. 29th Annual Meeting of the Association for Computational Linguistics, (pp. 264-270), Berkeley, USA.


Decision Lists For Lexical Ambiguity Resolution: Application to.. - Yarowsky (1994)   (55 citations)  (Correct)

No context found.

Brown, Peter, Stephen Della Pietra, Vincent Della Pietra, and Robert Mercer, "Word Sense Disambiguation using Statistical Methods," Proceedings of the 29th Annual Meeting of the Association for Computational Linguistics, pp. 264-270, 1991.


Structural Disambiguation Based on Reliable Estimation of.. - Wu, Alves, Furugori (1998)   (1 citation)  (Correct)

No context found.

Brown, P., Della Pietra, S., Della Pietra, V. and Mercer, R. (1991). "Word Sense Disambiguation Using Statistical Methods." Proceedings of the 29th ACL, pages 264-270.


Word-Sense Disambiguation Using Statistical Models of Roget's.. - Yarowsky (1992)   (144 citations)  (Correct)

No context found.

P. Brown, S. Della Pietra, V. Della Pietra, and R. Mercer. Word sense disambiguation using statistical methods. In Proceedings of the 29th Annual Meeting of the Association for Computational Linguistics, pp. 264-270, Berkeley, 1991.


Measuring Semantic Entropy - Melamed (1997)   (2 citations)  (Correct)

No context found.

P. F. Brown, S. Della Pietra, V. Della Pietra, R. Mercer, "Word Sense Disambiguation using Statistical Methods", Proceedings of the 29th Annual Meeting of the Association for Computational Linguistics, Berkeley, CA, 1991.


Corpus-based Techniques for Word Sense Disambiguation - Levow (1997)   (2 citations)  (Correct)

No context found.

Peter F. Brown, Stephen A. Della Pietra, Vincent J. Della Pietra, and Robert L. Mercer. Word sense disambiguation using statistical methods. In Proceedings 29th Annual Meeting of the Association for Computational Linguistics, pages 264--270, Berkeley, CA, June 1991.


Paradigm Merger in Natural Language Processing - Gazdar (1996)   (9 citations)  (Correct)

No context found.

Peter F. Brown, Stephen A. Della Pietra, Vincent J. Della Pietra, & Robert Mercer (1991). Word-sense disambiguation using statistical methods, in Proceedings of the 29th Annual Meeting of the Association for Computational Linguistics, 264-270.



CiteSeer.IST - Copyright Penn State and NEC