| W. Gale, K. Church, and D. Yarowsky. A method for disambiguating word senses in a large corpus. Computers and Humanities, 26(5/6):415--450, 1992. |
....database for words that appeared in an identical local context as w. He calls these words the selectors of w. This allows him to choose a meaning for w that maximizes the similarity between w and its selectors. Using the one sense per discourse heuristic advocated by Gale, Church and Yarowsky [7], the sense chosen is assigned to all occurrences of w in the input text. 15 Evaluating WSD Algorithms As pointed out by Ng and Zelle [16] the evaluation of empirical, corpus based WSD has not been as rigorously pursued as other areas of corpus based NLP, such as part of speech tagging and ....
....and 70 verbs in English. The last three corpora are publicly available from New Mexico State University, Princeton University, and Linguistic Data Consortium (LDC) respectively. A baseline called the most frequent heuristic has been proposed as a performance measure for the WSD algorithms [7]. This heuristic simply chooses the most frequent sense of the word w and assigns it as the sense of w in test sentences without considering the effect of context. A WSD algorithm must perform better than the most frequent heuristic to be of any significant value. The success of a WSD algorithm ....
W. Gale, Church and D. Yarowsky 1992. A method for disambiguating word senses in a large corpus. Computers and Humanities, 26:415-439.
....sufficient statistics that are lower order marginal distributions. In the future, we will investigate other goodness of fit tests ([18], [1], [22]) that are perhaps more appropriate for sparse data. The Experiment Unlike several previous approaches to word sense disambiguation ([29], [5], [7], [10]), nothing in this approach limits the selection of sense tags to a particular number or type of meaning distinctions. In this study, our goal was to address a non-trivial case of ambiguity, but one that would allow some comparison of results with previous work. As a result of these ....
....the relationships among the features. The majority of these efforts ( lSl, all) weight each feature in predicting the sense of an ambiguous word in accordance with frequency information, without considering the extent to which the features co-occur with one another. Gale, Church and Yarowsky ([10]) and Yarowsky ([29]) formally characterize the interactions that they consider in their model, but they simply assume that their model fits the data. Other researchers have proposed approaches to systematically combining information from multiple contextual features in determining the sense of ....
Gale, W., Church, K., and Yarowsky, D. (1992a). A Method for Disambiguating Word Senses in a Large Corpus. AT&T Bell Laboratories Statistical Research Report No. 104.
....Parts in Very Large Corpora Matthew Berland, Eugene Charniak rab , ec cs. brown. edu Department of Computer Science Brown University, Box 1910 Providence, RI 02912 Abstract We present a method for extracting parts of objects from wholes (e.g. "speedometer" from "car"). Given a very large corpus our method finds part words with 55% accuracy for the top 50 words as ranked by the system. The part list could be scanned by an end user and added to an ....
[Article contains additional citation context not shown here]
William A. Gale, Kenneth W. Church & David Yarowsky, "A method for disambiguating word senses in a large corpus," Computers and the Humanities (1992).
....have been presented for confusion set disambiguation. The more recent set of techniques includes multiplicative weight update algorithms [4], latent semantic analysis [7], transformation based learning [8], differential grammars [10], decision lists [12], and a variety of Bayesian classifiers [2,3,5]. In all of these papers, the problem is formulated as follows: Given a specific confusion set (e.g. to, two, too), all occurrences of confusion set members in the test set are replaced by some marker. Then everywhere the system sees this marker, it must decide which member of the confusion set ....
Gale, W. A., Church, K. W., and Yarowsky, D. (1993). A method for disambiguating word senses in a large corpus. Computers and the Humanities, 26:415--439.
....engine. The problem is somewhat less extreme for classification tasks, where we can in some cases arrange to compare posterior log odds of classes for each document individually, without comparisons across documents. Indeed, we know of many applications of multinomial models to text categorization [3, 14, 15, 25, 32, 34] but none to text retrieval. 5.3 Non Distributional Approaches A variety of ad hoc approaches have been developed that more or less gracefully integrate term frequency and document length information into the BIM itself. The widely used probabilistic indexing approach assumes there is an ideal ....
William A. Gale, Kenneth W. Church, and David Yarowsky. A method for disambiguating word senses in a large corpus. Computers and the Humanities, 26:415-439, 1993.
....McCallum obtained impressive harvests of research papers from four Computer Science department sites, and of pages about officers and directors from 26 company Websites. Lexical proximity and contextual features have been used extensively in natural language processing for disambiguating word sense [15]. Compared to plain text, DOM trees and hyperlinks give us a richer set of potential features. Aggarwal et al. have proposed an intelligent crawling framework [1] in which only one classifier is used, but similar to our system, that classifier trains as the crawl progresses. They do not use our ....
W. A. Gale, K. W. Church, and D. Yarowsky. A method for disambiguating word senses in a large corpus. Computer and the Humanities, 26:415--439, 1993.
....with the correct sense. The sense labels are typically taken from a dictionary. Given this information, a supervised learning algorithm constructs rules that achieve high discrimination between occurrences of different word senses. Examples of supervised learning methods for WSD appear in [1], [6], [9], [22], [18]. The learning methods used in those studies are general purpose, including: decision tree induction, decision list induction, feed forward neural networks with backpropagation and naive Bayesian learning. Their results are very encouraging, exceeding 90% correct sense labelling in ....
Gale, W. A., Church, K. W. and Yarowsky, D.: A method for disambiguating word senses in a large corpus. Computers and the Humanities, v.26, (1993) 415-439
.... of the application is usually a subset of the entire knowledge base, the contextual specification decreases the number of entity classes that possess the same name (i.e. polysemy). Unfortunately, this approach has the same disadvantage of the use of topical context for word sense disambiguation [34], since it may not distinguish polysemous terms that are semantically similar and belong to the same domain of discourse. The semantic relations among entity classes provide a flexible way to describe context since the specification of one entity class can be used to obtain other entity classes ....
Gale, W., K. Church, and D. Yarowsky, 1992, A Method for Disambiguating Word Senses in a Large Corpus, Computers and Humanities 26(5/6): 415-450.
....like Transformations Based Learning [Brill, 1995; Brill and Resnik, 1994], on the other hand, start with the same set of features as does MBL, but prune the feature space very aggressively, with the goal of outputting a very small list of active features. Bayesian algorithms [Golding, 1995; Gale et al., 1993], on the other hand, keep all the features alive, only that they reweigh them as a function of frequency of occurrence. As a result, very predictive features (e.g. long conjunctive features, for which MBL gives very high weight) may be assigned low weights by naive Bayes and other probabilistic ....
W. Gale, K. Church, and D. Yarowsky. A method for disambiguating word senses in a large corpus. Computers and the Humanities, 26:415-439, 1993.
....is determined and used in a parametrized calculation to yield the desired probabilities. The calculation is usually based on an underlying statistical model, e.g. assuming that the atomic features are independent allows a simple multiplication of feature dependent parameters (Naive Bayes; cf. e.g. Gale et al. (1993)). As the atomic features are hardly ever independent, more complicated models tend to yield better results (e.g. Maximum Entropy; cf. Mikheev (1998)). There are yet other approaches, which do not fall as conveniently in my general classification. Neural Network approaches are best seen as a ....
W. Gale, K. Church, and D. Yarowsky. 1993. A method for disambiguating word senses in a large corpus. Computing in the Humanities, 1:415--439.
No context found.
W. Gale, K. Church, and D. Yarowsky. 1992. A method for disambiguating word senses in a large corpus. Computers and the Humanities, 26:415--439.
No context found.
Gale, W., K. Church, and D. Yarowsky, "A Method for Disambiguating Word Senses in a Large Corpus," Computers and the Humanities, 26,415-439, 1992.
No context found.
W. Gale, K. Church, and D. Yarowsky. A method for disambiguating word senses in a large corpus. Computers and the Humanities, 26:415-439, 1992.
No context found.
W. Gale, K. Church, and D. Yarowsky. A method for disambiguating word senses in a large corpus. Computers and Humanities, 26(5/6):415--450, 1992.
No context found.
W. A. Gale, K. W. Church, and D. Yarowsky. A method for disambiguating word senses in a large corpus. Computers and the Humanities, 26:415-439, 1993.
No context found.
Gale, William A., Kenneth W. Church, and David Yarowsky. 1993. A method for disambiguating word senses in a large corpus. Computers and the Humanities, 26:415-439.
No context found.
Gale, W., Church, K., and Yarowsky, D. (1992). A method for disambiguating word senses in a large corpus. Computers and the Humanities, 26, 415--439.
No context found.
William A. Gale, Kenneth W. Church, and David Yarowsky. A method for disambiguating word senses in a large corpus. Computers and the Humanities, 26:415--439, 1992.
No context found.
Gale, W., Church, K., and Yarowsky, D. "A Method for Disambiguating Word Senses in a Large Corpus." Computers and the Humanities, 5-6, pp. 415-439. 1992.
No context found.
W. A. Gale, K. W. Church, and D. Yarowsky. A method for disambiguating word senses in a large corpus. Computers and the Humanities, 26(5):415--439, 1993.
No context found.
William A. Gale, Kenneth W. Church, and David Yarowsky. A method for disambiguating word senses in a large corpus. Computers and the Humanities, 26(5):415--439, 1993.
No context found.
Gale, W., K. Church, and D. Yarowsky, 1992, A Method for Disambiguating Word Senses in a Large Corpus, Computers and Humanities 26(5/6): 415-450.
No context found.
Gale, W., Church, K., Yarowsky, D., `A Method for Disambiguating Word Senses in a Large Corpus', Statistical Research Reports 104, AT&T Bell Laboratories, 1992.
No context found.
Gale, W. A., Church, K. W. and Yarowsky, D. (1992) A Method for Disambiguating Word Senses in a Large Corpus. Computers and the Humanities 26, 415--439.
No context found.
Gale, William A., Kenneth W. Church, and David Yarowsky. (to appear). A method for disambiguating word senses in a large corpus. In Computers and Humanities.