Leacock, Claudia, Geoffrey Towell, and Ellen Voorhees (1993b). "Towards building contextual representations of word senses using statistical models", in Proceedings of the SIGLEX Workshop: Acquisition of Lexical Knowledge from Text.
....(1993) but its extension to a systematic disambiguation of complex NPs has to be preceded by the two following tasks: the extraction from technical corpora of subcategorization information for any technical domain, the study of conflict resolution strategies in case of competing associations. Leacock et al. (1993) propose a method for extracting automatically contextual representations from large corpora. Two types of contexts are extracted: local information on words immediately surrounding a word and topical information on substantive words that are likely to co-occur in the same sentence. Local context ....
....could be extended to external modification the tool is general enough for this purpose. As external modification is less constrained than internal one, it would be necessary to have a good characterization of the linguistic context of reference terms. Local contexts, extracted from large corpora (Leacock et al. 1993), reveal selectional restrictions and could be exploited for selecting relevant external modifiers. Contrary to external variants, variants where content words are elided cannot be extracted in the framework proposed in this study: Without a correct identification of anaphora, the processing of ....
Leacock, C., Towell, G., & Voorhees, E. (1993). Towards Building Contextual Representations of Word Senses Using Statistical Models. In Proceedings, SIGLEX workshop: Acquisition of Lexical Knowledge from Text, ACL.
....words immediately to the right and left of the target word to be the context in which it appears. This narrow window is called the local context of the word [7]. One could also look at other words with which the target word co-occurs. This broader definition of context is known as topical context [5, 6]. Many approaches to word sense disambiguation use statistics gleaned from large amounts of text that are hand-tagged with the correct answers [5, 6, 7] to analyze a new text. The hand-tagged text is known as training data. Though local context has been shown to be effective in disambiguating the ....
....word [7]. One could also look at other words with which the target word co-occurs. This broader definition of context is known as topical context [5, 6]. Many approaches to word sense disambiguation use statistics gleaned from large amounts of text that are hand-tagged with the correct answers [5, 6, 7] to analyze a new text. The hand-tagged text is known as training data. Though local context has been shown to be effective in disambiguating the senses of a particular word form [5], the use of topical context has the advantage that it makes maximal use of small sets of training data. Maximizing ....
Claudia Leacock, Geoffrey Towell, and Ellen Voorhees. Towards building contextual representations of word senses using statistical models. In Corpus Processing for Lexical Acquisition, pages 97--113. MIT Press, Cambridge MA, 1996.
....a wide variety of topical contexts, such as common verbs, which conversely may be easily identified by local collocational, syntactic or selectional cues, or even by frequency information. A comparison of three different wide-window, statistically based techniques was conducted by Leacock et al. [22]. Specifically, they compared the performance of a Bayesian classifier, a context vector, and a neural network, trained on the same corpus with the same context window of the current and preceding sentence. On a two-way sense disambiguation task, all achieved greater than 90% accuracy. For three ....
....to operate. It also suggests the limitations of the pairwise disambiguation task as a metric for evaluating the techniques; clearly, even a small increase in the number of senses dramatically changes the difficulty of the task. Particularly revealing is an additional study by Leacock et al. [22] in which human subjects were given the same disambiguation tasks to perform with three different types of information available: first the two original context sentences, then the context sentences with the words all randomly ordered, and finally only the randomly ordered content words. While the ....
Claudia Leacock, Geoffrey Towell, and Ellen M. Voorhees. Toward building contextual representations of word senses using statistical models. In Proceedings of the 1993 ACL SIGLEX Workshop - Acquisition of Lexical Knowledge from Text, 1993.
....The approach fares well when the different senses of the word occur in different domains, which tends to occur when they are at the homograph end of the homograph-polysemy spectrum. To discriminate finer-grained senses, or to reach beyond an accuracy threshold, a richer feature set is required (Leacock, Towell, and Voorhees, 1993). The research focuses on algorithms: most could be used with any feature set. One could in principle use (word, position) pairs as features (with position defined relative to the nodeword and ranging from, say, -3 to +3). Parts of speech, lemmas, bracketings are all sometimes optimal ways to ....
....comes with a cost, in terms of escalating numbers of features and correspondingly sparse data. It is only viable to extend the repertoire of features if one also introduces methods for determining which are salient for each word. Papers exploring this route in different ways are (Hearst, 1991; Leacock, Towell, and Voorhees, 1993; Yarowsky, 1995; Pedersen, Bruce, and Wiebe, 1997). Note that if one sees the lexicon generation phase of c WSD as a one-off, resource development activity, it becomes viable to spend substantially longer on it than if it is seen as a regularly repeated compile-time activity. 7 Relation to ....
Leacock, Claudia, Geoffrey Towell, and Ellen Voorhees. 1993. Towards building contextual representations of word senses using statistical models.
....but its extension to a systematic disambiguation of complex NPs has to be preceded by the two following tasks: • the extraction from technical corpora of subcategorization information for any technical domain, • the study of conflict resolution strategies in case of competing associations. Leacock et al. (1993) propose a method for extracting automatically contextual representations from large corpora. Two types of contexts are extracted: local information on words immediately surrounding a word and topical information on substantive words that are likely to co-occur in the same sentence. Local context ....
....be extended to external modification the tool is general enough for this purpose. As external modification is less constrained than internal one, it would be necessary to have a good characterization of the linguistic context of reference terms. Local contexts, extracted from large corpora (Leacock et al. 1993), reveal selectional restrictions and could be exploited for selecting relevant external modifiers. Contrary to external variants, variants where content words are elided cannot be extracted in the framework proposed in this study: Without a correct identification of anaphora, the processing of ....
Leacock, C., Towell, G., & Voorhees, E. (1993). Towards Building Contextual Representations of Word Senses Using Statistical Models. In Proceedings, SIGLEX workshop: Acquisition of Lexical Knowledge from Text, ACL.
No context found.
Leacock, Claudia, Geoffrey Towell, and Ellen Voorhees (1993b). "Towards building contextual representations of word senses using statistical models", in Proceedings of the SIGLEX Workshop: Acquisition of Lexical Knowledge from Text.
No context found.
C. Leacock. Towards building contextual representations of word senses using statistical models. In R. Boguraev and J. Pustejovsky, editors, Corpus Processing for Lexical Acquisition. MIT Press, Cambridge, MA, 1996.