S. Scott and S. Matwin. Text Classification using WordNet Hypernyms. In S. Harabagiu, editor, Use of WordNet in Natural Language Processing Systems: Proceedings of the Conference, pages 38--44. Association for Computational Linguistics, Somerset, New Jersey, 1998.
....takes place nowadays. Most work in improving document categorization systems seems to be focused on representation processing, and of course on classification algorithms. There are important exceptions of course, such as [10], where the usefulness of simple n-gram representations is discussed, or [11], proposing an interesting thesaurus-based representation, to name a few. However, in most cases a simple bag-of-words or unigram representation is used. This .... [Figure labels: Processing (scaling, attribute selection, etc.); Classic DM categorization algorithm; actual classification]
Matwin S., Scott S., "Text Classification Using WordNet Hypernyms", Computer Science Dept., University of Ottawa, 1998
....of a sentence. We provide a detailed comparison elsewhere [29] including other work in preposition disambiguation. Syntactic functional relations are important as well. Dini et al. [30] show how relations extracted from parse annotations facilitate word sense disambiguation. Scott and Matwin [31] also use WordNet hypernyms for classification, in particular topic detection. Their approach is different in that they include a numeric density feature for each synset that subsumes words appearing in the document, potentially yielding hundreds of features. We just have a binary feature for each ....
Scott, S., Matwin, S.: Text classification using WordNet hypernyms. In Harabagiu, S., ed.: Use of WordNet in Natural Language Processing Systems: Proceedings of the Conference, Somerset, New Jersey, Association for Computational Linguistics (1998) 38--44
....without training nor the need for tagged corpora. Introduction The present work consists on the comparison of two different ways of using WordNet for text classification purposes. We will support the conclusions of the experiments depicted in Text classification using WordNet hypernyms (Scott and Matwin, 1998) on the basis that WordNet may aid machine learning techniques in Information Retrieval. We will propose a different approach however, more efficient and more reliable, since it does not require training and obtains results solely on search procedures and text distribution quantification. In ....
....obtains results solely on search procedures and text distribution quantification. In Section 1, a brief overview to Princeton's WordNet is made and to authors that use this ontology for diverse applications from word sense disambiguation to Information Retrieval. In Section 2, the article by Scott and Matwin (1998) is briefly presented and analyzed, with special focus on the complexity of the training method, and general efficiency issues. Finally, Section 3 presents the current proposal and provides an execution example, followed by the conclusion. 1 WordNet in Text Classification Wordnet (Miller, 1990; ....
Scott S. and Matwin S. (1998) Text classification using WordNet hypernyms. In "Proceedings of the COLING/ACL Workshop on Usage of WordNet in Natural Language Processing Systems", Montreal.
....and rule learning algorithms. Mladenic 98 (1998) selects variable length phrases for text classification of web pages into the Yahoo hierarchy. Two studies have incorporated into text classification information from WordNet, a semantic network of the English language (Rodriguez et al. 1997; Scott & Matwin, 1998). 5.2 Learning with Labeled and Unlabeled Data We turn now to a survey of combining labeled and unlabeled data. Much initial work in this area started in statistics, and has recently been joined by the machine learning community. 5.2.1 Likelihood maximization approaches The idea of learning ....
Scott, S., & Matwin, S. (1998). Text classification using WordNet hypernyms. Usage of WordNet in Natural Language Processing Systems: Proceedings of the Workshop, pp. 45--52.
....Vector Machines on word stem vectors. Word features figure in a number of systems (Papka & Allen, 1998; Larkey & Croft, 1996). Liddy et al. (1994) categorize texts exclusively by semantic codes assigned to words from a machine-readable dictionary. Apté et al. (1994) use topic-specific dictionaries, Scott & Matwin (1998) use WordNet hypernyms. Statistical approaches are also popular (Wilbur, 1996; see Yang, 1999 for an overview). Common elements in these systems are the primacy of the word in some form, a presumption of complete automation, and (in most cases) the absence of a pre-existing taxonomy. The latter ....
SCOTT, S. & S. MATWIN (1998). Text Classification Using WordNet Hypernyms. Proceedings of the COLING-ACL'98 Workshop on Usage of WordNet in Natural Language Processing Systems, pp.45-52.
....Thus, designing automated procedures for news categorization becomes a practical as well as a challenging problem. A typical solution uses a training set to extract the features that characterize the individual news categories. The techniques employed include machine learning [2][10][13][21][23][25], statistical [18][27], knowledge-based [15], or the combinations [9]. After the initial training stage, the news categorization system may apply periodic maintenance to avoid performance deterioration caused by the presence of new terms and new topics being discussed in the news ....
....which improves precision, and identifies equivalent terms which improves recall. They have shown up to 29% improvement in performing text retrieval as compared to the standard SMART approach. The authors concluded that their technique is particularly useful if queries are ambiguous. Another work [21] uses a machine learning technique for text classification. The authors explore the use of linguistic knowledge for text representation, and show that their approach of representing the text based on WordNet hypernyms leads to significant improvement in accuracy. An application of text ....
S. Scott, and S. Matwin. "Text Classification Using WordNet Hypernyms", Coling-ACL'98 Workshop: Usage of WordNet in Natural Language Processing Systems, pp. 45-51, August 1998.
....on the kinds of nouns in the sentence with it than nouns are on the kinds of verbs present [Fell90]. Because of this, we believe that using nouns only will be sufficient to represent a web page. Because only nouns will be used to represent a web page, ideally, a program such as the Brill tagger [Scott98] could be used to identify the nouns in a web page so that only the nouns would be fed to WordNet. Since incorporating the Brill tagger into this work was beyond the scope of this thesis, WordNet was used to determine if a word was a noun and, if so, it ....
....as the default. This list is used to create the feature vectors representing each web page. The size of the vectors will be equal to the number of replacement terms, which will be less than or equal to the number of terms on the master list. The feature vectors use the bag-of-words representation [Scott98]; each element of the feature vector represents one of the replacement terms. In this case, the values are binary; for each term in the original web page, the corresponding element in the feature vector will have a value of one. The elements in the feature vector that represent terms not found ....
Scott, Sam and Stan Matwin, "Text classification using WordNet hypernyms," Proceedings of the COLING/ACL Workshop on Usage of WordNet in Natural Language Processing Systems, Montreal, 1998.
....the experiments performed on the corpora: chapter 5 presents the baseline experiments using the bag-of-words model, chapter 6 presents the experiments using phrase-based representations, and chapter 7 presents the experiments using WordNet, including some preliminary work previously made public as [SCM98a] and [SCM98b]. Part three of the document contains the final conclusions and suggestions for future work followed by appendices, and references. Previous work is introduced as it becomes relevant to the discussion. Feature Engineering for a Symbolic Approach to Text Classification 1998 Sam ....
....ignored. In this example, only the relevant SGML fields are shown. 2.2 The Digital Tradition The Digital Tradition (DigiTrad) is a new corpus for text classification research introduced by Scott and Matwin [SCM98a]. It is a freely available collection of folk song lyrics maintained by the folk song enthusiasts at the MudCat Café [GRE96]. In order to aid users, the owners of DigiTrad have bundled the files with a third-party search engine and assigned to most songs a set of keywords from a fixed list. ....
Sam Scott and Stan Matwin. Text Classification Using WordNet Hypernyms. In Usage of WordNet in Natural Language Processing Systems: Proceedings of the Workshop (COLING-ACL'98). August, 1998. 45-51.
No context found.
S. Scott and S. Matwin. Text Classification using WordNet Hypernyms. In S. Harabagiu, editor, Use of WordNet in Natural Language Processing Systems: Proceedings of the Conference, pages 38--44. Association for Computational Linguistics, Somerset, New Jersey, 1998.
No context found.
S. Scott and S. Matwin. Text classification using WordNet hypernyms. In Proc. Workshop Usage of WordNet in Natural Language Processing Systems, pages 45--52, August 1998.
No context found.
Sam Scott and Stan Matwin. 1998. Text classification using WordNet hypernyms. In Sanda Harabagiu, editor, Use of WordNet in Natural Language Processing Systems: Proceedings of the Conference, pages 38--44. Association for Computational Linguistics, Somerset, New Jersey.
No context found.
Scott, S., Matwin, S.: Text classification using WordNet hypernyms. In: Proc. Workshop Usage of WordNet in Natural Language Processing Systems. (1998) 45--52
No context found.
Scott, S., Matwin, S.: Text Classification using WordNet Hypernyms. In the Proceeding of Workshop -- Usage of WordNet in Natural Language Processing Systems, Montreal, Canada (1998)
No context found.
Scott, S., Matwin, S.: Text Classification using WordNet Hypernyms. In the Proceeding of Workshop -- Usage of WordNet in Natural Language Processing Systems, Montreal, Canada (1998)
No context found.
S. Scott and S. Matwin. Text classification using WordNet hypernyms. In Usage of WordNet in Natural Language Processing Systems, 1998. 7
No context found.
Scott, S., Matwin, S.: Text classification using WordNet hypernyms. In: Proc. Workshop Usage of WordNet in Natural Language Processing Systems. (1998) 45--52
No context found.
S. Scott and S. Matwin. Text classification using wordnet hypernyms. In Proc. workshop Usage of WordNet in Natural Language Processing Systems, pages 45--52, August 1998.
No context found.
Scott, S., Matwin, S.: Text Classification Using WordNet Hypernyms. Proceedings of the COLING/ACL Workshop on Usage of WordNet in Natural Language Processing Systems (1998) 38-44