Finding Predominant Word Senses in Untagged Text (2004)

by Diana McCarthy, Rob Koeling, Julie Weeds, John Carroll
Venue: In Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics
Citations: 108 - 4 self

BibTeX

@INPROCEEDINGS{Mccarthy04findingpredominant,
    author = {Diana McCarthy and Rob Koeling and Julie Weeds and John Carroll},
    title = {Finding Predominant Word Senses in Untagged Text},
    booktitle = {Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics},
    year = {2004},
    pages = {280--287}
}

Abstract

In word sense disambiguation (WSD), the heuristic of choosing the most common sense is extremely powerful because the distribution of the senses of a word is often skewed. The problem with using the predominant, or first sense heuristic, aside from the fact that it does not take surrounding context into account, is that it assumes some quantity of hand-tagged data. Whilst there are a few hand-tagged corpora available for some languages, one would expect the frequency distribution of the senses of words, particularly topical words, to depend on the genre and domain of the text under consideration. We present work on the use of a thesaurus acquired from raw textual corpora and the WordNet similarity package to find predominant noun senses automatically. The acquired predominant senses give a precision of 64% on the nouns of the SENSEVAL-2 English all-words task. This is a very promising result given that our method does not require any hand-tagged text, such as SemCor. Furthermore, we demonstrate that our method discovers appropriate predominant senses for words from two domain-specific corpora.
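
The abstract does not spell out how the thesaurus and the similarity package are combined, so the following is only a minimal sketch of one plausible way to rank senses without hand-tagged data. It assumes each thesaurus neighbour contributes its distributional similarity to the target word, shared among the target's senses in proportion to how semantically similar each sense is to that neighbour. The function name `rank_senses`, the sense labels, and all similarity values are hypothetical toy data, not figures from the paper; in practice the sense-to-neighbour scores would come from a WordNet similarity measure.

```python
def rank_senses(senses, neighbours, sense_sim):
    """Rank the senses of one target word by a prevalence-style score.

    senses:     list of sense labels for the target word
    neighbours: dict mapping neighbour word -> distributional similarity
                to the target word (from an automatically built thesaurus)
    sense_sim:  dict mapping (sense, neighbour) -> semantic similarity
                between that sense and the neighbour word
    """
    scores = {s: 0.0 for s in senses}
    for neighbour, dist_sim in neighbours.items():
        # Total semantic similarity of this neighbour to all senses,
        # used to split the neighbour's weight among the senses.
        total = sum(sense_sim.get((s, neighbour), 0.0) for s in senses)
        if total == 0.0:
            continue  # neighbour is unrelated to every sense; skip it
        for s in senses:
            share = sense_sim.get((s, neighbour), 0.0) / total
            scores[s] += dist_sim * share
    ranking = sorted(senses, key=lambda s: scores[s], reverse=True)
    return ranking, scores


# Hypothetical toy data for the noun "star" (assumed values).
senses = ["star/celestial", "star/celebrity"]
neighbours = {"planet": 0.3, "galaxy": 0.25, "actor": 0.2}
sense_sim = {
    ("star/celestial", "planet"): 0.8,
    ("star/celebrity", "planet"): 0.2,
    ("star/celestial", "galaxy"): 0.9,
    ("star/celebrity", "galaxy"): 0.1,
    ("star/celestial", "actor"): 0.1,
    ("star/celebrity", "actor"): 0.9,
}

ranking, scores = rank_senses(senses, neighbours, sense_sim)
print(ranking[0])  # the sense this toy corpus treats as predominant
```

Because the neighbours come from the corpus at hand, swapping in a thesaurus built from a different domain would change the ranking, which matches the abstract's point that predominant senses depend on genre and domain.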

Keyphrases

untagged text    predominant word senses    predominant senses    WordNet similarity package    domain-specific corpus    frequency distribution    word sense disambiguation    hand-tagged corpus    raw textual corpus    common sense    English all-words task    hand-tagged text    predominant noun senses    topical word    first sense heuristic
