Unsupervised Models for Named Entity Classification (1999)

by Michael Collins, Yoram Singer
Venue: Proceedings of the Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora
Citations: 531 (4 self)

BibTeX

@INPROCEEDINGS{Collins99unsupervisedmodels,
    author = {Michael Collins and Yoram Singer},
    title = {Unsupervised Models for Named Entity Classification},
    booktitle = {Proceedings of the Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora},
    year = {1999},
    pages = {100--110}
}

Abstract

This paper discusses the use of unlabeled examples for the problem of named entity classification. A large number of rules is needed for coverage of the domain, suggesting that a fairly large number of labeled examples should be required to train a classifier. However, we show that the use of unlabeled data can reduce the requirements for supervision to just 7 simple “seed” rules. The approach gains leverage from natural redundancy in the data: for many named-entity instances both the spelling of the name and the context in which it appears are sufficient to determine its type. We present two algorithms. The first method uses a similar algorithm to that of (Yarowsky 95), with modifications motivated by (Blum and Mitchell 98). The second algorithm extends ideas from boosting algorithms, designed for supervised learning tasks, to the framework suggested by (Blum and Mitchell 98).
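
The redundancy idea in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: it is a simplified two-view bootstrapping loop, assuming each example carries a list of spelling features and a list of context features, and that a feature becomes a rule when it co-occurs with one label often enough. The feature names, thresholds (`min_precision`, `min_count`), the two seed rules, and the toy data are all invented for illustration.

```python
from collections import Counter, defaultdict


def apply_rules(rules, features):
    """Return the label of the first feature that has a rule, else None."""
    for feature in features:
        if feature in rules:
            return rules[feature]
    return None


def induce_rules(examples, labels, view, min_precision=0.95, min_count=2):
    """Learn feature -> label rules for one view from the labeled examples."""
    counts = defaultdict(Counter)
    for example, label in zip(examples, labels):
        if label is None:
            continue  # unlabeled examples contribute nothing this round
        for feature in example[view]:
            counts[feature][label] += 1
    rules = {}
    for feature, label_counts in counts.items():
        label, n = label_counts.most_common(1)[0]
        if n >= min_count and n / sum(label_counts.values()) >= min_precision:
            rules[feature] = label
    return rules


def bootstrap(examples, seed_rules, rounds=5):
    """Alternate between the spelling view and the context view."""
    spelling_rules = dict(seed_rules)
    context_rules = {}
    for _ in range(rounds):
        # Label with spelling rules, then learn context rules from those labels.
        labels = [apply_rules(spelling_rules, ex["spelling"]) for ex in examples]
        context_rules = induce_rules(examples, labels, "context")
        # Label with context rules, then learn spelling rules from those labels.
        labels = [apply_rules(context_rules, ex["context"]) for ex in examples]
        learned = induce_rules(examples, labels, "spelling")
        spelling_rules = {**learned, **seed_rules}  # seed rules are never dropped
    return spelling_rules, context_rules


# Toy, hand-made data (hypothetical; not taken from the paper).
examples = [
    {"spelling": ["full=New York"], "context": ["ctx=based_in"]},
    {"spelling": ["full=New York"], "context": ["ctx=based_in"]},
    {"spelling": ["full=Boston"], "context": ["ctx=based_in"]},
    {"spelling": ["full=Boston"], "context": ["ctx=based_in"]},
    {"spelling": ["full=IBM Inc.", "contains=Inc."], "context": ["ctx=president_of"]},
    {"spelling": ["full=Acme Inc.", "contains=Inc."], "context": ["ctx=president_of"]},
    {"spelling": ["full=Globex"], "context": ["ctx=president_of"]},
    {"spelling": ["full=Globex"], "context": ["ctx=president_of"]},
]
seed_rules = {"full=New York": "location", "contains=Inc.": "organization"}

spelling_rules, context_rules = bootstrap(examples, seed_rules)
```

In this toy run the seed rules label only four of the eight examples, yet the context features they expose are enough to induce spelling rules for the unseeded names "Boston" and "Globex". The algorithms the abstract describes are more involved than this loop, so the sketch should be read only as an illustration of why two redundant views make a handful of seed rules go a long way.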

Keyphrases

named entity classification    large number    approach gain    unlabeled example    entity classification    unlabeled data    first method    second algorithm extends idea    natural redundancy    similar algorithm    simple seed rule    many named-entity instance    supervised learning task    labeled example   
