
Scaling to Very Very Large Corpora for Natural Language Disambiguation (2001)  (6 citations)
Michele Banko, Eric Brill
Meeting of the Association for Computational Linguistics




Abstract: The amount of readily available on-line text has reached hundreds of billions of words and continues to grow. Yet for most core natural language tasks, algorithms continue to be optimized, tested and compared after training on corpora consisting of only one million words or less. In this paper, we evaluate the performance of different learning methods on a prototypical natural language disambiguation task, confusion set disambiguation, when trained on orders of magnitude more...

Cited by:
Web Text Corpus for Natural Language Processing - Liu, Curran (2006)
Weakly Supervised Learning Methods for Improving the Quality of.. - Wellner (2005)
Web-Scale Information Extraction in KnowItAll - Etzioni, Cafarella, Downey.. (2004)

Similar documents based on text:
1.0:   Mitigating the Paucity-of-Data Problem: Exploring the Effect.. - Banko, Brill (2001)
0.8:   Pattern-Based Disambiguation for Natural Language Processing - Eric Brill Microsoft (2000)
0.6:   Data-Intensive Question Answering - Brill, Lin, Banko, Dumais, Ng (2001)

Related documents from co-citation:
2:   LCC tools for question answering - Moldovan, Harabagiu et al.
2:   Extracting patterns and relations from the world wide web - Brin - 1998
2:   Overview of the TREC - Voorhees - 2001

BibTeX entry:

Michele Banko and Eric Brill. Scaling to very very large corpora for natural language disambiguation. In Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics, pages 26--33. Association for Computational Linguistics, 2001. http://citeseer.ist.psu.edu/banko01scaling.html

@inproceedings{ banko01scaling,
    author = "Michele Banko and Eric Brill",
    title = "Scaling to Very Very Large Corpora for Natural Language Disambiguation",
    booktitle = "Meeting of the Association for Computational Linguistics",
    pages = "26--33",
    year = "2001",
    url = "http://citeseer.ist.psu.edu/banko01scaling.html" }


Documents on the same site (http://acl.ldc.upenn.edu/P/P01/):
Grammars for Local and Long Dependencies. - Dikovsky
Practical Issues in Compiling Typed Unification Grammars.. - Dowding, Hockey, Gawron (2001)
Evaluating CETEMPúblico, a free resource for Portuguese - Santos, Rocha (2001)


CiteSeer.IST - Copyright Penn State and NEC