Learning to Extract Entities from Labeled and Unlabeled Text (2005)
Rosie Jones




From: cmu.edu/Research/thesis

Abstract: Imagine trying to build a system to identify people, locations and organizations, or other arbitrary types, in a human language you are not familiar with. If we knew what kinds of words represent the classes people, locations and organizations, by examining enough text data they occur in, we could learn to recognize the contexts they occur in. And if we knew what kind of contexts they occur in, we could recognize instances of these classes themselves. In this work we address this...
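
The loop the abstract describes (known words reveal contexts, and contexts reveal new words) can be illustrated with a toy sketch. This is only an illustration of the general idea under simplifying assumptions, not the method developed in the thesis: the function name `bootstrap`, the `min_count` threshold, the definition of a context as the (previous word, next word) pair, and the tiny corpus are all hypothetical.

```python
from collections import Counter


def bootstrap(sentences, seeds, rounds=2, min_count=2):
    """Toy bootstrapping: alternate between contexts and class members.

    sentences: list of token lists; seeds: initial words of the target class.
    A context is assumed to be the (previous word, next word) pair.
    """
    known = set(seeds)
    contexts = set()
    for _ in range(rounds):
        # Step 1: count the contexts in which currently known words occur.
        counts = Counter()
        for tokens in sentences:
            padded = ["<s>"] + tokens + ["</s>"]
            for i in range(1, len(padded) - 1):
                if padded[i] in known:
                    counts[(padded[i - 1], padded[i + 1])] += 1
        # Trust only contexts seen at least min_count times.
        contexts |= {c for c, n in counts.items() if n >= min_count}
        # Step 2: every word that fills a trusted context joins the class.
        for tokens in sentences:
            padded = ["<s>"] + tokens + ["</s>"]
            for i in range(1, len(padded) - 1):
                if (padded[i - 1], padded[i + 1]) in contexts:
                    known.add(padded[i])
    return known, contexts


corpus = [s.split() for s in [
    "she flew to paris yesterday",
    "he flew to london yesterday",
    "they flew to tokyo yesterday",
    "we drove to paris today",
    "we drove to tokyo today",
    "we drove to berlin today",
]]
known, contexts = bootstrap(corpus, {"paris", "london"})
print(sorted(known))  # ['berlin', 'london', 'paris', 'tokyo']
```

In this toy run the first round learns the context ("to", "yesterday") from the two seeds and picks up "tokyo"; only then does ("to", "today") become frequent enough to be trusted, which brings in "berlin" in the second round.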

BibTeX entry:

@misc{ jones-learning,
  author = "Rosie Jones",
  title = "Learning to Extract Entities from Labeled and Unlabeled Text",
  year = "2005",
  url = "citeseer.ist.psu.edu/jones05learning.html" }
