by Sarah Zelikovitz, Haym Hirsh
IJCAI01 Workshop Notes on Text Learning: Beyond Supervision
http://www-2.cs.cmu.edu/~mccallum/textbeyond/papers/zelikovi.ps
Abstract:
We present work in progress that uses Latent Semantic Indexing (LSI) in conjunction with background knowledge and unlabeled examples to improve text classification accuracy. LSI performs its singular value decomposition (SVD) on an expanded term-by-document matrix that includes both the labeled training examples and the unlabeled examples. We report classification accuracy on different data sets both with and without the inclusion of background knowledge and
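The idea of decomposing an expanded term-by-document matrix can be illustrated with a minimal sketch. This is not the authors' implementation: the toy documents, the function names, the choice of `k`, and the nearest-neighbor cosine rule used for classification are all assumptions made for illustration, and a plain power iteration stands in for a full SVD routine so that the example needs only the standard library.

```python
import math
import random

# Hypothetical toy data: two labeled examples plus unlabeled background text.
LABELED = [("quantum particle", "physics"), ("cell gene", "biology")]
BACKGROUND = [
    "quantum particle photon energy",
    "photon energy wave",
    "cell gene protein dna",
    "protein dna enzyme",
]

def term_doc_matrix(docs, vocab):
    """Rows are terms, columns are documents; entries are raw term counts."""
    index = {t: i for i, t in enumerate(vocab)}
    X = [[0.0] * len(docs) for _ in vocab]
    for j, doc in enumerate(docs):
        for tok in doc.split():
            X[index[tok]][j] += 1.0
    return X

def top_left_singular_vectors(X, k, iters=200, seed=0):
    """Approximate the top-k left singular vectors of X by power iteration
    on X X^T, orthogonalizing against the vectors already found."""
    rng = random.Random(seed)
    n_terms, n_docs = len(X), len(X[0])
    basis = []
    for _ in range(k):
        v = [rng.random() for _ in range(n_terms)]
        for _ in range(iters):
            # w = X^T v, then v = X w
            w = [sum(X[i][j] * v[i] for i in range(n_terms)) for j in range(n_docs)]
            v = [sum(X[i][j] * w[j] for j in range(n_docs)) for i in range(n_terms)]
            for u in basis:
                dot = sum(a * b for a, b in zip(u, v))
                v = [a - dot * b for a, b in zip(v, u)]
            norm = math.sqrt(sum(a * a for a in v))
            v = [a / norm for a in v]
        basis.append(v)
    return basis

def cosine(a, b):
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)

def classify(test_doc, labeled, background, k=2):
    """Label test_doc with the class of its nearest labeled example,
    measured by cosine similarity in the k-dimensional latent space."""
    train_docs = [text for text, _ in labeled]
    all_docs = train_docs + background  # the expanded collection
    vocab = sorted({tok for d in all_docs for tok in d.split()})
    X = term_doc_matrix(all_docs, vocab)
    basis = top_left_singular_vectors(X, k)

    def project(doc):
        # Terms outside the vocabulary are ignored.
        tokens = doc.split()
        counts = [float(tokens.count(t)) for t in vocab]
        return [sum(u_i * c for u_i, c in zip(u, counts)) for u in basis]

    q = project(test_doc)
    scored = [(cosine(q, project(text)), label) for text, label in labeled]
    return max(scored)[1]

if __name__ == "__main__":
    # "photon wave" shares no terms with either labeled example.
    print(classify("photon wave", LABELED, BACKGROUND))
```

In this toy setup the test document has no words in common with any labeled example, so a plain term-overlap comparison gives zero similarity to both classes; only the co-occurrences in the background documents connect its terms to the labeled ones in the reduced space.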
Citations
1463 | Indexing by Latent Semantic Analysis – Deerwester, Dumais, et al., 1990
981 | An algorithm for suffix stripping – Porter, 1980
961 | Text Categorization with Support Vector Machines – Joachims, 1997
575 | Combining labeled and unlabeled data with co-training – Blum, Mitchell, 1998
562 | Automatic Text Processing – Salton, 1989
454 | Text classification from labeled and unlabeled documents using EM – Nigam, McCallum, et al.
304 | Transductive inference for text classification using support vector machines – Joachims, 1999
277 | A sequential algorithm for training text classifiers – Lewis, Gale, 1994
255 | Learning to extract symbolic knowledge from the World Wide Web – Craven, DiPasquo, et al., 1998
170 | Improving the Retrieval of Information from External Sources – Dumais, 1991
167 | Integration of Heterogeneous Databases Without Common Domains Using Queries Based on Textual Similarity – Cohen, 1998
151 | Heterogeneous Uncertainty Sampling for Supervised Learning – Lewis, Catlett, 1994
61 | A web-based information system that reasons with structured collections of text – Cohen, 1998
39 | The role of unlabeled data in supervised learning – Mitchell, 1999
38 | LSI meets TREC: A status report – Dumais, 1993
26 | Text classification by bootstrapping with keywords, EM and shrinkage – McCallum, Nigam, 1999
23 | An example-based mapping method for text classification and retrieval – Yang, Chute, 1994
23 | Improving short-text classification using unlabeled background knowledge to assess document similarity (ICML-2000) – Zelikovitz, Hirsh
19 | Bootstrapping for text learning tasks – Jones, McCallum, et al., 1999
15 | Joins that generalize: Text categorization using WHIRL – Cohen, Hirsh, 1998
14 | Machine learning in automated text categorisation – Sebastiani
9 | Using LSI for information filtering: TREC-3 experiments. In: D. Harman (Ed.), The Third Text REtrieval – Dumais, 1995
7 | Analyzing the effectiveness and applicability of co-training – Nigam, Ghani, 2000
1 | Semi-Supervised Support Vector – Bennet, Demiriz, 1998