Results 1 - 10
of
6,628
Text Classification using String Kernels
"... We propose a novel approach for categorizing text documents based on the use of a special kernel. The kernel is an inner product in the feature space generated by all subsequences of length k. A subsequence is any ordered sequence of k characters occurring in the text though not necessarily contiguo ..."
Abstract
-
Cited by 495 (7 self)
- Add to MetaCart
We propose a novel approach for categorizing text documents based on the use of a special kernel. The kernel is an inner product in the feature space generated by all subsequences of length k. A subsequence is any ordered sequence of k characters occurring in the text though not necessarily
Transductive Inference for Text Classification using Support Vector Machines
, 1999
"... This paper introduces Transductive Support Vector Machines (TSVMs) for text classification. While regular Support Vector Machines (SVMs) try to induce a general decision function for a learning task, Transductive Support Vector Machines take into account a particular test set and try to minimiz ..."
Abstract
-
Cited by 892 (4 self)
- Add to MetaCart
This paper introduces Transductive Support Vector Machines (TSVMs) for text classification. While regular Support Vector Machines (SVMs) try to induce a general decision function for a learning task, Transductive Support Vector Machines take into account a particular test set and try
An extensive empirical study of feature selection metrics for text classification
- J. of Machine Learning Research
, 2003
"... Machine learning for text classification is the cornerstone of document categorization, news filtering, document routing, and personalization. In text domains, effective feature selection is essential to make the learning task efficient and more accurate. This paper presents an empirical comparison ..."
Abstract
-
Cited by 496 (15 self)
- Add to MetaCart
Machine learning for text classification is the cornerstone of document categorization, news filtering, document routing, and personalization. In text domains, effective feature selection is essential to make the learning task efficient and more accurate. This paper presents an empirical comparison
Text Classification from Labeled and Unlabeled Documents using EM
- MACHINE LEARNING
, 1999
"... This paper shows that the accuracy of learned text classifiers can be improved by augmenting a small number of labeled training documents with a large pool of unlabeled documents. This is important because in many text classification problems obtaining training labels is expensive, while large qua ..."
Abstract
-
Cited by 1033 (15 self)
- Add to MetaCart
This paper shows that the accuracy of learned text classifiers can be improved by augmenting a small number of labeled training documents with a large pool of unlabeled documents. This is important because in many text classification problems obtaining training labels is expensive, while large
Using Maximum Entropy for Text Classification
, 1999
"... This paper proposes the use of maximum entropy techniques for text classification. Maximum entropy is a probability distribution estimation technique widely used for a variety of natural language tasks, such as language modeling, part-of-speech tagging, and text segmentation. The underlying principl ..."
Abstract
-
Cited by 326 (6 self)
- Add to MetaCart
This paper proposes the use of maximum entropy techniques for text classification. Maximum entropy is a probability distribution estimation technique widely used for a variety of natural language tasks, such as language modeling, part-of-speech tagging, and text segmentation. The underlying
A comparison of event models for Naive Bayes text classification
, 1998
"... Recent work in text classification has used two different first-order probabilistic models for classification, both of which make the naive Bayes assumption. Some use a multi-variate Bernoulli model, that is, a Bayesian Network with no dependencies between words and binary word features (e.g. Larkey ..."
Abstract
-
Cited by 1025 (26 self)
- Add to MetaCart
Recent work in text classification has used two different first-order probabilistic models for classification, both of which make the naive Bayes assumption. Some use a multi-variate Bernoulli model, that is, a Bayesian Network with no dependencies between words and binary word features (e
Support Vector Machine Active Learning with Applications to Text Classification
- JOURNAL OF MACHINE LEARNING RESEARCH
, 2001
"... Support vector machines have met with significant success in numerous real-world learning tasks. However, like most machine learning algorithms, they are generally applied using a randomly selected training set classified in advance. In many settings, we also have the option of using pool-based acti ..."
Abstract
-
Cited by 735 (5 self)
- Add to MetaCart
Support vector machines have met with significant success in numerous real-world learning tasks. However, like most machine learning algorithms, they are generally applied using a randomly selected training set classified in advance. In many settings, we also have the option of using pool-based active learning. Instead of using a randomly selected training set, the learner has access to a pool of unlabeled instances and can request the labels for some number of them. We introduce a new algorithm for performing active learning with support vector machines, i.e., an algorithm for choosing which instances to request next. We provide a theoretical motivation for the algorithm using the notion of a version space. We present experimental results showing that employing our active learning method can significantly reduce the need for labeled training instances in both the standard inductive and transductive settings.
Distributional Clustering of Words for Text Classification
, 1998
"... This paper describes the application of Distributional Clustering [20] to document classification. This approach clusters words into groups based on the distribution of class labels associated with each word. Thus, unlike some other unsupervised dimensionalityreduction techniques, such as Latent Sem ..."
Abstract
-
Cited by 298 (1 self)
- Add to MetaCart
This paper describes the application of Distributional Clustering [20] to document classification. This approach clusters words into groups based on the distribution of class labels associated with each word. Thus, unlike some other unsupervised dimensionalityreduction techniques, such as Latent
Feature engineering for text classification
- Proceedings of ICML-99, 16th International Conference on Machine Learning
, 1999
"... Most research in text classification has used the “bag of words ” representation of text. This paper examines some alternative ways to represent text based on syntactic and semantic relationships between words (phrases, synonyms and hypernyms). We describe the new representations and try to justify ..."
Abstract
-
Cited by 129 (1 self)
- Add to MetaCart
Most research in text classification has used the “bag of words ” representation of text. This paper examines some alternative ways to represent text based on syntactic and semantic relationships between words (phrases, synonyms and hypernyms). We describe the new representations and try to justify
Results 1 - 10
of
6,628