T. Joachims. The maximum-margin approach to learning text classifiers: methods, theory, and algorithms. PhD thesis, Artificial Intelligence Unit, Department of Computer Science, University of Dortmund, February 2001.
....function and feasible region are convex, it becomes a convex QP, and can be solved more easily than general quadratic programming. Furthermore, there exist algorithms that can solve this more efficiently by utilizing the sparseness of the text data. Here we discuss the algorithm used in SVM Light [7], which is one of the most popular SVM packages in text categorization. The basic idea [3] is to iteratively decompose the big QP problem into smaller ones (called working set) and solve them sequentially until convergence is obtained. The training time complexity for each iteration is: O(q ....
....is O(q NLv ). The number of iterations I is usually around one thousand for the Reuters 21578 benchmark corpus, for example, and sometimes can go beyond ten thousand. Note that I would be affected by the choice of q, making a purely theoretical complexity analysis difficult. However, Joachims [7] showed a set Table 1: Complexities of classifiers in non-hierarchical text categorization Classifier Training Time Testing Time Space Type of on M Categories per Document Complexity Algorithm SVM O(MN ) c 1.5 O(MLv ) O(NLv q ) binary kNN O(NLd ) O( L v ) O(N) O(NLv ) ....
T. Joachims. The Maximum-Margin Approach to Learning Text Classifiers: Methods, Theory, and Algorithms. Ph.D. thesis, University of Dortmund, 2000.
....unlabeled data: Expectation-Maximization (EM) [6]. These experiments show that co-training outperforms EM even on tasks where there is no natural split of features. Transductive Support Vector Machines (TSVM), proposed in [9], is another semi-supervised learner that to some extent subsumes co-training [10]. It uses labeled and unlabeled data to find the maximum-margin hyperplane dividing the positive and negative instances. It is particularly beneficial in situations where we do not care about good generalization, but rather good classification accuracy on a particular test set. In addition, ....
Thorsten Joachims. The Maximum Margin Approach to Learning Text Classifiers: Methods, Theory, and Algorithms. PhD thesis, Universität Dortmund, 2000.
....to perform automatic feature selection in a Machine Learning scenario. 1.1 Support Vector Machines Support Vector Machines (SVMs, [CV95]) are a development from statistical learning. They are based on the idea of structural risk minimisation ([Vap82]). The following explanations are based on [Joa01]. The learning scenario is formalised as follows. Let S be a training set of N examples represented by , N from the vector space X = ℝ^n. Each of these vectors is associated with a class y from a set Y: S = y) Here we only need the simplest case of binary classification, so we ....
....describe each training example. 1.2 Estimation of the generalisation error of SVMs This section deals with a specific property of SVMs that allows one to estimate their generalisation error after one training run. This property was discovered by Thorsten Joachims and is described in more detail in [Joa01]. The proofs of the claims made here can be found there. Usually, after training a learner, its performance can only be determined on a separate set of examples that were not used for training, but whose classes are known. By comparing the known classes to the predicted ones, an empirical error ....
Thorsten Joachims. The Maximum-Margin Approach to Learning Text Classifiers: Methods, Theory, and Algorithms. PhD thesis, Fachbereich Informatik, Universität Dortmund, 2001.
....the feature space constructed through feature selection. Thereby, predictions can be performed efficiently. Besides performing comparative studies, we also take a more theoretical perspective to motivate why the SVM learning method is suitable for our problem. Following studies conducted for text data [8], we discover that the success of SVM in predicting events has its foundations in statistical properties of event data. 2 Problem Settings We assume that a computer network is under continuous monitoring. Such a monitoring process produces a sequence of events, where each event is a timestamped ....
....relies on the fact that event and text data share important statistical properties that can be tied to the performance of SVMs. Here we discuss such properties for text data, and then show that similar ones also hold for event data. SVMs have been successfully applied to text classification. In [8], a theoretical learning model that connects SVMs with the statistical properties of text classification tasks has been presented. This model explains why and when SVMs perform well for text classification. The result is an upper bound connecting the expected generalization error of an SVM with ....
Joachims, T. (2000). The maximum margin approach to learning text classifiers: Methods, theory, and algorithms. Doctoral dissertation, Universität Dortmund, Informatik, LS VIII.
.... x(j). To aid understanding, consider the collection C = {(x1, −1), (x2, −1), (x3, 1), (x4, 1)} and a ranking function f such that f(x3) > f(x2) > f(x1) > f(x4). Then [x(1), x(2), x(3), x(4)] = [x3, x2, x1, x4], x(1) x(2) x2,x1] f1 ,f2 ] If (x2) f (x1) and [i(1) i(2) [2,3]. In order to evaluate the quality of the ranking produced by f in C, we are going to use the previously defined measures of precision and recall. For this we introduce a threshold b ∈ ℝ and construct the new classification function: hb(x) = sign(f(x) − b) (1.3). Precision and recall now ....
.... to Curve Bounds Learning theory allows us to bound the misclassification error much more tightly (at least in principle) using quantities specific to our trained classifier h, such as its margin, its leave-one-out error or the fraction of documents used for training (see, e.g., [1]). In [2], leave-one-out type bounds are established on precision, recall and on the F1 measure. Unfortunately, these bounds apply to a single classifier hb and do not tell us how classifier hb, b b, will behave. This is of great importance because, in order to bound Ay (C) we need to compute precision and ....
T. Joachims. The Maximum-Margin Approach to Learning Text Classifiers: Methods, Theory, and Algorithms. PhD thesis, Universität Dortmund, 2000.
....document. These cues are partly redundant. Table 1 [11] shows the results of an experiment on the Reuters corporate acquisitions category. All features (after stemming and stopword removal) are ranked according to their (binary) empirical mutual information (EMI) with the class label (cf. e.g. [14]). Then a naive Bayes classifier is ....

[Figure 3: Structure of the argument (step 1, step 2, step 3).]

Table 1: Learning without using the best features.
Features used (by EMI rank)    PRBE
1-200                          89.6
201-500                        71.3
501-1000                       63.3
1001-2000                      58.0
2001-4000                      55.4
4001-9947                      47.5
random (no learning)           21.8
....training error is a sufficient condition for good generalization accuracy. The second step abstracts the properties of text classification tasks into a model, which the third step connects to large margin separation. 4.1 Step 1: Bounding the Expected Error Based on the Margin The following bound [14, 18] shows that large margin combined with low training error leads to high generalization accuracy. It uses results limiting the number of leave-one-out errors [10, 13]. The key quantities are the margin as defined in Section 2, the maximum Euclidean length R of the document vectors x, and the ....
T. Joachims. The Maximum-Margin Approach to Learning Text Classifiers: Methods, Theory, and Algorithms. PhD thesis, Universität Dortmund, 2001. Kluwer, to appear.
Thorsten Joachims. The Maximum-Margin Approach to Learning Text Classifiers: Methods, Theory, and Algorithms. PhD thesis, Computer Science Department, University of Dortmund, Germany, 2001.
Joachims, T. (2000). The maximum margin approach to learning text classifiers: methods, theory and algorithms.
Thorsten Joachims. The Maximum Margin Approach to Learning Text Classifiers: Methods, Theory and Algorithms. PhD thesis, Dortmund University, 2000.
T. Joachims. The Maximum-Margin Approach to Learning Text Classifiers: Methods, Theory, and Algorithms. Ph.D. thesis, University of Dortmund, 2000.
T. Joachims. The Maximum-Margin Approach to Learning Text Classifiers. Ausgezeichnete Informatikdissertationen 2001, D. Wagner et al. (Hrsg.). GI-Edition - Lecture Notes in Informatics (LNI), Köllen Verlag, Bonn, 2002.
Thorsten Joachims. The Maximum-Margin Approach to Learning Text Classifiers: Methods, Theory, and Algorithms. PhD thesis, Universität Dortmund, Germany, 2000. Fachbereich Informatik.
Joachims, T. (2001). The Maximum-Margin Approach to Learning Text Classifiers: Methods, Theory and Algorithms. PhD thesis, Universität Dortmund.