37 citations found.
Friedman, J. (2004), `Another approach to polychotomous classification', Machine Learning. To appear.

This paper is cited in the following contexts:

New Results on Error Correcting Output Codes of Kernel.. - Passerini, Pontil.. (2003)   (Correct)

....is assigned to the class whose associated classifier has the maximum output. In the ECOC framework one vs all is equivalent to linear decoding with a Q × Q coding matrix whose entries are always −1 except diagonal entries which are equal to 1. In the latter approach, also known as pairwise coupling [18] or round robin classification [19], there are Q(Q − 1)/2 classifiers, each separating a pair of classes. Classification is decided by majority voting. This scheme is equivalent to Hamming decoding with the appropriate coding matrix. When all binary classifiers are computed by the same learning ....

....interesting observation is that the Hamming distance works well in the case of pairwise classification, while it performs poorly with one vs all classifiers. Both results are not surprising: the Hamming distance corresponds to the majority vote, which is known to work well for pairwise classifiers [18] but does not make much sense for one vs all because in this case ties may occur often. The behavior of all curves shows that tuning kernel parameters may significantly improve performance. We also note that a simple encoding scheme such as one vs all performs well with respect to more complex ....

J. H. Friedman, "Another approach to polychotomous classification," Department of Statistics, Stanford University, Tech. Rep., 1996.


Multiclass Alternating Decision Trees - Geoffrey Holmes Bernhard (2002)   (1 citation)  (Correct)

....vote towards the class labels they represent. Provided there is sufficient class representation and separation between the subsets, the vote tallies for individual class labels can be collected to form a reasonable prediction. We experimented with a number of subset generation schemes: 1 against 1 [9, 1]: generate a tree for every pair of classes, where subset A contains only the first class and subset B contains only the second. An advantage of this approach is that each tree need only be trained with a subset of the data, resulting in faster learning [11]. 1 against rest: one tree per class, ....

Jerome Friedman. Another approach to polychotomous classification. Technical report, Stanford University, Department of Statistics, 1996.


Limited Hierarchical Fusion of Multiple Classifiers for.. - Kumar, Ghosh, Crawford (2002)   (Correct)

....selectors in a hierarchical framework is provided in Chakrabarti et al. [34] where a hierarchical topic taxonomy is used to organise large text databases. Decomposing a C class problem into two class problems, one for each unique pair of classes, has also been proposed previously. Friedman [35] and Tibshirani [36] proposed different methods to combine the outputs of the two class models to yield an overall C class system. We have investigated the pairwise classifier framework [37] in detail, and applied it to several remote sensing [8,38,6] and machine learning problems [7]. Such ....

Friedman J. Another approach to polychotomous classification. Technical report, Stanford University, 1996


Classification Of Gene Expression Data by Pairwise Comparisons - Michailidis   (Correct)

....combined to form a K class one. This combination rule is quite intuitive, since the test observation is assigned to the class that wins the most pairwise comparisons. In case the class posterior probabilities are known, Friedman shows that this rule is equivalent to the corresponding Bayes rule [4]. 3. DATA EXAMPLE: CHILDHOOD CANCER DATASET The proposed approach is applied to a data set dealing with small, round blue cell tumors (SRBCT) of childhood coming from a study by Khan et al. [8]. There are four classes of SRBCTs: neuroblastomas (NB), rhabdomyosarcomas (RMS), Burkitt lymphomas ....

Friedman, J. (1996), Another approach to polychotomous classification, Technical Report, Department of Statistics, Stanford University


Combining Flat and Structured Representations for.. - Yao, Marcialis.. (2003)   (1 citation)  (Correct)

....codes Many real world classification problems involve more than two classes. Attempts to solve q class problems with SVM have involved training q SVM, each of which separates a single class from all remaining classes [36] or training q(q 1) machines, each of which separates a pair of classes [30, 11, 32]. The first type of classifiers are usually called one vs all, while classifiers of the second type are called pairwise classifiers. When the one vs all classifiers are used, a test point is classified into the class whose associated classifier has the highest score among all classifiers. In the ....

....When the one vs all classifiers are used, a test point is classified into the class whose associated classifier has the highest score among all classifiers. In the case of pairwise classifiers, a test point is classified in the class which gets most votes among all the possible classifiers [11]. Classification schemes based on training one vs all and pairwise classifiers are two extreme approaches: the first uses all the data, the second the smallest portion of the data. In practice, it can be more effective to use intermediate classification strategies in the style of error correcting ....

Jerome H. Friedman. Another approach to polychotomous classification. Technical report, Department of Statistics, Stanford University, 1997.
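
The two decision rules described in the excerpts above (highest score for one vs all, most votes for pairwise classifiers) can be sketched as follows. This is a minimal illustration, not code from any of the cited papers: the function names, the `pair_winner` dictionary layout, and the tie-breaking rule are assumptions made for the example.

```python
from itertools import combinations


def one_vs_all_predict(scores):
    """Return the class whose one-vs-all classifier has the highest score.

    scores[k] is assumed to be the real-valued output of the classifier
    that separates class k from all remaining classes.
    """
    return max(range(len(scores)), key=lambda k: scores[k])


def pairwise_vote_predict(num_classes, pair_winner):
    """Return the class that wins the most pairwise comparisons.

    pair_winner[(i, j)] with i < j is assumed to hold the class (i or j)
    chosen by the binary classifier trained on classes i and j only.
    """
    votes = [0] * num_classes
    for i, j in combinations(range(num_classes), 2):
        votes[pair_winner[(i, j)]] += 1
    # Ties go to the lowest class index; this is an arbitrary choice
    # made for the sketch, not something the excerpts specify.
    return max(range(num_classes), key=lambda k: votes[k])
```

With q classes the pairwise scheme needs one classifier per unordered pair, so `pair_winner` has q(q − 1)/2 entries, while the one vs all scheme needs only q scores.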


From Margins to Probabilities in Multiclass Learning.. - Passerini, Pontil, Frasconi (2002)   (Correct)

....decoding. Notice that the Hamming distance works well in the case of pairwise classification, while it performs poorly with one vs all classifiers. Both results are not surprising: the Hamming distance corresponds to the majority vote, which is known to work well for pairwise classifiers [7] but does not make much sense for one vs all because in this case ties may occur often. 4 Tuning Kernel Parameters In this section we study the problem of model selection in the case of ECOC of Kernel Machines [12, 11, 6, 3]. The analysis uses a leave one out error estimate as the key quantity ....

Jerome H. Friedman, `Another approach to polychotomous classification', Technical report, Department of Statistics, Stanford University, (1997).


Combined Binary Classifiers With Applications To Speech.. - Klautau, Jevtic, Orlitsky (2002)   (Correct)

....classes. Their respective matrices for K = 4 are $\begin{bmatrix} 1 & 1 & 1 & 0 & 0 & 0 \\ -1 & 0 & 0 & 1 & 1 & 0 \\ 0 & -1 & 0 & -1 & 0 & 1 \\ 0 & 0 & -1 & 0 & -1 & -1 \end{bmatrix}$ and $\begin{bmatrix} 1 & -1 & -1 & -1 \\ -1 & 1 & -1 & -1 \\ -1 & -1 & 1 & -1 \\ -1 & -1 & -1 & 1 \end{bmatrix}$: The all pairs code is related to well known methods of paired comparisons in statistics [11] and was applied to classification problems in [12]. While there is empirical evidence that ECOC consisting of fewer binary classifiers may outperform all pairs codes [3, 13] and ECOC design methods are under investigation [14, 15], several researchers have adopted the all pairs with promising results [16, 17, 18, 19, 20]. In speech recognition, ....

J. Friedman. Another approach to polychotomous classification. Technical report, Stanford University, 1996.
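
The all pairs and one vs all codes mentioned in the excerpt above can be generated for any number of classes, and a Hamming-style decoder then picks the closest row. The sketch below is illustrative only: the sign convention (+1 for the first class of a pair, −1 for the second, 0 for classes not involved) and the treatment of zero entries as half a disagreement are assumptions for the example, not details confirmed by the excerpt.

```python
from itertools import combinations


def one_vs_all_matrix(k):
    """k x k coding matrix: +1 on the diagonal, -1 everywhere else."""
    return [[1 if r == c else -1 for c in range(k)] for r in range(k)]


def all_pairs_matrix(k):
    """k x k(k-1)/2 coding matrix with one column per pair (i, j), i < j.

    Row i gets +1, row j gets -1, and rows of classes that the pair's
    classifier never sees get 0 (assumed convention).
    """
    pairs = list(combinations(range(k), 2))
    return [[1 if r == i else -1 if r == j else 0 for (i, j) in pairs]
            for r in range(k)]


def hamming_decode(matrix, outputs):
    """Return the row index whose codeword is closest to the outputs.

    outputs holds one binary decision in {-1, +1} per column. A zero
    entry in the codeword contributes 1/2 to the distance.
    """
    def distance(row):
        return sum((1 - m * o) / 2 for m, o in zip(row, outputs))
    return min(range(len(matrix)), key=lambda r: distance(matrix[r]))
```

For the all pairs matrix this distance equals a constant minus the number of pairwise wins, so decoding this way gives the same winner as majority voting.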


Pairwise Face Recognition - Guo-Dong Guo Hong-Jiang (2001)   (1 citation)  (Correct)

....# is classified to class # in the pairwise competition between classes # and #, otherwise # ### #####. All elements in the main diagonal are zeros. The outputs of the pairwise classifiers should be combined to obtain the final decision. There are two ways to combine them: (1) by simple voting [5], or (2) by using the MAP rule on an estimate of the overall a posteriori probabilities obtained from the outputs of the pairwise classifiers [7]. In the voting combination scheme, a count ### # ### of the number of the pairwise classifiers that label # into class # # # # # # # # # # ### # ....

J. Friedman. Another approach to polychotomous classification. Technical report, Stanford University, 1996.


Support Vector Machines for Phoneme Classification - Salomon (2001)   (Correct)

....binary classifiers, techniques are needed to extend the method to handle multiple classes. The goal of such a technique is to map the generalisation abilities of the binary classifiers to the multi class domain. In literature, numerous schemes have been proposed to solve this problem. For example [16], [39], [53], [21], [26] and [15]. This section will not try to describe all of these, but concentrate on the ones that have shown to produce good generalisation performance in practice. 11 To go into more details of this optimisation procedure would require an extensive description. Refer to ....

....of this optimisation procedure would require an extensive description. Refer to [46] for a detailed description. 12 SVM Light and SVMTorch. See [29] and [12]. 2.8.1 One vs. One classifier The One vs. One classifier is a system proposed by Friedman ([16]) and has become the most popular and successful multi class SVM method. The principle behind the method is very simple. It creates a binary SVM for each combination of classes 13 possible, and each unseen example is classified to the class that wins most binary classifications 14 . This ....

J. Friedman. Another approach to polychotomous classification. Technical report, Stanford University, USA, 1996.


Pairwise Coupling for Machine Recognition of Hand-Printed.. - Roth, Tsuda (2001)   (1 citation)  (Correct)

....a conceptual nature and can be overcome by a different approach to the multi class problem: instead of solving one against all problems, we might solve pairwise classification problems, and try to couple the probabilities in a suitable way. Methods of this kind have been introduced in [1], [2] and are referred to as pairwise coupling. Learning such pairwise decision rules may be a much simpler problem than separating each class from the others. In this paper special emphasis is put on nonlinear Mercer kernel based classifiers. A recent overview over kernel methods can be found in ....

....of pairwise coupling over conventional multi class approaches. Consider for example 4 The nporq can be interpreted as a conditional probability estimate for the membership of vector o in class when separating its class only from class , without considering any of the other classes, cf. [1]. three classes as depicted in figure 1. In a pairwise approach, each of the three pairs can be separated by a linear decision boundary without errors. If, on the other hand, we try to separate each class from the two others, the classes are not linearly separable. In order to avoid training ....

J. Friedman, "Another approach to polychotomous classification," Tech. Rep., Stanford University, 1996.


Round Robin Rule Learning - Fürnkranz (2001)   (Correct)

....the total learning effort is only linear in the number of classes, and may in some circumstances even be smaller than the effort needed for an unordered binarization. The analysis is independent of the base learning algorithm used. Some of the ideas have already been sketched in a short paragraph by Friedman (1996), but we go into considerably more detail here, and, in particular, focus on the comparison to conventional class binarization techniques. Definition 5.1 (class penalty) If the base learner has a complexity growth function (i.e. the time needed for a example training set is ) ....

....optimization phases can slow down the algorithm considerably, and seem to dominate the run time (but we have not performed a thorough analysis of this issue). Figure 2. Error reductions ratios of boosting vs. round robin. pairwise classification can be entirely parallelized (as already noted by Friedman (1996) and Lu and Ito (1999)). As each binary task will be smaller than the original task, the total training time of a multi class problem of size will be significantly below that of a binary problem of the same size, if each binary classifier can be trained on a separate processor. Memory ....

[Article contains additional citation context not shown here]

Friedman, J. H. (1996). Another approach to polychotomous classification (Technical Report). Department of Statistics, Stanford University, Stanford, CA.


Fingerprint Classification with Combinations of Support.. - Yao, Frasconi, Pontil (2001)   (2 citations)  (Correct)

....Many real world classification problems involve more than two classes. Attempts to solve q class problems with SVMs have involved training q SVMs, each of which separates a single class from all remaining classes [17] or training q 2 machines, each of which separates a pair of classes [14, 6, 15]. The first type of classifiers are usually called one vs all, while classifiers of the second type are called pairwise classifiers. For the one vs all a test point is classified into the class whose associated classifier has the highest score among all classifiers. In the pairwise classifier, a ....

....type are called pairwise classifiers. For the one vs all a test point is classified into the class whose associated classifier has the highest score among all classifiers. In the pairwise classifier, a test point is classified in the class which gets most votes among all the possible classifiers [6]. Classification schemes based on training one vs all and pairwise classifiers are two extreme approaches: the first uses all the data, the second the smallest portion of the data. In practice, it can be more effective to use intermediate classification strategies in the style of error correcting ....

Jerome H. Friedman. Another approach to polychotomous classification. Technical report, Department of Statistics, Stanford University, 1997.


Stochastic Organization of Output Codes in Multiclass.. - Utschick, Weichselberger (2001)   (6 citations)  (Correct)

....which uniquely assigns each pattern $x \in \mathbb{R}^N$ to one element of a finite set of K classes $\Omega = \{\Omega_1, \Omega_2, \ldots, \Omega_K\}$. The case of K = 2 defines a two class problem (dichotomous classification); $K > 2$ addresses a multiclass problem (cf. polychotomous classification, [Fri96]). Due to the uncertainty of statistical parameters in practical applications, the decision rule is constructed solely with the knowledge of the true class indices, $k_m \in \{1, 2, \ldots, K\}$, of M given training examples $S = \{(x_1, k_1), (x_2, k_2), \ldots, (x_M, k_M)\}$. ....

....k ; k) o denotes the set of training examples of class $\Omega_k$, and $M = M_1 + M_2 + \cdots + M_K$ for $|S_k| = M_k$. 1.1 Decomposition into Dichotomies Numerous works have shown that breaking down a multiclass problem into a series of two class problems can be advantageous [AMMR95, DB95, MM96, Fri96, GH97, Sch97a]. Decomposing a polychotomy into D dichotomies means defining D partitions $(\Omega_d^{+}, \Omega_d^{-})$ on $\Omega$. Each of these partitions defines a two class problem to be learned by the corresponding component classifier or plug in classifier (PiC), cf. [GH97]. The set ....

J.H. Friedman. Another approach to polychotomous classification, 1996.


Multi-class AdaBoost - Ji Zhu University   (Correct)

No context found.

Friedman, J. (2004), `Another approach to polychotomous classification', Machine Learning. To appear.


On Multiclass Active Learning with Support Vector Machines (2004)   (Correct)

No context found.

J. H. Friedman, `Another approach to polychotomous classification', Technical report, Department of Statistics, Stanford University, Stanford, CA, (1996).


CBSA: Content-based Soft Annotation for Multimodal Image.. - Chang, Goh, Sychay, Wu (2003)   (1 citation)  (Correct)

No context found.

J. Friedman. Another approach to polychotomous classification. Stanford University Technical Report, 1996.


Probability Estimates for Multi-class Classification by.. - Wu, Lin, Weng (2003)   (7 citations)  (Correct)

No context found.

J. Friedman. Another approach to polychotomous classification. Technical report, Department of Statistics, Stanford University, 1996. Available at http://www-stat.stanford.edu/reports/friedman/poly.ps.Z.


Identifying Painters From Color Profiles Of Skin Patches In Painting Images (2003)   (Correct)

No context found.

J. Friedman, "Another approach to polychotomous classification," Tech. Rep., Department of Statistics, Stanford University, 1996.


Handling Uncertain Labels in Multiclass Problems Using.. - Vannoorenberghe, Denoeux (2002)   (1 citation)  (Correct)

No context found.

J.H. Friedman. Another approach to polychotomous classification. Technical report, Department of Statistics, Stanford University, 1996.


On Tuning Hyper-Parameters of Multiclass Margin Classifiers - Passerini, Pontil, Frasconi (2002)   (Correct)

No context found.

Jerome H. Friedman, `Another approach to polychotomous classification', Technical report, Department of Statistics, Stanford University, (1997).


Support Vector Machines for Thai Phoneme Recognition - Thubthong, Kijsirikul (2001)   (Correct)

No context found.

J. H. Friedman, "Another approach to polychotomous classification", Technical report, Department of Statistics, Stanford, 1996.


Smoothing Spline Analysis Of Variance For Polychotomous Response.. - Lin (1998)   (Correct)

No context found.

J.H. Friedman. Another approach to polychotomous classification. Technical report, Department of Statistics, Stanford University, 1996.


Multicategory Support Vector Machines - Yi (2001)   (8 citations)  (Correct)

No context found.

Friedman, J. H. (1997). Another approach to polychotomous classification. Technical report, Department of Statistics, Stanford University.