Results 1–10 of 15
Minimax Learning Rates for Bipartite Ranking and Plug-in Rules
Cited by 4 (2 self)
While it is now well-known in the standard binary classification setup that, under suitable margin assumptions and complexity conditions on the regression function, fast or even super-fast rates (i.e. rates faster than n^{-1/2} or even faster than n^{-1}) can be achieved by plug-in classifiers, no result of this nature has yet been proved in the context of bipartite ranking, though that problem is akin to classification. It is the main purpose of the present paper to investigate this issue, by considering bipartite ranking as a nested continuous collection of cost-sensitive classification problems. A global low-noise condition is exhibited under which certain (plug-in) ranking rules are proved to achieve fast (but not super-fast) rates over a wide nonparametric class of models. A lower bound is also stated in a specific situation, establishing that such rates are optimal from a minimax perspective.
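To make the plug-in idea above concrete: a plug-in ranking rule estimates the regression function η(x) = P(Y = 1 | X = x) and scores points by the estimate, so that ranking quality can be read off the empirical AUC. The sketch below uses a simple histogram regressor on [0, 1]; all function names and the binning choice are illustrative, not the estimators analysed in the paper.

```python
import random

def empirical_auc(scores_pos, scores_neg):
    # Fraction of (positive, negative) pairs ranked correctly; ties count 1/2.
    wins = sum((sp > sn) + 0.5 * (sp == sn)
               for sp in scores_pos for sn in scores_neg)
    return wins / (len(scores_pos) * len(scores_neg))

def plugin_scorer(xs, ys, n_bins=10):
    # Histogram estimate of eta(x) = P(Y=1 | X=x) on [0, 1];
    # the plug-in ranking rule scores each point by eta_hat(x).
    counts = [[0, 0] for _ in range(n_bins)]
    for x, y in zip(xs, ys):
        b = min(int(x * n_bins), n_bins - 1)
        counts[b][y] += 1
    eta_hat = [c[1] / (c[0] + c[1]) if (c[0] + c[1]) else 0.5 for c in counts]
    return lambda x: eta_hat[min(int(x * n_bins), n_bins - 1)]

# Toy data with eta(x) = x: the plug-in scorer should rank well above chance.
random.seed(0)
xs = [random.random() for _ in range(2000)]
ys = [1 if random.random() < x else 0 for x in xs]
score = plugin_scorer(xs, ys)
pos = [score(x) for x, y in zip(xs, ys) if y == 1]
neg = [score(x) for x, y in zip(xs, ys) if y == 0]
auc = empirical_auc(pos, neg)
```

The rates discussed in the abstract quantify how fast such an AUC deficit shrinks with the sample size under margin and smoothness conditions.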
Nested Support Vector Machines
, 2008
Cited by 3 (0 self)
The one-class and cost-sensitive support vector machines (SVMs) are state-of-the-art machine learning methods for estimating density level sets and solving weighted classification problems, respectively. However, the solutions of these SVMs do not necessarily produce set estimates that are nested as the parameters controlling the density level or cost asymmetry are continuously varied. Such a nesting constraint is desirable for applications requiring the simultaneous estimation of multiple sets, including clustering, anomaly detection, and ranking problems. We propose new quadratic programs whose solutions give rise to nested extensions of the one-class and cost-sensitive SVMs. Furthermore, like conventional SVMs, the solution paths in our construction are piecewise linear in the control parameters, with significantly fewer breakpoints. We also describe decomposition algorithms to solve the quadratic programs. These methods are compared to conventional SVMs on synthetic and benchmark data sets, and are shown to exhibit more stable rankings and decreased sensitivity to parameter settings.
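The nesting property that this abstract says conventional SVMs can violate is exactly the property one gets for free by thresholding a single fixed scoring function at varying levels. The minimal sketch below illustrates that baseline property (the paper's contribution is to enforce it for families of separately trained SVMs, which this sketch does not do):

```python
def level_set(scores, threshold):
    # Set of item indices whose score meets the threshold.
    return {i for i, s in enumerate(scores) if s >= threshold}

# Thresholding ONE fixed scoring function at decreasing levels always yields
# nested sets -- the constraint the paper's nested SVM quadratic programs
# impose across their density-level / cost-asymmetry parameters.
scores = [0.1, 0.4, 0.7, 0.9]
high = level_set(scores, 0.8)
low = level_set(scores, 0.3)
```

Independently trained one-class SVMs at different levels correspond to *different* scoring functions per level, which is why their set estimates need not nest.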
The false discovery rate for statistical pattern recognition
Abstract: The false discovery rate (FDR) and false nondiscovery rate (FNDR) have received considerable attention in the literature on multiple testing. These performance measures are also appropriate for classification, and in this work we develop generalization error analyses for FDR and FNDR when learning a classifier from labeled training data. Unlike more conventional classification performance measures, the empirical FDR and FNDR are not binomial random variables but rather ratios of binomials, which introduces challenges not present in conventional formulations of the classification problem. We develop distribution-free uniform deviation bounds and apply these to obtain finite-sample bounds and strong universal consistency. We also present a simulation study demonstrating the merits of variance-based bounds, which we also develop. In the context of multiple testing with FDR/FNDR, our framework may be viewed as a way to leverage training data to achieve distribution-free, asymptotically ...
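The "ratio of binomials" point above is easy to see in code: both empirical quantities divide one random count by a sum of random counts, so even their denominators fluctuate with the sample. A minimal sketch of the two estimators (the zero-denominator convention here is an assumption, not the paper's):

```python
def empirical_fdr_fndr(y_true, y_pred):
    # FDR  = FP / (FP + TP): false discoveries among declared positives.
    # FNDR = FN / (FN + TN): false non-discoveries among declared negatives.
    # Both are ratios of binomial counts; the random denominators are the
    # source of the technical difficulty the paper analyses.
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fdr = fp / (fp + tp) if fp + tp else 0.0   # convention: 0 if no discoveries
    fndr = fn / (fn + tn) if fn + tn else 0.0
    return fdr, fndr

fdr, fndr = empirical_fdr_fndr([1, 1, 0, 0], [1, 0, 1, 0])
```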
Adaptive partitioning schemes for bipartite ranking: How to grow and prune a ranking tree (Machine Learning manuscript)
, 2009
Batch and online learning algorithms for non-convex Neyman-Pearson classification
We describe and evaluate two algorithms for the Neyman-Pearson (NP) classification problem, which has recently been shown to be of particular importance for bipartite ranking problems. NP classification is a non-convex problem involving a constraint on the false negative rate. We investigate a batch algorithm based on DC programming and a stochastic gradient method well suited to large-scale datasets. Empirical evidence illustrates the potential of the proposed methods.
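To give a feel for the stochastic-gradient side of the abstract above, here is a heavily simplified sketch: the hard constraint on the false negative rate is replaced by a fixed asymmetric penalty `lam` on the positive-class logistic loss, on 1-D data. This is a toy stand-in, not the paper's DC-programming or constrained stochastic scheme; every name and parameter is illustrative.

```python
import math

def np_sgd(xs, ys, lam=5.0, epochs=100, lr=0.1):
    # SGD on an asymmetric logistic surrogate: positives (whose errors are
    # false negatives) are up-weighted by lam, pushing the false negative
    # rate down -- a crude proxy for the Neyman-Pearson constraint.
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))  # sigmoid of the margin
            grad = lam * (p - 1.0) if y == 1 else p   # d(loss)/d(margin)
            w -= lr * grad * x
            b -= lr * grad
    return w, b

# Separable toy data: negatives in [-1.5, -0.55], positives in [0.5, 1.45].
xs = [-1.5 + 0.05 * i for i in range(20)] + [0.5 + 0.05 * i for i in range(20)]
ys = [0] * 20 + [1] * 20
w, b = np_sgd(xs, ys)
false_negatives = sum(1 for x, y in zip(xs, ys) if y == 1 and w * x + b <= 0)
```

With a fixed penalty the false negative rate is only controlled indirectly; meeting an exact constraint is what makes the problem non-convex and motivates the DC approach.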
Ranking Forests
Abstract—It is the goal of this paper to examine how the aggregation and feature randomization principles underlying the RANDOM FOREST algorithm [1], originally proposed in the classification/regression setup, can be adapted to bipartite ranking, in order to increase the performance of scoring rules produced by the TREERANK algorithm [2], a recently developed tree induction method specifically tailored for this global learning problem. Since TREERANK may be viewed as a recursive implementation of a cost-sensitive version of the popular classification algorithm CART [3], with a cost locally depending on the data lying within the node to split, various strategies can be considered for "randomizing" the features involved in the tree-growing stage. In parallel, several ways of combining/averaging ranking trees may be used, including techniques inspired by rank aggregation methods recently popularized in Web applications. Ranking procedures based on such approaches are called RANKING FORESTS. Beyond preliminary theoretical background, results of experiments based on simulated data are provided in order to give evidence of their statistical performance. Index Terms—Bipartite ranking, data with binary labels, ROC optimization, AUC criterion, tree-based ranking rules, bootstrap, bagging, rank aggregation, median procedure, feature randomization. (hal-00452577, version 1, 2 Feb 2010)
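One of the simplest combining schemes in the spirit of the abstract above is to average, item by item, the ranks assigned by several randomized scoring rules. The sketch below implements only that baseline; the paper's actual RANKING FORESTS procedures (including Kendall-tau median aggregation) are more involved, and the function name is an assumption of this sketch.

```python
def average_rank_aggregation(score_lists):
    # Each element of score_lists is one scoring rule's scores for the same
    # n items; aggregate by averaging the rank each rule assigns to an item.
    n = len(score_lists[0])
    total = [0.0] * n
    for scores in score_lists:
        order = sorted(range(n), key=lambda i: scores[i])  # ascending scores
        for rank, i in enumerate(order):
            total[i] += rank
    return [t / len(score_lists) for t in total]

# Two scorers that agree on the ordering yield the same ordering back.
agg = average_rank_aggregation([[0.1, 0.5, 0.9], [0.2, 0.4, 0.8]])
```

Averaging ranks rather than raw scores makes the aggregate invariant to monotone rescaling of any individual tree's scores, which matters because only the induced order of a scoring rule is meaningful for AUC.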
Empirical risk minimization with statistics of higher order
, 2009
Contents
, 2012
Abstract: One main focus of learning theory is to find optimal rates of convergence. In classification, it is possible to obtain optimal fast rates (faster than n^{-1/2}) in a minimax sense. Moreover, using an aggregation procedure, the algorithms are adaptive to the parameters of the class of distributions. Here, we investigate this issue in the bipartite ranking framework. We design a ranking rule by aggregating estimators of the regression function. We use exponential weights based on the empirical ranking risk. Under several assumptions on the class of distributions, we show that this procedure is adaptive to the margin parameter and the smoothness parameter and achieves the same rates as in the classification framework. Moreover, we state a minimax lower bound that establishes the optimality of the ...
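The exponential-weight step mentioned above has a standard form: each candidate estimator k receives weight proportional to exp(-R_k / T), where R_k is its empirical ranking risk. A minimal sketch (the temperature parameter and its scaling with the sample size are assumptions of this sketch, not the paper's tuned choice):

```python
import math

def exponential_weights(risks, temperature=1.0):
    # w_k proportional to exp(-risk_k / T): lower empirical ranking risk
    # gives exponentially larger weight; weights are normalised to sum to 1.
    ws = [math.exp(-r / temperature) for r in risks]
    z = sum(ws)
    return [w / z for w in ws]

weights = exponential_weights([0.10, 0.50, 0.45])
```

The aggregated ranking rule then scores a point by the weighted combination of the candidate regression estimates, which is what makes the procedure adaptive: no single candidate has to be selected in advance.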
Ranking MultiClass Data: Optimality and Pairwise Aggregation
, 2011
Abstract: It is the primary purpose of this paper to set the goals of ranking in a multiple-class context rigorously, following in the footsteps of recent results in the bipartite framework. Under specific likelihood-ratio monotonicity conditions, optimal solutions for this global learning problem are described in the ordinal situation, i.e. when there exists a natural order on the set of labels. Criteria reflecting ranking performance under these conditions, such as the ROC surface and its natural summary, the volume under the ROC surface (VUS), are next considered as targets for empirical optimization. Whereas plug-in techniques or the Empirical Risk Maximization principle can then be easily extended to the ordinal multi-class setting, reducing the K-partite ranking task to the solving of a collection of bipartite ranking problems, following in the footsteps of the pairwise comparison approach in classification, is in contrast more challenging. Here we consider a concept of ranking rule consensus based on the Kendall τ distance and show that, when it exists and is based on consistent ranking rules for the bipartite ranking subproblems defined by all consecutive pairs of labels, it forms a consistent ranking rule in the VUS sense under adequate conditions. This result paves the way for extending the use of recently developed learning algorithms, tailored for bipartite ranking, to multi-class data in a valid theoretical framework. Preliminary experimental results are presented for illustration purposes.
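The Kendall τ distance underlying the consensus notion above simply counts the item pairs on which two rankings disagree; the consensus ("median") ranking minimises the sum of these distances to the bipartite sub-rankings. A minimal sketch of the distance itself (an O(n²) pair count, fine for illustration):

```python
def kendall_tau_distance(perm_a, perm_b):
    # Count pairs (i, j) ordered one way by perm_a and the other by perm_b.
    pos_b = {item: i for i, item in enumerate(perm_b)}
    n = len(perm_a)
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if pos_b[perm_a[i]] > pos_b[perm_a[j]])

d_same = kendall_tau_distance([1, 2, 3], [1, 2, 3])
d_rev = kendall_tau_distance([1, 2, 3], [3, 2, 1])
```

Finding an exact median ranking under this distance is computationally hard in general, which is part of why the paper's existence and consistency conditions for the consensus rule are non-trivial.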