Results 1–10 of 89

Mining multi-label data
In Data Mining and Knowledge Discovery Handbook, 2010
"... A large body of research in supervised learning deals with the analysis of singlelabel data, where training examples are associated with a single label λ from a set of disjoint labels L. However, training examples in several application domains are often associated with a set of labels Y ⊆ L. Such d ..."
Abstract

Cited by 92 (9 self)
 Add to MetaCart
(Show Context)
A large body of research in supervised learning deals with the analysis of single-label data, where training examples are associated with a single label λ from a set of disjoint labels L. However, training examples in several application domains are often associated with a set of labels Y ⊆ L. Such data are called multi-label.
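A minimal illustration of this representation, assuming scikit-learn's MultiLabelBinarizer is available (the label sets below are hypothetical):

```python
from sklearn.preprocessing import MultiLabelBinarizer

# Each training example carries a set of labels Y ⊆ L (hypothetical data).
label_sets = [{"sports"}, {"politics", "economy"}, {"sports", "economy"}]

# Encode the label sets as a binary indicator matrix: one column per label in L.
mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(label_sets)

print(mlb.classes_)  # ['economy' 'politics' 'sports']
print(Y)             # [[0 0 1]
                     #  [1 1 0]
                     #  [1 0 1]]
```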

Multi-instance multi-label learning
Artificial Intelligence
"... In this paper, we propose the MIML (MultiInstance MultiLabel learning) framework where an example is described by multiple instances and associated with multiple class labels. Compared to traditional learning frameworks, the MIML framework is more convenient and natural for representing complicate ..."
Abstract

Cited by 38 (16 self)
 Add to MetaCart
In this paper, we propose the MIML (Multi-Instance Multi-Label learning) framework, where an example is described by multiple instances and associated with multiple class labels. Compared to traditional learning frameworks, the MIML framework is more convenient and natural for representing complicated objects which have multiple semantic meanings. To learn from MIML examples, we propose the MimlBoost and MimlSvm algorithms based on a simple degeneration strategy, and experiments show that solving problems involving complicated objects with multiple semantic meanings in the MIML framework can lead to good performance. Considering that the degeneration process may lose information, we propose the DMimlSvm algorithm, which tackles MIML problems directly in a regularization framework. Moreover, we show that even when we do not have access to the real objects and thus cannot capture more information from real objects by using the MIML representation, MIML is still useful. We propose the InsDif and SubCod algorithms. InsDif works by transforming single instances into the MIML representation for learning, while SubCod works by transforming single-label examples into the MIML representation for learning. Experiments show that in some tasks they are able to achieve better performance than learning from the single instances or single-label examples directly.
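A sketch of how an MIML example might be represented: each example is a bag of feature vectors paired with a set of labels. The types and data below are illustrative assumptions, not the authors' code:

```python
from dataclasses import dataclass

@dataclass
class MIMLExample:
    """One example: a bag of instances plus a set of class labels."""
    instances: list[list[float]]  # multiple feature vectors per example
    labels: set[str]              # multiple class labels per example

# A hypothetical image described by three region-level instances
# and annotated with two semantic labels.
example = MIMLExample(
    instances=[[0.2, 0.7], [0.9, 0.1], [0.4, 0.4]],
    labels={"beach", "sky"},
)
```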

Budgeted Social Choice: From Consensus to Personalized Decision Making
Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence, 2011
"... We develop a general framework for social choice problems in which a limited number of alternatives can be recommended to an agent population. In our budgeted social choice model, this limit is determined by a budget, capturing problems that arise naturally in a variety of contexts, and spanning the ..."
Abstract

Cited by 31 (6 self)
 Add to MetaCart
(Show Context)
We develop a general framework for social choice problems in which a limited number of alternatives can be recommended to an agent population. In our budgeted social choice model, this limit is determined by a budget, capturing problems that arise naturally in a variety of contexts, and spanning the continuum from pure consensus decision making (i.e., standard social choice) to fully personalized recommendation. Our approach applies a form of segmentation to social choice problems, requiring the selection of diverse options tailored to different agent types, and generalizes certain multi-winner election schemes. We show that standard rank aggregation methods perform poorly, and that optimization in our model is NP-complete; but we develop fast greedy algorithms with some theoretical guarantees. Experiments on real-world datasets demonstrate the effectiveness of our algorithms.
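A hedged sketch of the kind of greedy slate selection the abstract alludes to: choose k alternatives one at a time, each time adding the option that most reduces total dissatisfaction. The scoring rule here (each voter's dissatisfaction is the rank of the best slate member in their own ranking) is an illustrative assumption, not the paper's exact objective:

```python
def greedy_slate(rankings: list[list[str]], k: int) -> list[str]:
    """Greedily choose k alternatives minimizing total voter dissatisfaction.

    rankings: one preference ranking per voter, best alternative first.
    A voter's dissatisfaction with a slate is the position (0-based) of
    their most-preferred slate member in their own ranking.
    """
    alternatives = set(rankings[0])
    slate: list[str] = []
    for _ in range(k):
        def cost(candidate: str) -> int:
            chosen = slate + [candidate]
            return sum(min(r.index(a) for a in chosen) for r in rankings)
        best = min(alternatives - set(slate), key=cost)  # ties broken arbitrarily
        slate.append(best)
    return slate

votes = [["a", "b", "c"], ["b", "c", "a"], ["c", "b", "a"]]
print(greedy_slate(votes, 2))  # e.g. ['b', 'a'] or ['b', 'c'] (tie on the 2nd pick)
```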

Random k-Labelsets for Multi-Label Classification
IEEE Transactions on Knowledge and Data Engineering, 2010
"... A simple yet effective multilabel learning method, called label powerset (LP), considers each distinct combination of labels that exist in the training set as a different class value of a singlelabel classification task. The computational efficiency and predictive performance of LP is challenged b ..."
Abstract

Cited by 30 (2 self)
 Add to MetaCart
A simple yet effective multi-label learning method, called label powerset (LP), considers each distinct combination of labels that exists in the training set as a different class value of a single-label classification task. The computational efficiency and predictive performance of LP are challenged by application domains with a large number of labels and training examples. In these cases, the number of classes may become very large while many classes are associated with very few training examples. To deal with these problems, this paper proposes breaking the initial set of labels into a number of small random subsets, called labelsets, and employing LP to train a corresponding classifier for each. The labelsets can be either disjoint or overlapping, depending on which of two strategies is used to construct them. The proposed method is called RAkEL (RAndom k labELsets), where k is a parameter that specifies the size of the subsets. Empirical evidence indicates that RAkEL improves substantially over LP, especially in domains with a large number of labels, and exhibits competitive performance against other high-performing multi-label learning methods.
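A compact sketch of the RAkEL idea under stated assumptions (a decision tree as the base single-label learner, per-label majority voting at threshold 0.5; this is a simplification, not the authors' implementation):

```python
import numpy as np
from numpy.random import default_rng
from sklearn.tree import DecisionTreeClassifier

def rakel_fit_predict(X, Y, X_test, k=3, m=10, seed=0):
    """RAkEL sketch: train m label-powerset (LP) models on random k-labelsets.

    X: (n, d) feature matrix; Y: (n, L) binary integer label matrix.
    Returns an (n_test, L) binary prediction matrix via per-label voting.
    """
    rng = default_rng(seed)
    L = Y.shape[1]
    votes = np.zeros((X_test.shape[0], L))   # positive votes per label
    counts = np.zeros(L)                     # how often each label was drawn
    for _ in range(m):
        subset = rng.choice(L, size=k, replace=False)  # one random k-labelset
        # LP step: encode each label combination on the subset as one class.
        classes = Y[:, subset].dot(1 << np.arange(k))
        clf = DecisionTreeClassifier(random_state=0).fit(X, classes)
        pred = clf.predict(X_test)
        # Decode the predicted class back into k label bits and tally votes.
        bits = (pred[:, None] >> np.arange(k)) & 1
        votes[:, subset] += bits
        counts[subset] += 1
    # A label is predicted when it wins more than half of its votes.
    return (votes > 0.5 * np.maximum(counts, 1)).astype(int)
```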

Multi-dimensional classification with Bayesian networks
International Journal of Approximate Reasoning, 2011
"... Multidimensional classification aims at finding a function that assigns a vector of class values to a given vector of features. In this paper, this problem is tackled by a general family of models, called multidimensional Bayesian network classifiers (MBCs). This probabilistic graphical model org ..."
Abstract

Cited by 24 (7 self)
 Add to MetaCart
(Show Context)
Multi-dimensional classification aims at finding a function that assigns a vector of class values to a given vector of features. In this paper, this problem is tackled by a general family of models, called multi-dimensional Bayesian network classifiers (MBCs). This probabilistic graphical model organizes class and feature variables as three different subgraphs: class subgraph, feature subgraph, and bridge (from class to features) subgraph. Under the standard 0-1 loss function, the most probable explanation (MPE) must be computed, for which we provide theoretical results in both general MBCs and in MBCs decomposable into maximal connected components. Moreover, when computing the MPE, the vector of class values is covered by following a special ordering (Gray code). Under other loss functions defined in accordance with a decomposable structure, we derive theoretical results on how to minimize the expected loss. Besides these inference issues, the paper presents flexible algorithms for learning MBC structures from data based on filter, wrapper and hybrid approaches. The cardinality of the search space is also given. New performance evaluation metrics adapted from the single-class setting are introduced. Experimental results with three benchmark data sets are encouraging, and the learned MBCs outperform state-of-the-art algorithms for multi-label classification.
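A minimal illustration of the MPE computation under 0-1 loss: enumerate all class vectors and pick the most probable one given the features. The toy factorization below is an assumption for illustration only; real MBCs exploit the graph structure rather than brute-force enumeration:

```python
from itertools import product

def mpe(class_domains, joint_prob, x):
    """Most probable explanation: argmax over class vectors c of P(c | x).

    class_domains: list of value lists, one per class variable.
    joint_prob(c, x): any function proportional to P(c, x). Enumeration
    is only feasible for a handful of class variables.
    """
    return max(product(*class_domains), key=lambda c: joint_prob(c, x))

# Hypothetical model with two binary class variables.
def toy_joint(c, x):
    c1, c2 = c
    # Assumed factorization P(c1) P(c2 | c1) P(x | c1, c2) for illustration.
    p_c1 = [0.6, 0.4][c1]
    p_c2 = [[0.7, 0.3], [0.2, 0.8]][c1][c2]
    p_x = 0.9 if x == c1 ^ c2 else 0.1
    return p_c1 * p_c2 * p_x

print(mpe([[0, 1], [0, 1]], toy_joint, x=1))  # -> (0, 1), the most probable pair
```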

An active learning algorithm for ranking from pairwise preferences with an almost optimal query complexity
2012

Label Ranking Algorithms: A Survey
Cited by 19 (0 self)
Label ranking is a complex prediction task where the goal is to map instances to a total order over a finite set of predefined labels. An interesting aspect of this problem is that it subsumes several supervised learning problems, such as multi-class prediction, multi-label classification and hierarchical classification. Unsurprisingly, there exists a plethora of label ranking algorithms in the literature due, in part, to this versatile nature of the problem. In this paper, we survey these algorithms.

The Unavailable Candidate Model: A Decision-Theoretic View of Social Choice
"... One of the fundamental problems in the theory of social choice is aggregating the rankings of a set of agents (or voters) into a consensus ranking. Rank aggregation has found application in a variety of computational contexts. However, the goal of constructing a consensus ranking rather than, say, a ..."
Abstract

Cited by 18 (5 self)
 Add to MetaCart
(Show Context)
One of the fundamental problems in the theory of social choice is aggregating the rankings of a set of agents (or voters) into a consensus ranking. Rank aggregation has found application in a variety of computational contexts. However, the goal of constructing a consensus ranking rather than, say, a single outcome (or winner) is often left unjustified, calling into question the suitability of classical rank aggregation methods. We introduce a novel model which offers a decision-theoretic motivation for constructing a consensus ranking. Our unavailable candidate model assumes that a consensus choice must be made, but that candidates may become unavailable after voters express their preferences. Roughly speaking, a consensus ranking serves as a compact, easily communicable representation of a decision policy that can be used to make choices in the face of uncertain candidate availability. We use this model to define a principled aggregation method that minimizes expected voter dissatisfaction with the chosen candidate. We give exact and approximation algorithms for computing optimal rankings and provide computational evidence for the effectiveness of a simple greedy scheme. We also describe strong connections to popular voting protocols such as the plurality rule and the Kemeny consensus, showing specifically that Kemeny produces optimal rankings in the unavailable candidate model under certain conditions.
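A sketch of the model's scoring rule as read from the abstract: given a consensus ranking and an independent availability probability p per candidate, the chosen candidate is the highest-ranked available one, and we accumulate each voter's expected dissatisfaction. The rank-based dissatisfaction measure is an assumption for illustration, not necessarily the paper's:

```python
def expected_dissatisfaction(consensus, voter_rankings, p=0.8):
    """Expected voter dissatisfaction under uncertain candidate availability.

    consensus: the proposed consensus ranking (best candidate first).
    voter_rankings: one ranking per voter; dissatisfaction with a chosen
    candidate is its 0-based position in the voter's own ranking.
    Each candidate is independently available with probability p; the
    (1-p)^n event that no candidate is available is ignored in this sketch.
    """
    total = 0.0
    survive = 1.0  # probability all higher-ranked candidates are unavailable
    for candidate in consensus:
        p_chosen = survive * p  # this candidate is the first available one
        for ranking in voter_rankings:
            total += p_chosen * ranking.index(candidate)
        survive *= 1.0 - p
    return total

votes = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"]]
print(expected_dissatisfaction(["b", "a", "c"], votes, p=0.8))  # 2.208
```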

Combining Predictions in Pairwise Classification: An Optimal Adaptive Voting Strategy and Its Relation to Weighted Voting
To appear in Pattern Recognition, 2009
"... Weighted voting is the commonly used strategy for combining predictions in pairwise classification. Even though it shows good classification performance in practice, it is often criticized for lacking a sound theoretical justification. In this paper, we study the problem of combining predictions wit ..."
Abstract

Cited by 13 (0 self)
 Add to MetaCart
Weighted voting is the commonly used strategy for combining predictions in pairwise classification. Even though it shows good classification performance in practice, it is often criticized for lacking a sound theoretical justification. In this paper, we study the problem of combining predictions within a formal framework of label ranking and, under some model assumptions, derive a generalized voting strategy in which predictions are properly adapted according to the strengths of the corresponding base classifiers. We call this strategy adaptive voting and show that it is optimal in the sense of yielding a MAP prediction of the class label of a test instance. Moreover, we offer a theoretical justification for weighted voting by showing that it yields a good approximation of the optimal adaptive voting prediction. This result is further corroborated by empirical evidence from experiments with real and synthetic data sets showing that, even though adaptive voting is sometimes able to achieve consistent improvements, weighted voting is in general quite competitive, all the more in cases where the aforementioned model assumptions underlying adaptive voting are not met. In this sense, weighted voting appears to be a more robust aggregation strategy.
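A minimal sketch of the weighted voting baseline the paper analyzes: each pairwise base classifier outputs an estimated probability that label i beats label j, and each label is scored by the sum of its pairwise estimates (the probability table below is hypothetical):

```python
import numpy as np

def weighted_voting(pairwise_probs: np.ndarray) -> int:
    """Predict a class from pairwise probability estimates.

    pairwise_probs[i, j] = estimated probability that label i is preferred
    to label j for the test instance (with p[j, i] = 1 - p[i, j]).
    Weighted voting scores each label by the sum of its pairwise estimates.
    """
    p = pairwise_probs.copy()
    np.fill_diagonal(p, 0.0)  # a label casts no vote against itself
    return int(np.argmax(p.sum(axis=1)))

# Hypothetical 3-class pairwise estimates for one test instance.
p = np.array([
    [0.5, 0.8, 0.4],
    [0.2, 0.5, 0.6],
    [0.6, 0.4, 0.5],
])
print(weighted_voting(p))  # scores: label 0 -> 1.2, label 1 -> 0.8, label 2 -> 1.0
```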

Predicting partial orders: ranking with abstention
In Machine Learning and Knowledge Discovery in Databases, 2010
"... Abstract. The prediction of structured outputs in general and rankings in particular has attracted considerable attention in machine learning in recent years, and different types of ranking problems have already been studied. In this paper, we propose a generalization or, say, relaxation of the stan ..."
Abstract

Cited by 13 (3 self)
 Add to MetaCart
(Show Context)
The prediction of structured outputs in general, and rankings in particular, has attracted considerable attention in machine learning in recent years, and different types of ranking problems have already been studied. In this paper, we propose a generalization or, say, relaxation of the standard setting, allowing a model to make predictions in the form of partial instead of total orders. We interpret such a prediction as a ranking with partial abstention: if the model is not sufficiently certain regarding the relative order of two alternatives and, therefore, cannot reliably decide whether the former should precede the latter or the other way around, it may abstain from this decision and instead declare these alternatives incomparable. We propose a general approach to ranking with partial abstention, as well as evaluation metrics for measuring the correctness and completeness of predictions. For two types of ranking problems, we show experimentally that this approach is able to achieve a reasonable trade-off between these two criteria.
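A hedged sketch of the abstention idea: predict "a precedes b" only when the model's pairwise confidence clears a threshold, and declare the pair incomparable otherwise. The simple thresholding scheme is an assumption; the paper's actual construction additionally enforces a consistent (transitive) partial order:

```python
def partial_order(pairwise_probs: dict[tuple[str, str], float], threshold=0.7):
    """Turn pairwise preference probabilities into a partial order.

    pairwise_probs[(a, b)] = model's probability that a should precede b.
    Pairs whose probability falls strictly between (1 - threshold) and
    threshold are left undecided: the model abstains on them.
    Note: the result is not guaranteed transitive in this sketch.
    """
    precedes, incomparable = [], []
    for (a, b), prob in pairwise_probs.items():
        if prob >= threshold:
            precedes.append((a, b))
        elif prob <= 1.0 - threshold:
            precedes.append((b, a))
        else:
            incomparable.append((a, b))
    return precedes, incomparable

probs = {("a", "b"): 0.9, ("a", "c"): 0.55, ("b", "c"): 0.2}
print(partial_order(probs))  # ([('a', 'b'), ('c', 'b')], [('a', 'c')])
```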