MaltParser: A language-independent system for data-driven dependency parsing
- In Proc. of the Fourth Workshop on Treebanks and Linguistic Theories, 2005
A hybrid learning system for recognizing user tasks from desktop activities and email messages
- In Proc. of IUI-06, 2006
- Cited by 34 (10 self)
The TaskTracer system seeks to help multi-tasking users manage the resources that they create and access while carrying out their work activities. It does this by associating with each user-defined activity the set of files, folders, email messages, contacts, and web pages that the user accesses when performing that activity. The initial TaskTracer system relies on the user to notify the system each time the user changes activities. However, this is burdensome, and users often forget to tell TaskTracer what activity they are working on. This paper introduces TaskPredictor, a machine learning system that attempts to predict the user’s current activity. TaskPredictor has two components: one for general desktop activity and another specifically for email. TaskPredictor achieves high prediction precision by combining three techniques: (a) feature selection via mutual information, (b) classification based on a confidence threshold, and (c) a hybrid design in which a Naive Bayes classifier estimates the classification confidence but where the actual classification decision is made by a support vector machine. This paper provides experimental results on data collected from TaskTracer users.
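The confidence-threshold and hybrid ideas in this abstract can be illustrated with a minimal sketch. It assumes two already-trained models passed in as plain callables: `confidence_model` stands in for the Naive Bayes confidence estimate and `decision_model` for the support vector machine. The function name, the toy models, and the threshold value are hypothetical and are not taken from the paper.

```python
from typing import Callable, Optional, Sequence


def hybrid_predict(
    features: Sequence[float],
    confidence_model: Callable[[Sequence[float]], float],
    decision_model: Callable[[Sequence[float]], str],
    threshold: float = 0.9,
) -> Optional[str]:
    """Return the decision model's label only if confidence reaches the threshold.

    Returning None means "abstain": no prediction is made for this example.
    """
    confidence = confidence_model(features)
    if confidence < threshold:
        return None  # too uncertain, so stay silent rather than risk an error
    return decision_model(features)


# Toy stand-ins (assumptions for illustration only, not trained classifiers).
def toy_confidence(features: Sequence[float]) -> float:
    # Share of the largest feature value, used here as a fake confidence score.
    total = sum(features)
    return max(features) / total if total else 0.0


def toy_decision(features: Sequence[float]) -> str:
    # Fake two-class decision rule over hypothetical activity names.
    return "email" if features[0] >= features[1] else "coding"


confident = hybrid_predict([9.0, 1.0], toy_confidence, toy_decision, threshold=0.8)
uncertain = hybrid_predict([5.0, 5.0], toy_confidence, toy_decision, threshold=0.8)
print(confident, uncertain)
```

Raising the threshold trades coverage for precision: fewer examples receive a prediction, but those that do are the ones the confidence model is most sure about.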
Combining SVMs with various feature selection strategies
- Taiwan University, 2005
- Cited by 33 (0 self)
Feature selection is an important issue in many research areas. Reasons for selecting important features include reducing the learning time and improving the accuracy. This thesis investigates the performance of combining support vector machines (SVMs) with various feature selection strategies. The first part of the thesis mainly describes the existing feature selection methods and our experience using those methods in a competition. The second part studies more feature selection strategies using the SVM.
Preference learning with Gaussian processes
- In Proc. ICML 2005, 2005
- Cited by 27 (1 self)
In this paper, we propose a probabilistic kernel approach to preference learning based on Gaussian processes. A new likelihood function is proposed to capture the preference relations in the Bayesian framework. The generalized formulation is also applicable to many multiclass problems. The overall approach has the advantages of Bayesian methods for model selection and probabilistic prediction. Experimental results compared against the constraint classification approach on several benchmark datasets verify the usefulness of this algorithm.
Label Ranking by Learning Pairwise Preferences
- Cited by 20 (8 self)
Preference learning is an emerging topic that appears in different guises in the recent literature. This work focuses on a particular learning scenario called label ranking, where the problem is to learn a mapping from instances to rankings over a finite number of labels. Our approach for learning such a mapping, called ranking by pairwise comparison (RPC), first induces a binary preference relation from suitable training data using a natural extension of pairwise classification. A ranking is then derived from the preference relation thus obtained by means of a ranking procedure, whereby different ranking methods can be used for minimizing different loss functions. In particular, we show that a simple (weighted) voting strategy minimizes risk with respect to the well-known Spearman rank correlation. We compare RPC to existing label ranking methods, which are based on scoring individual labels instead of comparing pairs of labels. Both empirically and theoretically, it is shown that RPC is superior in terms of computational efficiency, and at least competitive in terms of accuracy.
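The weighted voting step mentioned in this abstract can be sketched as follows for a single, fixed instance. The sketch assumes a hypothetical function `prefer(a, b)` that returns a value in [0, 1] expressing how strongly label `a` is preferred over label `b`; in the paper such values come from learned pairwise models, which are not reproduced here.

```python
from itertools import combinations
from typing import Callable, Dict, List, Sequence


def rank_by_weighted_voting(
    labels: Sequence[str],
    prefer: Callable[[str, str], float],
) -> List[str]:
    """Rank labels by the sum of their pairwise preference degrees."""
    scores: Dict[str, float] = {label: 0.0 for label in labels}
    for a, b in combinations(labels, 2):
        degree = prefer(a, b)        # support for "a before b"
        scores[a] += degree
        scores[b] += 1.0 - degree    # the remainder supports "b before a"
    # Highest total vote first; ties keep the input order (sorted is stable).
    return sorted(labels, key=lambda label: scores[label], reverse=True)


# Toy preference function built from made-up label strengths (an assumption).
strength = {"x": 1.0, "y": 3.0, "z": 2.0}


def toy_prefer(a: str, b: str) -> float:
    return strength[a] / (strength[a] + strength[b])


print(rank_by_weighted_voting(["x", "y", "z"], toy_prefer))
```

Each pair of labels is visited once, so the number of calls to `prefer` grows quadratically with the number of labels.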
Generalized Bradley-Terry models and multi-class probability estimates
- Journal of Machine Learning Research
- Cited by 15 (1 self)
The Bradley-Terry model for obtaining individual skill from paired comparisons has been popular in many areas. In machine learning, this model is related to multi-class probability estimates by coupling all pairwise classification results. Error correcting output codes (ECOC) are a general framework to decompose a multi-class problem into several binary problems. To obtain probability estimates under this framework, this paper introduces a generalized Bradley-Terry model in which paired individual comparisons are extended to paired team comparisons. We propose a simple algorithm with convergence proofs to solve the model and obtain individual skill. Experiments on synthetic and real data demonstrate that the algorithm is useful for obtaining multi-class probability estimates. Moreover, we discuss four extensions of the proposed model: 1) weighted individual skill, 2) home-field advantage, 3) ties, and 4) comparisons with more than two teams. Keywords: Bradley-Terry model, Probability estimates, Error correcting output codes, Support Vector Machines
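For orientation, the standard (non-generalized) Bradley-Terry model assumes that individual i beats individual j with probability p_i / (p_i + p_j). The sketch below estimates the skills p with the classical fixed-point iteration for that individual-versus-individual case. It is not the paper's algorithm for team comparisons, and the function name and iteration count are illustrative assumptions.

```python
from typing import List


def bradley_terry(wins: List[List[float]], iterations: int = 200) -> List[float]:
    """Estimate skills from wins[i][j], the number of times i beat j.

    Returns non-negative skills that sum to 1.
    """
    n = len(wins)
    skills = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            total_wins = sum(wins[i][j] for j in range(n) if j != i)
            # Games between i and j, weighted by the current skill estimates.
            denom = sum(
                (wins[i][j] + wins[j][i]) / (skills[i] + skills[j])
                for j in range(n)
                if j != i
            )
            # An individual with no games keeps its previous estimate.
            updated.append(total_wins / denom if denom > 0 else skills[i])
        norm = sum(updated)
        skills = [value / norm for value in updated]
    return skills


# Player 0 beat player 1 three times and lost once.
print(bradley_terry([[0, 3], [1, 0]]))
```

With only two players the estimate is simply the observed win fraction; the iteration matters once three or more players are compared on overlapping sets of games.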
Efficient pairwise classification
- ECML 2007. LNCS (LNAI, 2007
- Cited by 15 (8 self)
Pairwise classification is a class binarization procedure that converts a multi-class problem into a series of two-class problems, one problem for each pair of classes. While it can be shown that for training, this procedure is more efficient than the more commonly used one-against-all approach, it still has to evaluate a quadratic number of classifiers when computing the predicted class for a given example. In this paper, we propose a method that allows a faster computation of the predicted class when weighted or unweighted voting is used for combining the predictions of the individual classifiers. While its worst-case complexity is still quadratic in the number of classes, we show that even in the case of completely random base classifiers, our method still outperforms the conventional pairwise classifier. For the more practical case of well-trained base classifiers, its asymptotic computational complexity seems to be almost linear.
Mining Statistically Important Equivalence Classes and Delta-Discriminative Emerging Patterns
- 2007
- Cited by 13 (1 self)
The support-confidence framework is the most common measure used in itemset mining algorithms, because its antimonotonicity effectively simplifies the search lattice. This computational convenience brings both quality and statistical flaws to the results, as observed by many previous studies. In this paper, we introduce a novel algorithm that produces itemsets with ranked statistical merits under sophisticated test statistics such as chi-square, risk ratio, odds ratio, etc. Our algorithm is based on the concept of equivalence classes. An equivalence class is a set of frequent itemsets that always occur together in the same set of transactions. Therefore, itemsets within an equivalence class all share the same level of statistical significance regardless of the variety of test statistics. As an equivalence class can be uniquely determined and concisely represented by a closed pattern and a set of generators, we just mine closed patterns and generators, taking a simultaneous depth-first search scheme. This parallel approach has not been exploited by any prior work. We evaluate our algorithm on two aspects. In general, we compare to LCM and FPclose, which are the best algorithms tailored for mining only closed patterns. In particular, we compare to epMiner, which is the most recent algorithm for mining a type of relative risk patterns, known as minimal emerging patterns. Experimental results show that our algorithm is faster than all of them, sometimes even multiple orders of magnitude faster. These statistically ranked patterns and the efficiency have a high potential for real-life applications, especially in biomedical and financial fields where classical test statistics are of dominant interest.
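The notion of an equivalence class described in this abstract can be made concrete with a brute-force sketch. It enumerates every itemset, groups itemsets by the set of transactions that contain them, and reports the closed pattern (the largest itemset in the group) and the generators (the minimal itemsets in the group). This exhaustive enumeration is exponential in the number of items and is an illustration only; it is not the depth-first algorithm the paper proposes, and all names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, FrozenSet, List, Set


def equivalence_classes(
    transactions: List[Set[str]], min_support: int = 1
) -> List[Dict[str, object]]:
    """Group frequent itemsets that occur in exactly the same transactions."""
    items = sorted(set().union(*transactions))
    groups: Dict[FrozenSet[int], List[FrozenSet[str]]] = defaultdict(list)
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            # Indices of the transactions that contain every item of the itemset.
            tids = frozenset(
                tid for tid, t in enumerate(transactions) if set(itemset) <= t
            )
            if len(tids) >= min_support:
                groups[tids].append(frozenset(itemset))
    classes = []
    for tids, itemsets in groups.items():
        closed = max(itemsets, key=len)  # unique maximal itemset of the class
        generators = [
            s for s in itemsets if not any(other < s for other in itemsets)
        ]
        classes.append({"tids": tids, "closed": closed, "generators": generators})
    return classes


example = [{"a", "b", "c"}, {"a", "b"}, {"c"}]
for cls in equivalence_classes(example, min_support=2):
    print(sorted(cls["tids"]), sorted(cls["closed"]),
          [sorted(g) for g in cls["generators"]])
```

Because every itemset in a class covers the same transactions, any statistic computed from those transactions is identical across the class, which is why one closed pattern plus its generators suffices to represent it.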
Discriminative classifiers with adaptive kernels for noise robust speech recognition
- Comput. Speech Lang, 2010
- Cited by 12 (10 self)
Discriminative classifiers are a popular approach to solving classification problems. However, one of the problems with these approaches, in particular kernel-based classifiers such as Support Vector Machines (SVMs), is that they are hard to adapt to mismatches between the training and test data. This paper describes a scheme for overcoming this problem for speech recognition in noise by adapting the kernel rather than the SVM decision boundary. Generative kernels, defined using generative models, are one type of kernel that allows SVMs to handle sequence data. By compensating the parameters of the generative models for each noise condition, noise-specific generative kernels can be obtained. These can be used to train a noise-independent SVM on a range of noise conditions, which can then be used with a test-set noise kernel for classification. The noise-specific kernels used in this paper are based on Vector Taylor Series (VTS) model-based compensation. VTS allows all the model parameters to be compensated and the background noise to be estimated in a maximum likelihood fashion. A brief discussion of VTS, and the optimisation of the mismatch function representing the impact of noise on the clean speech, is also included. Experiments using these VTS-based test-set noise kernels were run on the AURORA 2 continuous digit task. The proposed SVM rescoring scheme yields large gains in performance over the VTS compensated models. Key words: speech recognition, noise robustness, support vector machines, generative kernels
Comparison of Ranking Procedures in Pairwise Preference Learning
- In Proceedings of the 10th International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems (IPMU-04), 2004
- Cited by 10 (9 self)
Computational methods for discovering the preferences of individuals are useful in many applications. In this paper, we propose a method for learning valued preference structures, using a natural extension of so-called pairwise classification. A valued preference structure can then be used to induce a ranking, that is, a linear ordering of a given set of alternatives. This step is realized by means of a so-called ranking procedure. In the second part of the paper, we compare the performance of alternative ranking procedures experimentally.

