Cost curves: an improved method for visualizing classifier performance
- Machine Learning, 2006
Cited by 27 (5 self)
Abstract. This paper introduces cost curves, a graphical technique for visualizing the performance (error rate or expected cost) of 2-class classifiers over the full range of possible class distributions and misclassification costs. Cost curves are shown to be superior to ROC curves for visualizing classifier performance for most purposes. This is because they visually support several crucial types of performance assessment that cannot be done easily with ROC curves, such as showing confidence intervals on a classifier’s performance, and visualizing the statistical significance of the difference in performance of two classifiers. A software tool supporting all the cost curve analysis described in this paper is available from the authors.
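As a rough illustration of the quantity a cost curve plots, the sketch below assumes the definition commonly given in the cost-curve literature (it is not stated in the abstract above): a classifier with false positive rate FPR and false negative rate FNR appears as the straight line NEC = FNR * pc + FPR * (1 - pc), where pc in [0, 1] is a probability-cost value combining the positive-class prior with the two misclassification costs. The function names are hypothetical and are not taken from the authors' software tool.

```python
def probability_cost(p_pos, cost_fn, cost_fp):
    """Assumed x-axis value: positive prior weighted by the cost of a false negative."""
    weighted_pos = p_pos * cost_fn
    weighted_neg = (1.0 - p_pos) * cost_fp
    return weighted_pos / (weighted_pos + weighted_neg)


def normalized_expected_cost(fpr, fnr, pc):
    """Assumed y-axis value: a straight line from FPR (pc = 0) to FNR (pc = 1)."""
    return fnr * pc + fpr * (1.0 - pc)


def cost_line(fpr, fnr, steps=10):
    """Sample one classifier's cost line at evenly spaced pc values."""
    return [(i / steps, normalized_expected_cost(fpr, fnr, i / steps))
            for i in range(steps + 1)]


if __name__ == "__main__":
    # Hypothetical classifier with FPR = 0.2 and FNR = 0.4.
    pc = probability_cost(p_pos=0.5, cost_fn=1.0, cost_fp=1.0)
    print(pc, normalized_expected_cost(0.2, 0.4, pc))
```

With equal priors and equal costs, pc is 0.5 and the normalized expected cost reduces to the plain error rate of the classifier.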
Discriminative keyword spotting
- In Proc. of Workshop on Non-Linear Speech Processing, 2007
Cited by 8 (6 self)
This paper proposes a new approach for keyword spotting, which is not based on HMMs. The proposed method employs a new discriminative learning procedure, in which the learning phase aims at maximizing the area under the ROC curve,
Threshold Selection for Unsupervised Detection, with an Application to Microphone Arrays
- 2005
Cited by 5 (5 self)
Detection is usually done by comparing some criterion to a threshold. It is often desirable to keep a performance metric such as False Alarm Rate (FAR) constant across conditions. Using training data to select the threshold may lead to suboptimal results on test data recorded in different conditions. This paper investigates unsupervised approaches, where no training data is used. A probabilistic model is fitted on the test data using the EM algorithm, and the threshold value is selected based on the model. The proposed approach (1) does not use training data, (2) uses the test data itself to compensate for simplifications inherent to the model, and (3) permits the use of more complex models in a straightforward manner. On a microphone array speech detection task, the proposed unsupervised approach achieves similar or better results than the “training” approach. The methodology is general and may be applied to contexts other than microphone arrays, and to performance metrics other than FAR.
A Kernel Trick For Sequences Applied to Text-Independent Speaker Verification Systems
Cited by 4 (1 self)
This paper presents a principled SVM-based speaker verification system. We propose a new framework and a new sequence kernel that can make use of any Mercer kernel at the frame level. An extension of the sequence kernel based on the Max operator is also proposed. The new system is compared to state-of-the-art GMM and other SVM-based systems found in the literature on the Banca and Polyvar databases. Most of the time, the new system outperforms the other systems, and the improvement is statistically significant. Finally, the newly proposed framework clarifies previous SVM-based systems and suggests interesting future research directions.
New algorithms for optimizing multi-class classifiers via ROC surfaces
- In Proceedings of the ICML 2006 Workshop on ROC Analysis in Machine Learning, 2006
Cited by 4 (2 self)
We study the problem of optimizing a multiclass classifier based on its ROC hypersurface and a matrix describing the costs of each type of prediction error. For a binary classifier, it is straightforward to find an optimal operating point based on its ROC curve and the relative cost of true positive to false positive error. However, the corresponding multiclass problem (finding an optimal operating point based on a ROC hypersurface and cost matrix) is more challenging. We present several heuristics for this problem, including linear and nonlinear programming formulations, genetic algorithms, and a customized algorithm. Empirical results suggest that genetic algorithms fare the best overall, improving performance most often.
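The binary case that the abstract calls straightforward can be sketched as follows. This is a minimal illustration, not the paper's algorithm: it assumes the ROC curve is given as a list of (FPR, TPR) points, that the positive-class prior is known, and that costs are attached to false negatives and false positives. All names are hypothetical.

```python
def expected_cost(fpr, tpr, p_pos, cost_fn, cost_fp):
    """Expected misclassification cost at one ROC point (assumed cost model)."""
    miss_cost = p_pos * (1.0 - tpr) * cost_fn
    false_alarm_cost = (1.0 - p_pos) * fpr * cost_fp
    return miss_cost + false_alarm_cost


def best_operating_point(roc_points, p_pos, cost_fn, cost_fp):
    """Return the (fpr, tpr) pair with the lowest expected cost."""
    return min(
        roc_points,
        key=lambda pt: expected_cost(pt[0], pt[1], p_pos, cost_fn, cost_fp),
    )


if __name__ == "__main__":
    # Hypothetical ROC curve with four operating points.
    roc = [(0.0, 0.0), (0.1, 0.6), (0.3, 0.9), (1.0, 1.0)]
    print(best_operating_point(roc, p_pos=0.5, cost_fn=1.0, cost_fp=1.0))
    print(best_operating_point(roc, p_pos=0.5, cost_fn=1.0, cost_fp=10.0))
```

Raising the false positive cost moves the selected point toward the low-FPR end of the curve. In the multiclass setting the same exhaustive scan is no longer practical, which is the difficulty the abstract addresses.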
Short-Term Spatio–Temporal Clustering Applied to Multiple Moving Speakers
Cited by 3 (0 self)
Abstract—Distant microphones make it possible to process spontaneous multiparty speech with very few constraints on speakers, as opposed to close-talking microphones. Minimizing the constraints on speakers permits a large diversity of applications, including meeting summarization and browsing, surveillance, hearing aids, and more natural human–machine interaction. Such applications of distant microphones require determining where and when the speakers are talking. This is inherently a multisource problem, because of background noise sources, as well as the natural tendency of multiple speakers to talk over each other. Moreover, spontaneous speech utterances are highly discontinuous, which makes it difficult to track the multiple speakers with classical filtering approaches, such as Kalman filtering or particle filters. As an alternative, this paper proposes a probabilistic framework to determine the trajectories of multiple moving speakers in the short-term only, i.e., only while they speak. Instantaneous location estimates that are close in space and time are grouped into “short-term clusters” in a principled manner. Each short-term cluster determines the precise start and end times of an utterance and a short-term spatial trajectory. Contrastive experiments clearly show the benefit of using short-term clustering, on real indoor recordings with seated speakers in meetings, as well as multiple moving speakers. Index Terms—Localization, multiple acoustic sources, short-term clustering, speech segmentation, tracking.
Spectral Subband Centroids as Complementary Features for Speaker Authentication
- 2004
Cited by 2 (0 self)
Most conventional features used in speaker authentication are based on estimation of spectral envelopes in one way or another, e.g., Mel-scale Filterbank Cepstrum Coefficients (MFCCs), Linear-scale Filterbank Cepstrum Coefficients (LFCCs) and Relative Spectral Perceptual Linear Prediction (RASTA-PLP). In this study, Spectral Subband Centroids (SSCs) are examined. These features are the centroid frequency in each subband. They have properties similar to formant frequencies but are limited to a given subband. Empirical experiments carried out on the NIST2001 database using SSCs, MFCCs, LFCCs and their combinations by concatenation suggest that SSCs are somewhat more robust compared to conventional MFCC and LFCC features as well as being partially complementary.
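To make "the centroid frequency in each subband" concrete, the sketch below computes a power-weighted mean frequency per band. It is an assumption-laden illustration rather than the paper's exact feature extraction: the rectangular band edges, the optional exponent `gamma` applied to the power spectrum, and the midpoint fallback for an empty band are all hypothetical choices.

```python
def subband_centroids(freqs, power, band_edges, gamma=1.0):
    """Power-weighted mean frequency in each band [edge_i, edge_{i+1})."""
    centroids = []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        weighted_sum = 0.0
        total_weight = 0.0
        for f, p in zip(freqs, power):
            if lo <= f < hi:
                w = p ** gamma
                weighted_sum += f * w
                total_weight += w
        if total_weight > 0.0:
            centroids.append(weighted_sum / total_weight)
        else:
            # Assumed fallback: use the band midpoint when the band has no energy.
            centroids.append(0.5 * (lo + hi))
    return centroids


if __name__ == "__main__":
    # Hypothetical four-bin power spectrum split into two subbands.
    freqs = [100.0, 200.0, 300.0, 400.0]
    power = [1.0, 3.0, 2.0, 2.0]
    print(subband_centroids(freqs, power, band_edges=[0.0, 250.0, 500.0]))
```

Each centroid stays inside its own band by construction, which matches the abstract's remark that the features behave like formant frequencies limited to a given subband.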
Score Fusion by Maximizing the Area under the ROC Curve
Cited by 2 (2 self)
Abstract. Information fusion is currently a very active research topic aimed at improving the performance of biometric systems. This paper proposes a novel method for optimizing the parameters of a score fusion model based on maximizing an index related to the Area Under the ROC Curve. This approach has the convenience that the fusion parameters are learned without having to specify the client and impostor priors or the costs for the different errors. Empirical results on several datasets show the effectiveness of the proposed approach.
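The reason no priors or costs are needed can be seen from the pairwise form of the AUC: it is the fraction of (client, impostor) pairs in which the client scores higher, with ties counted as one half. The sketch below evaluates that quantity for a weighted-sum fusion of two scores. The linear fusion model and all names are assumptions for illustration; the abstract does not specify the paper's fusion model or its optimized index.

```python
def auc(client_scores, impostor_scores):
    """Pairwise AUC estimate: share of client/impostor pairs ranked correctly."""
    wins = 0.0
    for c in client_scores:
        for i in impostor_scores:
            if c > i:
                wins += 1.0
            elif c == i:
                wins += 0.5
    return wins / (len(client_scores) * len(impostor_scores))


def fuse(score_vector, weights):
    """Hypothetical fusion model: weighted sum of the individual scores."""
    return sum(w * s for w, s in zip(weights, score_vector))


def fused_auc(clients, impostors, weights):
    """AUC of the fused scores for a given weight vector."""
    return auc([fuse(v, weights) for v in clients],
               [fuse(v, weights) for v in impostors])


if __name__ == "__main__":
    # Hypothetical two-matcher scores for two clients and two impostors.
    clients = [(0.9, 0.2), (0.4, 0.8)]
    impostors = [(0.5, 0.1), (0.3, 0.6)]
    print(fused_auc(clients, impostors, (1.0, 0.0)))
    print(fused_auc(clients, impostors, (0.5, 0.5)))
```

In this toy data the first matcher alone ranks three of the four pairs correctly, while the equal-weight fusion ranks all four correctly. A learning procedure would search over the weights to increase this value.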
Cost-minimising strategies for data labelling: optimal stopping and active learning
- 2007
Cited by 2 (0 self)
Supervised learning deals with the inference of a distribution over an output or label space Y conditioned on points in an observation space X, given a training dataset D of pairs in X × Y. However, in a lot of applications of interest, acquisition of large amounts of observations is easy, while the process of generating labels is time-consuming or costly. One way to deal with this problem is active learning, where points to be labelled are selected with the aim of creating a model with better performance than that of a model trained on an equal number of randomly sampled points. In this paper, we instead propose to deal with the labelling cost directly: the learning goal is defined as the minimisation of a cost which is a function of the expected model performance and the total cost of the labels used. This allows the development of general strategies and specific algorithms for (a) optimal stopping, where the expected cost dictates whether label acquisition should continue, and (b) empirical evaluation, where the cost is used as a performance metric for a given combination of inference, stopping and sampling methods. Though the main focus of the paper is optimal stopping, we also aim to provide the background for further developments and discussion in the related field of active learning.

