On The Use Of Support Vector Machines For Phonetic Classification
In ICASSP99, 1999. Cited by 58 (1 self)
Support Vector Machines (SVMs) represent a new approach to pattern classification which has recently attracted a great deal of interest in the machine learning community. Their appeal lies in their strong connection to the underlying statistical learning theory, in particular the theory of Structural Risk Minimization. SVMs have been shown to be particularly successful in fields such as image identification and face recognition; in many problems SVM classifiers have been shown to perform much better than other nonlinear classifiers such as artificial neural networks and k-nearest neighbors. This paper explores the issues involved in applying SVMs to phonetic classification as a first step to speech recognition. We present results on several standard vowel and phonetic classification tasks and show better performance than Gaussian mixture classifiers. We also present an analysis of the difficulties we foresee in applying SVMs to continuous speech recognition problems.
Large margin hidden markov models for speech recognition
2005. Cited by 12 (1 self)
In this work, motivated by large margin classifiers in machine learning, we propose a novel method to estimate continuous density hidden Markov models (CDHMMs) for speech recognition according to the principle of maximizing the minimum multi-class separation margin. The approach is named large margin HMM. First, we show that this type of large margin HMM estimation problem can be formulated as a constrained minimax optimization problem. Second, by imposing different constraints on the minimax problem, we propose three solutions to the large margin HMM estimation problem, namely the iterative localized optimization method, the constrained joint optimization method and the semidefinite programming (SDP) method. These new training methods are evaluated on the isolated E-set recognition task using the ISOLET database and on the TIDIGITS connected digit string recognition task. Experimental results clearly show that the large margin HMMs consistently outperform the conventional HMM training methods. It has been consistently observed that the large margin training method yields significant recognition error rate reduction even on top of some popular discriminative training methods.
Maximum margin training of generative kernels
2004. Cited by 10 (4 self)
Generative kernels, a generalised form of Fisher kernels, are a powerful form of kernel that allow the kernel parameters to be tuned to a specific task. The standard approach to training these kernels is to use maximum likelihood estimation. This paper describes a novel approach based on maximum-margin training of both the kernel parameters and a Support Vector Machine (SVM) classifier. It combines standard SVM training with a gradient-descent based kernel parameter optimisation scheme. This allows the kernel parameters to be explicitly trained for the data set and the SVM score-space. Initial results on an artificial task and the Deterding data show that such an approach can reduce classification error rates.
Speech recognition with support vector machines in a hybrid system
In Proc. EuroSpeech, 2005. Cited by 10 (6 self)
While the temporal dynamics of speech can be represented very efficiently by Hidden Markov Models (HMMs), the classification of speech into single speech units (phonemes) is usually done with Gaussian mixture models, which do not discriminate well. Here, we use Support Vector Machines (SVMs) for classification by integrating this method into an HMM-based speech recognition system. In this hybrid SVM/HMM system we translate the outputs of the SVM classifiers into conditional probabilities and use them as emission probabilities in an HMM-based decoder. SVMs are very appealing due to their association with statistical learning theory. They have already shown very good classification results in other fields of pattern recognition. We train and test our hybrid system on the DARPA Resource Management (RM1) corpus. Our results show better performance than an HMM-based decoder using Gaussian mixtures.
Support Vector Machines for Phoneme Classification
2001. Cited by 6 (0 self)
In this thesis, Support Vector Machines (SVMs) are applied to the problem of phoneme classification. Given a sequence of acoustic observations and 40 phoneme targets, the task is to classify each observation to one of these targets. Since this task involves multiple classes, one of the main hurdles is to extend the inherently binary SVMs to the multi-class case. To do this, several methods are proposed, and their generalisation abilities are measured. It is found that even though some generalisation is lost in the transition, this can still lead to effective classifiers. In addition, a refinement to the SVMs is made to derive estimated posterior probabilities from classifications. Since almost all speech recognition systems are based on statistical models, this is necessary if SVMs are to be used in a full speech recognition system. The best accuracy found was 71.4%, which is competitive with the best results found in the literature.
Kernel-Based Feature Extraction with a Speech Technology Application
Cited by 5 (3 self)
Kernel-based nonlinear feature extraction and classification algorithms are a popular new research direction in machine learning. This paper examines their applicability to the classification of phonemes in a phonological awareness drilling software package. We first give a concise overview of the nonlinear feature extraction methods such as kernel principal component analysis (KPCA), kernel independent component analysis (KICA), kernel linear discriminant analysis (KLDA) and kernel springy discriminant analysis (KSDA). The overview deals with all the methods in a unified framework, regardless of whether they are unsupervised or supervised. The effect of the transformations on a subsequent classification is tested in combination with learning algorithms such as Gaussian mixture modeling (GMM), artificial neural nets (ANN), projection pursuit learning (PPL), decision tree-based classification (C4.5) and support vector machines (SVM). We found in most cases that the transformations have a beneficial effect on the classification performance. Furthermore, the nonlinear supervised algorithms yielded the best results.
Using one-class SVMs and wavelets for audio surveillance systems
Submitted to IEEE Trans. on Information Forensics and Security. Cited by 4 (0 self)
This paper presents a procedure aimed at recognizing environmental sounds for surveillance and security applications. We propose to apply One-Class Support Vector Machines (1-SVMs) together with a sophisticated dissimilarity measure as a discriminative framework in order to address audio classification, and hence, sound recognition. We illustrate the performance of this method on an audio database, which consists of more than 1,000 sounds belonging to 9 classes. Additionally, the use of a set of state-of-the-art audio features is studied, and we introduce a set of novel features obtained by combining elementary features. Experimental results are presented and show the superiority of this novel sound recognition method. We show that the 1-SVM clearly outperforms the conventional HMM-based system and we emphasize that the largest improvement is achieved when the system is fed a set of features that includes wavelet coefficients.
Support vector machines vs multi-layer perceptrons in particle identification
In Proceedings of the European Symposium on Artificial Neural Networks '99 (D-Facto, 1999. Cited by 3 (1 self)
In this paper we evaluate the performance of Support Vector Machines (SVMs) and Multi-Layer Perceptrons (MLPs) on two different problems of Particle Identification in High Energy Physics experiments. The obtained results indicate that SVMs and MLPs tend to perform very similarly.
Combined Binary Classifiers With Applications To Speech Recognition
NEAREST-NEIGHBOR ECOC WITH APPLICATION TO ALL-PAIRS MULTICLASS SVM, 2002. Cited by 3 (2 self)
Many applications require classification of examples into one of several classes. A common way of designing such classifiers is to determine the class based on the outputs of several binary classifiers. We consider some of the most popular methods for combining the decisions of the binary classifiers, and improve existing bounds on the error rates of the combined classifier over the training set. We also describe a new method for combining binary classifiers. The method is based on stacking a neural network and, when used with support vector machines as the binary learners, substantially decreased the error rate in two vowel classification tasks.
Speech Recognition Using Acoustic Landmarks and Binary Phonetic Feature Classifiers
2003. Cited by 3 (0 self)
In spite of decades of research, Automatic Speech Recognition (ASR) is far from reaching the goal of performance close to Human Speech Recognition (HSR). One of the reasons for the unsatisfactory performance of state-of-the-art ASR systems, which are based largely on Hidden Markov Models (HMMs), is the inferior acoustic modeling of low-level or phonetic-level linguistic information in the speech signal. An acoustic-phonetic approach to ASR, on the other hand, explicitly targets linguistic information in the speech signal. But an acoustic-phonetic system that carries out large ASR tasks, for example connected word or continuous speech recognition, does not exist. We propose a probabilistic and statistical framework for connected word ASR based on the knowledge of acoustic phonetics. The proposed system is based on the idea of representing speech sounds by bundles of binary-valued articulatory phonetic features. The probabilistic framework requires only binary classifiers of phonetic features and the knowledge-based acoustic correlates of the features for the purpose of connected word speech recognition. We explore the use of Support Vector Machines (SVMs) for binary phonetic feature classification because SVMs offer favorable properties that are well suited to our recognition task. In the proposed method, probabilistic segmentation of speech is obtained using SVM-based classifiers of manner phonetic features. The linguistically motivated landmarks obtained in each segmentation are used for classification of source and place phonetic features. Probabilistic segmentation paths are constrained using Finite State Automata (FSA) for isolated or connected word recognition. The proposed method could overcome the disadvantages ...

