A kernel trick for sequences applied to text-independent speaker verification systems (2007)

by J. Mariéthoz, S. Bengio
Venue: Pattern Recognition

ALIZE/SpkDet: a state-of-the-art open source software for speaker recognition

by Jean-François Bonastre, Nicolas Scheffer, Driss Matrouf, Corinne Fredouille, Anthony Larcher, Alexandre Preti, Gilles Pouchoulin, Nicholas Evans, Benoît Fauve, John Mason
Abstract - Cited by 24 (6 self)
This paper presents the ALIZE/SpkDet open source software packages for text-independent speaker recognition. This software is based on the well-known UBM/GMM approach. It also includes the latest speaker recognition developments such as Latent Factor Analysis (LFA) and unsupervised adaptation. Discriminant classifiers such as SVM supervectors are also provided, linked with the Nuisance Attribute Projection (NAP). The software performance is demonstrated within the framework of the NIST’06 SRE evaluation campaign. Several other applications like speaker diarization, embedded speaker recognition, password dependent speaker recognition and pathological voice assessment are also presented.

Citation Context

...ori knowledge (world) and the current stat, copy it into client model. 2.2.5. Discriminant classifiers Discriminant classifiers like the SVM were proposed during the past years in several works as in [10, 11, 12, 13]. These classifiers are usually applied to GMM supervectors. A GMM supervector is composed of the means of a classical GMM system, as initially proposed by [13]. Libsvm library 3 is used for the basic...

State-of-the-Art Performance in Text-Independent Speaker Verification through Open-Source Software

by Benoît G. B. Fauve, Driss Matrouf, Nicolas Scheffer, Jean-François Bonastre, John S. D. Mason - IEEE Transactions on Audio, Speech and Language Processing, 2007
Abstract - Cited by 21 (3 self)
This paper illustrates an evolution in state-of-the-art speaker verification by highlighting the contribution from newly developed techniques. Starting from a baseline system based on Gaussian mixture models that reached state-of-the-art performances during the NIST’04 SRE, final systems with new intersession compensation techniques show a relative gain of around 50%. This work highlights that a key element in recent improvements is still the classical maximum a posteriori (MAP) adaptation, while the latest compensation methods have a crucial impact on overall performances. Nuisance attribute projection (NAP) and factor analysis (FA) are examined and shown to provide significant improvements. For FA, a new symmetrical scoring (SFA) approach is proposed. We also show further improvement with an original combination between a support vector machine and SFA. This work is undertaken through the open-source ALIZE toolkit. Index Terms: Channel compensation, factor analysis, nuisance attribute projection, speaker verification.

Citation Context

...AND LINEAR SVM A. General Framework The SVM approach offers an alternative classification strategy to the widely used GMM and has been investigated by many in the context of ASV; see for example [4], [16]–[18]. Here, the LibSVM library has been used to integrate SVM functionalities into ALIZE. This accommodates sequence kernels defined as where is a high-dimensional vector representation of sentence ...

Kernel Methods for Text-Independent Speaker Verification

by Chris Longworth, 2010
Abstract - Cited by 4 (0 self)
In recent years, systems based on support vector machines (SVMs) have become standard for speaker verification (SV) tasks. An important aspect of these systems is the dynamic kernel. These operate on sequence data and handle the dynamic nature of the speech. In this thesis a number of techniques are proposed for improving dynamic kernel-based SV systems. The first contribution of this thesis is the development of alternative forms of dynamic kernel. Several popular dynamic kernels proposed for SV are based on the Kullback-Leibler divergence between Gaussian mixture models. Since this has no closed-form solution, typically a matched-pair upper bound is used instead. This places significant restrictions on the forms of model structure that may be used. In this thesis, dynamic kernels are proposed based on alternative, variational approximations to the divergence. Unlike standard approaches, these allow the use of a more flexible modelling framework. Also, using a more accurate approximation may lead to performance gains. The second contribution of this thesis is to investigate the combination of multiple systems to improve SV performance. Typically, systems are combined by fusing the output scores. For SVM classifiers, an alternative strategy is to combine at the kernel level. Recently an efficient maximum-margin scheme for learning kernel weights has been developed. In this thesis several modifications are proposed to allow this scheme to be applied to SV tasks. System combination will only lead to gains when the kernels are complementary. In this thesis it is shown that many commonly used dynamic kernels can be placed into one of two broad classes, derivative and parametric kernels. The attributes of these classes are contrasted and the conditions under which the two forms of kernel are identical are described. By avoiding these conditions gains may be obtained by combining derivative and parametric kernels. 
The final contribution of this thesis is to investigate the combination of dynamic kernels with traditional static kernels for vector data. Here two general combination strategies are available: static kernel functions may be defined over the dynamic feature vectors. Alternatively, a static kernel may be applied at the observation level. In general, it is not possible to explicitly train a model in the feature space associated with a static kernel. However, it is shown in this thesis that this form of kernel can be computed by using a suitable metric with approximate component posteriors. Generalised versions of standard parametric and derivative kernels, that include an observation-level static kernel, are proposed based on this approach.

Citation Context

... compute. Many forms of dynamic kernel have been proposed for classification of speech sequences. An early example of this type of kernel is the generalised linear discriminant sequence (GLDS) kernel [24, 142]. Here, each observation vector is initially mapped into a high-dimensional space using a static kernel. A fixed dimensional set of features is then obtained by taking the mean of the expanded observa...

A generalised derivative kernel for speaker verification

by C. Longworth, M. J. F. Gales - In Proc. ICSLP, 2008
Abstract - Cited by 1 (1 self)
An important aspect of SVM-based speaker verification systems is the choice of dynamic kernel. For the GLDS kernel, a static kernel is used to map each observation into a higher order feature space. Features are then obtained by taking a simple average over all frames. Derivative kernels, such as the Fisher kernel, use a generative model as a principled way of extracting a fixed set of features from each utterance. However, the model and features are defined using the original observations. Here, a dynamic kernel is described that combines these two approaches. In general, it is not possible to explicitly train a model in the feature space associated with a static kernel. However, by using a suitable metric with approximate component posteriors, this form of dynamic kernel can be computed. This kernel generalises the GLDS and derivative kernel as special cases and is also closely related to parametric kernels such as the GMM-supervector kernel. Preliminary results using this kernel are presented on the 2002 NIST SRE dataset.

Citation Context

...ich the inner product, or static kernel, can be computed. One early form of dynamic kernel that was found to be effective for the SV task is the Generalised Linear Discriminant Sequence (GLDS) kernel [1, 8]. Under this kernel, a static kernel mapping is applied to each observation vector. A fixed dimensional set of features is then obtained by taking the sum of the expanded observations over all frames....
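
The GLDS construction described in these excerpts (expand each frame with a static kernel's feature map, pool over frames, then take an inner product) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the cited authors' implementation: the names `expand`, `utterance_vector`, and `glds_like_kernel` are hypothetical, the static kernel is assumed to be a degree-2 polynomial with an explicit monomial expansion, frames are pooled by their mean, and the correlation-matrix normalisation used in practical GLDS systems is omitted.

```python
from itertools import combinations_with_replacement
from math import prod


def expand(frame, degree=2):
    """Explicit polynomial feature map: all monomials of `frame` up to `degree`.

    Assumption: stands in for the "static kernel mapping" applied per frame.
    """
    feats = [1.0]  # constant (degree-0) term
    for d in range(1, degree + 1):
        for idx in combinations_with_replacement(range(len(frame)), d):
            feats.append(prod(frame[i] for i in idx))
    return feats


def utterance_vector(frames, degree=2):
    """Average the expanded frames into one fixed-size vector per utterance."""
    expanded = [expand(f, degree) for f in frames]
    n = len(expanded)
    return [sum(col) / n for col in zip(*expanded)]


def glds_like_kernel(frames_a, frames_b, degree=2):
    """Inner product of the two pooled vectors (no normalisation matrix)."""
    va = utterance_vector(frames_a, degree)
    vb = utterance_vector(frames_b, degree)
    return sum(x * y for x, y in zip(va, vb))


if __name__ == "__main__":
    utt_a = [[1.0, 2.0], [3.0, 4.0]]  # two 2-dimensional frames
    utt_b = [[0.5, 0.5]]              # one 2-dimensional frame
    print(glds_like_kernel(utt_a, utt_b))
```

Because each utterance is reduced to a single fixed-size vector before the inner product, the cost grows linearly with the number of frames, and utterances of different lengths can be compared directly.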

Verification

by Johnny Mariéthoz, Samy Bengio, Yves Grandvalet, 2008
Abstract not found

Citation Context

...n the literature use a procedure that converts the sequences into fixed size vectors that are processed by a linear SVM. Other sequence kernels allow embeddings in infinite-dimensional feature spaces [MB07]. However, compared to the mainstream approach, this type of kernel is computationally too demanding for long sequences. It will not be applied here, since the NIST database contains long sequences....
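
To make the cost argument concrete, one common form of sequence kernel averages a frame-level kernel over every pair of frames drawn from the two sequences. The sketch below assumes that form with an RBF frame-level kernel; the names `rbf` and `mean_sequence_kernel` and the `gamma` value are illustrative, and nothing in the excerpt confirms that this is exactly the kernel of [MB07].

```python
from math import exp


def rbf(x, y, gamma=0.5):
    """Frame-level RBF kernel; its feature space is infinite-dimensional."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return exp(-gamma * sq_dist)


def mean_sequence_kernel(seq_a, seq_b, gamma=0.5):
    """Average the frame-level kernel over all frame pairs.

    Assumed form: K(A, B) = (1 / (Ta * Tb)) * sum_i sum_j k(a_i, b_j).
    """
    total = sum(rbf(x, y, gamma) for x in seq_a for y in seq_b)
    return total / (len(seq_a) * len(seq_b))


if __name__ == "__main__":
    seq_a = [[0.0], [1.0]]  # two 1-dimensional frames
    seq_b = [[0.0]]         # one 1-dimensional frame
    print(mean_sequence_kernel(seq_a, seq_b))
```

The double loop evaluates the frame-level kernel once per frame pair, so the cost is proportional to the product of the two sequence lengths. This is consistent with the excerpt's remark that such kernels are too demanding for long sequences, whereas pooling each sequence into a fixed-size vector first costs only one pass per sequence.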

Chapter 1 Cognitive Radio Network for Smart Grid

by Raghuram Ranganathan, Robert Qiu, Zhen Hu, Shujie Hou, Zhe Chen, Marbin Pazos-Revilla, Nan Guo
Abstract
Cognitive radio and the smart grid are two areas that have recently received considerable research impetus. Cognitive radios are fully programmable wireless devices that can sense their environment and dynamically adapt their transmission waveform, channel access method, spectrum use, and networking protocols. It is widely anticipated that cognitive radio technology will become a general-purpose programmable radio that will serve as a universal platform for wireless system development, much like microprocessors have served a similar role for computation. The salient features of the cognitive radio, namely, frequency agility, transmission speed, and range, are ideal for application to the smart grid. In this regard, a cognitive radio network can serve as a robust and efficient communications infrastructure that can address both the current and future energy management needs of the smart grid. The cognitive radio network can be deployed as a large scale Wireless Regional Area Network (WRAN) in a smart grid, to utilize the unused TV bands recently approved for use by the Federal Communications Commission (FCC). In addition, a cognitive radio network testbed for the smart grid would serve as an ideal platform to not only address various issues related to the smart grid, such as security, information flow and power flow management, etc., but also reveal more practical problems for further research. In this chapter, the novel concept of incorporating a cognitive radio network as the communications backbone for the smart grid is outlined. A brief overview of the cognitive radio is provided, including the recently proposed IEEE 802.22 standard. In particular, an overview of Cognitive Radio Network testbed, existing and new hardware platforms for

Cognitive Radio Network for the Smart Grid: Experimental System Architecture, Control ... Microgrid Testbed

by Robert Caiming Qiu, Zhen Hu, Zhe Chen, Nan Guo, Raghuram Ranganathan, Shujie Hou, Gang Zheng - IEEE Transactions on Smart Grid
Abstract not found

Citation Context

...9. [76] G. Wu, E. Y. Chang, and N. Panda, “Formulating distance functions via the kernel trick,” in Proc. 11th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, Chicago, IL, Aug. 2005, pp. 703–709. [77] J. Mariéthoz and S. Bengio, “A kernel trick for sequences applied to text-independent speaker verification systems,” Pattern Recognition, vol. 40, no. 8, pp. 2315–2324, 2007. [78] J. Wang, J. Lee, and...

Cognitive Radio Network as Wireless Sensor Network (III): Passive Target Intrusion Detection and Experimental Demonstration

by Changchun Zhang, Zhen Hu, Terry N. Guo, R. C. Qiu, Kenneth Currie
Abstract
... for radio frequency (RF) passive target intrusion detection. Compared to a cheap WSN, the CRN based WSN is expected to deliver better results due to its strong communication functions and powerful computing ability. Issues addressed in this paper include experimental architecture, waveform design, and machine learning algorithm for classification. In particular, passive target intrusion is experimentally demonstrated using multiple WARP platforms that serve as the cognitive/sensor nodes. In contrast to traditional localization methods relying on radio propagation properties, the technique used in this research is based on machine learning with measured data, considering complicated multipath environment and high dimensional sensing data collected by the CRN based WSN. Preliminary experimental results are quite encouraging, suggesting that a large-scale CRN based WSN supported by machine learning techniques has promising potential for passive target intrusion detection in harsh RF environments.

Citation Context

...nvectors. PCA works well for the high-dimensional data with linear relationships, but always fails in a nonlinear scenario. PCA can be applied in the nonlinear situation by using a kernel [25], [26], [27], [28], called KPCA [20]. KPCA is therefore, a kernel-based machine learning algorithm. It uses the kernel function k (which is the same as SVM) to implicitly map the original data to a feature space ...

unknown title

by Makoto Yamada, Masashi Sugiyama, Tomoko Matsui
Abstract
In this paper, we propose a novel semi-supervised speaker identification method that can alleviate the influence of nonstationarity such as session dependent variation, the recording environment change, and physical condition/emotion. We assume that the utterance variation follows the covariate shift model, where only the utterance sample distribution changes in the training and test phases. Our method consists of weighted versions of kernel logistic regression and cross-validation and is theoretically shown to have the capability of alleviating the influence of covariate shift. We experimentally show through text-independent speaker identification simulations that the proposed method is promising in dealing with variations in session dependent utterance variation. Index Terms: Speaker identification, covariate shift, semi-supervised learning, kernel logistic regression, importance estimation.

Citation Context

...tracted a great deal of attention. Standard methods of text-independent speaker identification include the Gaussian mixture model (GMM) [1] or kernel methods such as the support vector machine (SVM) [2]. In these supervised learning methods, it is implicitly assumed that training and test data follow the same distribution. However, the training and test distributions are not necessarily the same in ...

Wireless Tomography in Noisy Environments using Machine Learning

by Cognitive Radio Institute
Abstract
This paper, one in a continuing series, describes a new initiative in wireless tomography. Our goal is to combine two technologies: wireless communication and radio frequency (RF) tomography, for the close-in remote sensing. The hybrid system including wireless communication devices for wireless tomography is proposed in this paper. Noise reduction, modified standard phase reconstruction, and imaging are exploited sequentially to perform wireless tomography in noisy environments. The performance given in this paper illustrates the significance and prospect of wireless tomography. The contributions of this paper are threefold: (1) the hybrid system provides a strong and flexible infrastructure for wireless tomography; (2) machine learning, especially non-linear dimensionality reduction, is explored to execute noise reduction and combat the non-linear noise effect; (3) modified standard phase reconstruction is well achieved using the de-noised amplitude-only total fields from the simple sensors and the received accurate full-data total fields from the advanced sensors. Experimental data provided by the Institute Fresnel in Marseille, France are used to demonstrate the concept of wireless tomography and validate the corresponding algorithms. Index Terms: wireless tomography, noise reduction, machine learning, phase reconstruction, imaging