Using Self-Organizing Maps and Learning Vector Quantization for Mixture Density Hidden Markov Models (1997)

by M Kurimo
Venue: Helsinki University of Technology
Results 1 - 10 of 12

Unlimited vocabulary speech recognition based on morphs discovered in an unsupervised manner

by Vesa Siivola, Teemu Hirsimäki, Mathias Creutz, Mikko Kurimo - in Proc. Eurospeech, 2003
Cited by 43 (20 self)
We study continuous speech recognition based on sub-word units found in an unsupervised fashion. For agglutinative languages like Finnish, traditional word-based n-gram language modeling does not work well because of the huge number of different word forms. We use a method based on the Minimum Description Length principle to split words statistically into sub-word units, allowing efficient language modeling and an unlimited vocabulary. The perplexity and speech recognition experiments on Finnish speech data show that the resulting model outperforms both word-based and syllable-based trigram models. Compared to the word trigram model, the out-of-vocabulary rate is reduced from 20% to 0% and the word error rate from 56% to 32%.
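The two metrics quoted in this abstract can be illustrated with a minimal sketch. It assumes the standard definitions (word error rate as word-level edit distance divided by reference length, out-of-vocabulary rate as the fraction of test tokens missing from the vocabulary); the function names are hypothetical and this is not the authors' evaluation code.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length.

    Assumes a non-empty reference transcript.
    """
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def oov_rate(tokens, vocabulary):
    """Fraction of tokens that do not appear in the vocabulary."""
    return sum(1 for t in tokens if t not in vocabulary) / len(tokens)
```

With a sub-word vocabulary that can compose any word form, every token is covered, which is how an out-of-vocabulary rate of 0% becomes possible.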

Inferring relevance from eye movements: Feature extraction

by Jarkko Salojärvi, Kai Puolamäki, Jaana Simola, Lauri Kovanen, Ilpo Kojo, Samuel Kaski - Helsinki University of Technology, 2005
Cited by 9 (3 self)
We organize a PASCAL EU Network of Excellence challenge for inferring relevance from eye movements, beginning 1 March 2005. The aim of this paper is to provide background material for the competitors: give references to related articles on eye movement modelling, describe the methods used for extracting the features used in the challenge, provide results of basic reference methods, and discuss open questions in the field.

Relevance feedback from eye movements for proactive information retrieval

by Jarkko Salojärvi, Kai Puolamäki, Samuel Kaski - Workshop on Processing Sensory Information for Proactive Systems (PSIPS 2004), 2004
Cited by 8 (6 self)
We study whether it is possible to infer from eye movements measured during reading what is relevant for the user in an information retrieval task. Inference is made using hidden Markov and discriminative hidden Markov models. The result of this feasibility study is that prediction of relevance is possible to a certain extent, and models benefit from taking into account the time series nature of the data.

Learning vector quantization algorithm for probabilistic models

by Jaakko Hollmén, Volker Tresp, Olli Simula - In Proceedings of EUSIPCO 2000, X European Signal Processing Conference, volume II, 2000
Cited by 4 (2 self)
In classification problems, it is preferred to attack the discrimination problem directly rather than indirectly by first estimating the class densities and then deriving the discrimination function from the generative models through Bayes’s rule. Sometimes, however, it is convenient to express the models as probabilistic models, since they are generative in nature and can handle the representation of high-dimensional data like time series. In this paper, we derive a discriminative training procedure based on Learning Vector Quantization (LVQ) where the codebook is expressed in terms of probabilistic models. The likelihood-based distance measure is justified using the Kullback-Leibler distance. In updating the winner unit, a gradient learning step is taken with regard to the parameters of the probabilistic model. The method essentially departs from a prototypical representation and incorporates learning in the parameter space of generative models. As an illustration, we present experiments in the fraud detection domain, where models of calling behavior are used to classify mobile phone subscribers into normal and fraudulent users. This is an extension of our earlier work in clustering probabilistic models with the Self-Organizing Map (SOM) algorithm to the classification domain.
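The idea of an LVQ codebook made of probabilistic models can be sketched as follows. This is an illustrative simplification, not the procedure derived in the paper: each codebook unit is assumed to be a one-dimensional Gaussian with a fixed variance, the winner is the unit with the highest log-likelihood, and the sign of the gradient step follows the usual LVQ1 convention (toward the sample when the winner's class is correct, away otherwise). The dictionary layout and function names are hypothetical.

```python
import math


def log_likelihood(x, unit):
    """Log density of scalar x under the unit's Gaussian model."""
    mu, var = unit["mean"], unit["var"]
    return -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)


def lvq_step(x, label, units, lr=0.1):
    """One LVQ1-style update with a likelihood-based winner search."""
    # Winner: the unit whose model assigns x the highest likelihood.
    winner = max(units, key=lambda u: log_likelihood(x, u))
    # Gradient of the log-likelihood with respect to the mean parameter.
    grad = (x - winner["mean"]) / winner["var"]
    # Raise the likelihood if the class matches, lower it otherwise.
    sign = 1.0 if winner["label"] == label else -1.0
    winner["mean"] += sign * lr * grad
    return winner


def classify(x, units):
    """Label of the unit with the highest likelihood for x."""
    return max(units, key=lambda u: log_likelihood(x, u))["label"]
```

For models of time series, such as the calling-behavior models mentioned in the abstract, the same structure would apply with the Gaussian replaced by the sequence model's likelihood and the gradient taken over that model's parameters.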

Competing hidden markov models on the self-organizing map

by Panu Somervuo - Piscataway, NJ. Helsinki Univ of Technology, IEEE
Cited by 2 (0 self)
This paper presents an unsupervised segmentation method for feature sequences based on competitive-learning hidden Markov models. Models associated with the nodes of the Self-Organizing Map learn to become selective to the segments of temporal input sequences. Input sequences may have arbitrary lengths. Segment models then emerge on the map through an unsupervised learning process. The method was tested in speech recognition, where the performance of the emergent segment models was as good as the performance of the traditionally used linguistic speech segment models. The benefits of the proposed method are the use of unsupervised learning for obtaining the state models for temporal data and the convenient visualization of the state space on the two-dimensional map.

Using SOM in Data Mining

by Juha Vesanto, 2000
Cited by 2 (0 self)
Abstract not found

Organizers

by Kai Puolamäki, Samuel Kaski, Samy Bengio (Idiap), Helene Hembrooke (Cornell), Thorsten Joachims (Cornell), 2005
Cited by 2 (0 self)
ISBN 951-22-8219-4 (printed version) ISBN 951-22-8220-8 (electronic version)

Variations on Statistical Phoneme Recognition -- A Hybrid Approach

by Rudolph van der Merwe, 1997
Automatic speech recognition (ASR) is rapidly becoming a mature technology leading to an increasing number of commercial applications. Although great advances have been made in the state of the art of speech recognition over the last 10 years, the holy grail of ASR, namely large-vocabulary, speaker-independent continuous speech recognition with an error rate of less than 1%, still eludes researchers. At the heart of most modern speech recognition systems lies an HMM-based phoneme recognition engine which segments and classifies the incoming acoustic signal into a sequence of phonemes. These phonemes are concatenated to form word models which are processed further to arrive at a transcription of the linguistic message encoded in the speech signal. The final recognition accuracy of the speech recognition system can thus be directly linked to the recognition accuracy of the underlying phoneme recogniser. Two types of features extracted from the speech signal are commonly used for phoneme recognition. These are the supra-segmental knowledge-based features derived from phonetic and phonologic theory, and the widely used frame-based cepstral features. Until now, these features have been used separately by researchers, resulting in the loss of valuable discriminative information.

Indexing Audio Documents by using Latent Semantic Analysis and SOM

by Mikko Kurimo (Idiap) - Kohonen Maps, 1999
This paper describes an important application for state-of-the-art automatic speech recognition, natural language processing and information retrieval systems. Methods for enhancing the indexing of spoken documents by using latent semantic analysis and self-organizing maps are presented, motivated and tested. The idea is to extract extra information from the structure of the document collection and use it for more accurate indexing by generating new index terms and stochastic index weights. Indexing methods are evaluated for two broadcast news databases (one French and one English) using the average document perplexity defined in this paper and test queries analyzed by human experts.

Acoustic Modeling

by The General Goal
... segment models on the SOM. Each map node is associated with an HMM (with three states in this case) instead of a traditionally used single feature vector. The thick line represents the Viterbi segmentation of one input sequence. This corresponds to the best matching unit (BMU) search. The models of the BMUs and neighboring units are then updated by the corresponding segments. The block diagram of a speech recognition system is shown in Figure 2. The recognition is based on connecting the hidden Markov models (HMMs) of the phonemes to decode the phoneme sequences of the spoken utterances. The output density function of each state in each model is a mixture of multivariate Gaussian densities. We have used the following scheme for the training of the models [1]. SOM is used first for initializing the phoneme-wise codebooks. Each model vector then becomes the mean vector of a Gaussian kernel. After initialization, the training is continued by the segmental-SOM or K-means algorithm. Segmental-LVQ
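The initialization described above (SOM model vectors reused as the means of Gaussian kernels in a mixture density) can be sketched roughly as below. Several things here are assumptions made for illustration and are not taken from the source: a one-dimensional chain of map nodes, linearly decaying learning rate and neighborhood radius, diagonal covariances, and caller-supplied variances and mixture weights. All function names are hypothetical.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def train_som(data, n_nodes, epochs=20, lr0=0.5, seed=0):
    """Train a 1-D SOM (a chain of nodes) and return its model vectors."""
    rng = random.Random(seed)
    # Start each model vector at a randomly chosen data point.
    nodes = [list(rng.choice(data)) for _ in range(n_nodes)]
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):
            frac = 1.0 - step / total_steps        # decays from 1 toward 0
            lr = lr0 * frac
            radius = max(n_nodes / 2.0 * frac, 0.5)
            # Best matching unit (BMU): the node closest to the sample.
            bmu = min(range(n_nodes), key=lambda i: sq_dist(nodes[i], x))
            for i in range(n_nodes):
                # Gaussian neighborhood around the BMU on the chain.
                h = math.exp(-((i - bmu) ** 2) / (2.0 * radius ** 2))
                nodes[i] = [w + lr * h * (xi - w)
                            for w, xi in zip(nodes[i], x)]
            step += 1
    return nodes


def log_gaussian_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def mixture_log_density(x, means, variances, weights):
    """Log of a weighted sum of Gaussian kernels (log-sum-exp for stability)."""
    logs = [math.log(w) + log_gaussian_diag(x, m, v)
            for m, v, w in zip(means, variances, weights)]
    top = max(logs)
    return top + math.log(sum(math.exp(l - top) for l in logs))
```

A caller would pass the vectors returned by `train_som` as `means`, for example with unit variances and equal weights as a starting point, before continuing training with another algorithm such as K-means.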