Probabilistic Finite-State Machines - Part I
Cited by 27 (1 self)
Probabilistic finite-state machines are used today in a variety of areas in pattern recognition, or in fields to which pattern recognition is linked: computational linguistics, machine learning, time series analysis, circuit testing, computational biology, speech recognition and machine translation are some of them. In part I of this paper we survey these generative objects and study their definitions and properties. In part II, we will study the relation of probabilistic finite-state automata with other well-known devices that generate strings, such as hidden Markov models and n-grams, and provide theorems, algorithms and properties that represent a current state of the art of these objects.
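As an illustration of the kind of generative object surveyed here, the following is a minimal sketch of a deterministic probabilistic finite-state automaton that assigns a probability to a string. The two-state automaton, its probabilities, and the names `TRANSITIONS`, `FINAL` and `string_probability` are hypothetical and chosen only for this example; they are not taken from the paper.

```python
# Hypothetical toy automaton over the alphabet {"a", "b"} (illustration only).
# TRANSITIONS[(state, symbol)] = (next_state, transition_probability)
# FINAL[state] = probability of stopping in that state
# In each state, the outgoing probabilities plus the stopping probability sum to 1.
TRANSITIONS = {
    (0, "a"): (1, 0.6),
    (0, "b"): (0, 0.2),
    (1, "a"): (1, 0.3),
    (1, "b"): (0, 0.2),
}
FINAL = {0: 0.2, 1: 0.5}


def string_probability(s, transitions=TRANSITIONS, final=FINAL, start=0):
    """Return the probability that the automaton generates exactly the string s."""
    state, prob = start, 1.0
    for symbol in s:
        if (state, symbol) not in transitions:
            return 0.0  # no transition for this symbol: the string cannot be generated
        state, p = transitions[(state, symbol)]
        prob *= p
    # Multiply by the probability of stopping in the state that was reached.
    return prob * final[state]
```

Because the automaton is deterministic, each string follows at most one path, so its probability is the product of the transition probabilities along that path times the stopping probability of the last state.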
Probabilistic DFA Inference using Kullback-Leibler Divergence and Minimality
 In Seventeenth International Conference on Machine Learning
, 2000
Cited by 14 (2 self)
Probabilistic DFA inference is the problem of inducing a stochastic regular grammar from a positive sample of an unknown language. The ALERGIA algorithm is one of the most successful approaches to this problem. In the present work we review this algorithm and explain why its generalization criterion, a state merging operation, is purely local. This characteristic leads to the conclusion that there is no explicit way to bound the divergence between the distribution defined by the solution and the training set distribution (that is, to control globally the generalization from the training sample). In this paper we present an alternative approach, the MDI algorithm, in which the solution is a probabilistic automaton that trades off minimal divergence from the training sample and minimal size. An efficient computation of the Kullback-Leibler divergence between two probabilistic DFAs is described, from which the new learning criterion is derived. Empirical results in the d...
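To make the notion of divergence from the training sample concrete, here is a minimal sketch that evaluates the Kullback-Leibler divergence between the empirical distribution of a sample and the distribution defined by a probabilistic DFA, directly from the definition. This is not the efficient computation between two automata that the abstract refers to, and it is not the MDI algorithm; the toy automaton and all names (`TRANSITIONS`, `FINAL`, `string_probability`, `kl_sample_vs_pdfa`) are assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical toy probabilistic DFA over {"a", "b"} (illustration only).
# TRANSITIONS[(state, symbol)] = (next_state, probability); FINAL[state] = stopping probability.
TRANSITIONS = {
    (0, "a"): (1, 0.6),
    (0, "b"): (0, 0.2),
    (1, "a"): (1, 0.3),
    (1, "b"): (0, 0.2),
}
FINAL = {0: 0.2, 1: 0.5}


def string_probability(s, start=0):
    """Probability that the toy automaton generates exactly the string s."""
    state, prob = start, 1.0
    for symbol in s:
        if (state, symbol) not in TRANSITIONS:
            return 0.0
        state, p = TRANSITIONS[(state, symbol)]
        prob *= p
    return prob * FINAL[state]


def kl_sample_vs_pdfa(sample, model_prob=string_probability):
    """D(p_hat || q), where p_hat is the empirical distribution of the sample
    and q is the distribution defined by the automaton."""
    counts = Counter(sample)
    n = len(sample)
    divergence = 0.0
    for s, count in counts.items():
        p_hat = count / n
        q = model_prob(s)
        if q == 0.0:
            return math.inf  # the model gives zero mass to an observed string
        divergence += p_hat * math.log(p_hat / q)
    return divergence
```

For example, `kl_sample_vs_pdfa(["a", "a", "", "ab"])` measures how far this toy automaton is from that four-string sample; a sample containing a string the automaton cannot generate yields an infinite divergence.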
Alternative Approaches for Generating Bodies of Grammar Rules
, 2004
Cited by 6 (5 self)
We compare two approaches for describing and generating bodies of rules used for natural language parsing. In today's parsers rule bodies do not exist a priori but are generated on the fly, usually with methods based on n-grams, which are one particular way of inducing probabilistic regular languages. We compare two approaches for inducing such languages. One is based on n-grams, the other on minimization of the Kullback-Leibler divergence. The inferred regular languages are used for generating bodies of rules inside a parsing procedure. We compare the two approaches along two dimensions: the quality of the probabilistic regular language they produce, and the performance of the parser they were used to build. The second approach outperforms the first one along both dimensions.
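The n-gram approach mentioned above can be sketched with an unsmoothed bigram model estimated by maximum likelihood over sequences of symbols. This is only an illustrative assumption about what such a model looks like, not the authors' implementation; the function names, the boundary markers, and the part-of-speech tags used in the usage note are hypothetical.

```python
from collections import Counter, defaultdict

START, END = "<s>", "</s>"  # hypothetical boundary markers


def train_bigram(sequences):
    """Maximum-likelihood bigram model: model[prev][cur] = P(cur | prev)."""
    counts = defaultdict(Counter)
    for seq in sequences:
        padded = [START, *seq, END]
        for prev, cur in zip(padded, padded[1:]):
            counts[prev][cur] += 1
    return {
        prev: {cur: c / sum(ctr.values()) for cur, c in ctr.items()}
        for prev, ctr in counts.items()
    }


def sequence_probability(model, seq):
    """Probability of a whole sequence, including the end marker.
    Unseen bigrams get probability 0 because no smoothing is applied."""
    padded = [START, *seq, END]
    prob = 1.0
    for prev, cur in zip(padded, padded[1:]):
        prob *= model.get(prev, {}).get(cur, 0.0)
    return prob
```

Trained on the made-up rule bodies `[["DT", "NN"], ["DT", "JJ", "NN"], ["NN"]]`, the model assigns nonzero probability to each training sequence and zero to any sequence containing an unseen bigram such as `JJ JJ`.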
A Bernoulli mixture model for word categorisation
, 2001
Cited by 5 (2 self)
The problem of word categorisation is formulated as one of unsupervised mixture modelling where Bernoulli distributions capture contextual information.
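A minimal sketch of the core computation in a Bernoulli mixture model follows: given a binary feature vector, it computes the posterior probability of each mixture component (the E-step of EM). How words and contexts are encoded as binary vectors is not stated in the abstract, so the encoding, the parameter values in the usage note, and the function names are assumptions.

```python
import math


def bernoulli_log_likelihood(x, theta):
    """log P(x | theta) for a binary vector x under independent Bernoulli
    parameters theta, each assumed to lie strictly between 0 and 1."""
    return sum(
        math.log(t) if xi else math.log(1.0 - t)
        for xi, t in zip(x, theta)
    )


def responsibilities(x, weights, thetas):
    """Posterior probability of each mixture component given x."""
    log_joint = [
        math.log(w) + bernoulli_log_likelihood(x, theta)
        for w, theta in zip(weights, thetas)
    ]
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(log_joint)
    unnormalised = [math.exp(lj - m) for lj in log_joint]
    z = sum(unnormalised)
    return [u / z for u in unnormalised]
```

With two equally weighted components whose parameters are `[0.9, 0.1]` and `[0.1, 0.9]`, the vector `[1, 0]` is attributed almost entirely to the first component.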
Ten Open Problems in Grammatical Inference
, 2006
Cited by 3 (0 self)
We propose 10 different open problems in the field of grammatical inference. In all cases, problems are theoretically oriented but correspond to practical questions. They cover the areas of polynomial learning models, learning from ordered alphabets, learning deterministic POMDPs, learning negotiation processes, and learning from context-free background knowledge.
Learning Hidden Markov Models to Fit Long-Term Dependencies
, 2005
Cited by 2 (2 self)
this report a novel approach to the induction of the structure of Hidden Markov Models (HMMs). The notion of partially observable Markov models (POMMs) is introduced. POMMs form a particular case of HMMs where any state emits a single letter with probability one, but several states can emit the same letter. It is shown that any HMM can be represented by an equivalent POMM. The proposed induction algorithm aims at finding a POMM fitting the dynamics of the target machine, that is to best approximate the stationary distribution and the mean first passage times observed in the sample. The induction relies on nonlinear optimization and iterative state splitting from an initial order one Markov chain. Experimental results illustrate the advantages of the proposed approach as compared to Baum-Welch HMM estimation or back-off smoothed N-grams equivalent to variable order Markov chains.
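One of the quantities the induction algorithm tries to match, the stationary distribution, can be illustrated on a plain Markov chain. The sketch below uses power iteration and assumes an ergodic chain given as a row-stochastic matrix; it is not the induction algorithm of the report, and the function name and the example matrix in the usage note are hypothetical.

```python
def stationary_distribution(P, iterations=200):
    """Approximate the stationary distribution of a Markov chain by power iteration.

    P is a row-stochastic transition matrix (list of lists), assumed ergodic
    so that the iteration converges to a unique distribution.
    """
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iterations):
        # One step of the chain: pi_new[j] = sum_i pi[i] * P[i][j]
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi
```

For the two-state chain `[[0.9, 0.1], [0.5, 0.5]]`, the balance condition 0.1 * pi[0] = 0.5 * pi[1] gives the stationary distribution (5/6, 1/6), which the iteration approaches.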
Links between probabilistic automata and hidden Markov models: probability distributions, learning models and induction algorithms
, 2004
P. Dupont, F. Denis, Y. Esposito
Efficient Pruning of Probabilistic Automata
Franck Thollard and Baptiste Jeudy
Applications of probabilistic grammatical inference are limited by time and space constraints. In statistical language modeling, for example, large corpora are now available and lead to managing automata with millions of states. We propose in this article a method for pruning automata (when restricted to tree-based structures) that is not only efficient (subquadratic) but also dramatically reduces the size of the automaton with a small impact on the underlying distribution. Results are evaluated on a language modeling task.