Tensor decompositions for learning latent variable models
2014
Cited by 80 (7 self)
This work considers a computationally and statistically efficient parameter estimation method for a wide class of latent variable models (including Gaussian mixture models, hidden Markov models, and latent Dirichlet allocation) which exploits a certain tensor structure in their low-order observable moments (typically of second and third order). Specifically, parameter estimation is reduced to the problem of extracting a certain (orthogonal) decomposition of a symmetric tensor derived from the moments; this decomposition can be viewed as a natural generalization of the singular value decomposition for matrices. Although tensor decompositions are generally intractable to compute, the decomposition of these specially structured tensors can be efficiently obtained by a variety of approaches, including power iterations and maximization approaches (similar to the case of matrices). A detailed analysis of a robust tensor power method is provided, establishing an analogue of Wedin's perturbation theorem for the singular vectors of matrices. This implies a robust and computationally tractable estimation approach for several popular latent variable models.
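The power-iteration idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's robust algorithm: it assumes a noise-free symmetric third-order tensor that is exactly a sum of positively weighted rank-one terms over orthonormal vectors, and all function names (`rank_one_sum`, `contract`, `power_iteration`, `decompose`) and the toy inputs are hypothetical.

```python
import math
import random


def rank_one_sum(weights, vectors):
    """Build T = sum_i w_i * (v_i outer v_i outer v_i) as a nested d x d x d list."""
    d = len(vectors[0])
    T = [[[0.0] * d for _ in range(d)] for _ in range(d)]
    for w, v in zip(weights, vectors):
        for a in range(d):
            for b in range(d):
                for c in range(d):
                    T[a][b][c] += w * v[a] * v[b] * v[c]
    return T


def contract(T, u):
    """Return the vector T(I, u, u): entry a is sum over b, c of T[a][b][c] * u[b] * u[c]."""
    d = len(u)
    return [
        sum(T[a][b][c] * u[b] * u[c] for b in range(d) for c in range(d))
        for a in range(d)
    ]


def normalize(u):
    norm = math.sqrt(sum(x * x for x in u))
    return [x / norm for x in u]


def power_iteration(T, rng, iters=50):
    """Repeat u <- T(I, u, u) / ||T(I, u, u)|| from a random unit start vector."""
    u = normalize([rng.gauss(0.0, 1.0) for _ in range(len(T))])
    for _ in range(iters):
        u = normalize(contract(T, u))
    eigenvalue = sum(x * y for x, y in zip(u, contract(T, u)))  # T(u, u, u)
    return eigenvalue, u


def decompose(T, k, restarts=5, seed=0):
    """Extract k (eigenvalue, eigenvector) pairs by power iteration plus deflation."""
    rng = random.Random(seed)
    d = len(T)
    T = [[row[:] for row in mat] for mat in T]  # work on a copy
    pairs = []
    for _ in range(k):
        # Keep the restart with the largest eigenvalue estimate.
        lam, v = max(
            (power_iteration(T, rng) for _ in range(restarts)),
            key=lambda pair: pair[0],
        )
        pairs.append((lam, v))
        # Deflate: subtract the recovered rank-one component.
        for a in range(d):
            for b in range(d):
                for c in range(d):
                    T[a][b][c] -= lam * v[a] * v[b] * v[c]
    return pairs


# Toy input (made up for illustration): three orthonormal vectors in R^3.
true_vectors = [[1.0, 0.0, 0.0], [0.0, 0.6, 0.8], [0.0, -0.8, 0.6]]
true_weights = [3.0, 2.0, 1.0]
recovered = decompose(rank_one_sum(true_weights, true_vectors), k=3)
```

On this toy input each recovered pair should match one of the original weight and vector pairs. With estimated moments the tensor is only approximately of this form, which is the setting the abstract's perturbation analysis addresses.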
Experiments with Spectral Learning of Latent-Variable PCFGs
Cited by 20 (8 self)
Latent-variable PCFGs (L-PCFGs) are a highly successful model for natural language parsing. Recent work (Cohen et al., 2012) has introduced a spectral algorithm for parameter estimation of L-PCFGs, which, unlike the EM algorithm, is guaranteed to give consistent parameter estimates (it has PAC-style guarantees of sample complexity). This paper describes experiments using the spectral algorithm. We show that the algorithm provides models with the same accuracy as EM, but is an order of magnitude more efficient. We describe a number of key steps used to obtain this level of performance; these should be relevant to other work on the application of spectral learning algorithms. We view our results as strong empirical evidence for the viability of spectral methods as an alternative to EM.
Transition-based dependency parsing with selectional branching
In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, 2013
Cited by 12 (1 self)
We present a novel approach, called selectional branching, which uses confidence estimates to decide when to employ a beam, providing the accuracy of beam search at speeds close to a greedy transition-based dependency parsing approach. Selectional branching is guaranteed to perform fewer transitions than beam search yet performs as accurately. We also present a new transition-based dependency parsing algorithm that gives a complexity of O(n) for projective parsing and an expected linear-time speed for non-projective parsing. With the standard setup, our parser shows an unlabeled attachment score of 92.96% and a parsing speed of 9 milliseconds per sentence, which is faster and more accurate than the current state-of-the-art transition-based parser that uses beam search.
Methods of Moments for Learning Stochastic Languages: Unified Presentation and Empirical Comparison
Cited by 4 (1 self)
Probabilistic latent-variable models are a powerful tool for modelling structured data. However, traditional expectation-maximization methods of learning such models are both computationally expensive and prone to local minima. In contrast to these traditional methods, recently developed learning algorithms based upon the method of moments are both computationally efficient and provide strong statistical guarantees. In this work we provide a unified presentation and empirical comparison of three general moment-based methods in the context of modelling stochastic languages. By rephrasing these methods upon a common theoretical ground, introducing novel theoretical results where necessary, we provide a clear comparison, making explicit the statistical assumptions upon which each method relies. With this theoretical grounding, we then provide an in-depth empirical analysis of the methods on both real and synthetic data with the goal of elucidating performance trends and highlighting important implementation details.
Spectral learning of latent-variable PCFGs: Algorithms and sample complexity
Journal of Machine Learning Research, 2014
Cited by 4 (1 self)
We introduce a spectral learning algorithm for latent-variable PCFGs
A Provably Correct Learning Algorithm for Latent-Variable PCFGs
Cited by 3 (2 self)
We introduce a provably correct learning algorithm for latent-variable PCFGs. The algorithm relies on two steps: first, the use of a matrix-decomposition algorithm applied to a co-occurrence matrix estimated from the parse trees in a training sample; second, the use of EM applied to a convex objective derived from the training samples in combination with the output from the matrix decomposition. Experiments on parsing and a language modeling problem show that the algorithm is efficient and effective in practice.
Spectral Approaches to Learning Predictive Representations
2011
Cited by 1 (1 self)
A central problem in artificial intelligence is to choose actions to maximize reward in a partially observable, uncertain environment. To do so, we must obtain an accurate environment model, and then plan to maximize reward. However, for complex domains, specifying a model by hand can be a time-consuming process. This motivates an alternative approach: learning a model directly from observations. Unfortunately, learning algorithms often recover a model that is too inaccurate to support planning or too large and complex for planning to succeed; or, they require excessive prior domain knowledge or fail to provide guarantees such as statistical consistency. To address this gap, we propose spectral subspace identification algorithms which provably learn compact, accurate, predictive models of partially observable dynamical systems directly from sequences of action-observation pairs. Our research agenda includes several variations of this general approach: batch algorithms and online algorithms, kernel-based algorithms for learning models in high- and infinite-dimensional feature spaces, and manifold-based identification algorithms. All of these approaches share a common framework: they are statistically consistent, computationally efficient, and easy to implement using established matrix-algebra techniques. Additionally, we show that our framework generalizes a variety of successful spectral
Spectral Machine Learning for Predicting Power Wheelchair Exercise Compliance
Pressure ulcers are a common and devastating condition faced by users of power wheelchairs. However, proper use of power wheelchair tilt and recline functions can alleviate pressure and reduce the risk of ulcer occurrence. In this work, we show that, using data from a sensor-instrumented power wheelchair, we are able to predict with an average accuracy of 92% whether a subject will successfully complete a repositioning exercise when prompted. We present two models of compliance prediction. The first, a spectral Hidden Markov Model, uses fast, optimal optimization techniques to train a sequential classifier. The second, a decision tree using information gain, is computationally efficient and produces an output that is easy for clinicians and wheelchair users to understand. These prediction algorithms will be a key component in an intelligent reminding system that will prompt users to complete a repositioning exercise only in contexts in which the user is most likely to comply.
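The information-gain criterion that such a decision tree uses to choose splits can be sketched as follows. This is a generic illustration, not the authors' implementation; the function names and the tiny feature and label lists are made up, and entropy is measured in bits.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy after splitting on the feature."""
    groups = defaultdict(list)
    for value, label in zip(feature_values, labels):
        groups[value].append(label)
    total = len(labels)
    remainder = sum(len(group) / total * entropy(group) for group in groups.values())
    return entropy(labels) - remainder


# Made-up toy data: 1 = exercise completed, 0 = not completed.
labels = [1, 1, 0, 0]
informative = ["a", "a", "b", "b"]    # separates the two classes perfectly
uninformative = ["a", "b", "a", "b"]  # tells us nothing about the class

gain_informative = information_gain(informative, labels)
gain_uninformative = information_gain(uninformative, labels)
```

A tree learner of this kind would split on the feature with the larger gain, here the hypothetical `informative` feature.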
Compressed Predictive State Representation: An Efficient Moment-Method for Sequence Prediction and Sequential Decision-Making
2014
Dedication: This thesis is dedicated to my sister Julianna.
Acknowledgements: I would like to thank everyone who helped and encouraged me, who constructively challenged my ideas or even just took the time to listen to them. I am deeply grateful to my supervisor Joelle Pineau for her continual, constructive guidance, to Doina Precup for providing invaluable advice throughout my BSc and MSc, to Mahdi Milani Fard for helping me to get the theory of CPSRs off the ground, and to Borja Balle for showing me a whole new perspective on my own work. Special thanks to lab members Clement Gehring, Ouais Alsharif, and Yuri Grinberg for engaging with me in countless stimulating discussions.
The construction of accurate predictive models over sequence data is of
Tutor Dialogue Planning with Contextual Information and Discourse Structure
In this paper, we present two techniques to improve the effectiveness of a conversational tutoring system when interacting with groups of users. First, we propose the use of linguistic features derived from discourse parsing to better understand the structure of the ongoing dialogue between the students and the tutoring system. Discourse parsing gives us the relational structure of a dialogue, and can be used to improve the tutoring system's understanding of conversation structure. Second, we discuss dialogue planning using contextual information stored in a Contextual Knowledge Base (CKB). A conversational agent with access to information about linguistic and environmental context will be able to respond to changing conditions in ways that a simple question-answering agent cannot. We present previous empirical results for these two tasks and discuss how they can be incorporated into an Intelligent Tutoring System.