Results

### Estimating Mixture Models via Mixtures of Polynomials

Abstract
Mixture modeling is a general technique for making any simple model more expressive through weighted combination. This generality and simplicity in part explains the success of the Expectation Maximization (EM) algorithm, in which updates are easy to derive for a wide class of mixture models. However, the likelihood of a mixture model is non-convex, so EM has no known global convergence guarantees. Recently, method-of-moments approaches have offered global guarantees for some mixture models, but they do not extend easily to the range of mixture models that exist. In this work, we present Polymom, a unifying framework based on the method of moments in which estimation procedures are easily derivable, just as in EM. Polymom is applicable when the moments of a single mixture component are polynomials of the parameters. Our key observation is that the moments of the mixture model are a mixture of these polynomials, which allows us to cast estimation as a Generalized Moment Problem. We solve its relaxations using semidefinite optimization, and then extract parameters using ideas from computer algebra. This framework allows us to draw insights and apply tools from convex optimization, computer algebra and the theory of moments to study problems in statistical estimation. Simulations show good empirical performance on several models.
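The core idea — that mixture moments are polynomial equations in the component parameters, from which parameters can be extracted algebraically — can be illustrated with a toy example much simpler than Polymom's semidefinite relaxations. The sketch below (hypothetical, not the paper's algorithm) estimates an equal-weight mixture of two point masses `a` and `b` from empirical moments, recovering the parameters as roots of a quadratic:

```python
import math
import random

def two_atom_mom(samples):
    """Method of moments for an equal-weight mixture of two atoms a, b.

    The component moments are polynomials in the parameters:
        E[x]   = (a + b) / 2
        E[x^2] = (a^2 + b^2) / 2
    so a + b = 2*m1 and a*b = 2*m1^2 - m2, making a and b the roots of
        x^2 - (a + b)*x + a*b = 0.
    """
    n = len(samples)
    m1 = sum(samples) / n
    m2 = sum(x * x for x in samples) / n
    s = 2.0 * m1              # a + b
    p = 2.0 * m1 * m1 - m2    # a * b
    disc = max(s * s - 4.0 * p, 0.0)  # clip tiny negative noise
    r = math.sqrt(disc)
    return (s - r) / 2.0, (s + r) / 2.0

random.seed(0)
# Sample from a 50/50 mixture of point masses at -1 and 3.
data = [(-1.0 if random.random() < 0.5 else 3.0) for _ in range(10000)]
a, b = two_atom_mom(data)
```

Unlike EM, there is no iteration and no local optima here: the moment equations are solved in closed form. Polymom generalizes this pattern to models where no closed form exists, replacing the quadratic with a semidefinite relaxation.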

### London WC2R 2LS

Abstract
Standard models of language learning are concerned with weak learning: the learner, receiving as input only information about the strings in the language, must learn to generalise and to generate the correct, potentially infinite, set of strings generated by some target grammar. Here we define the corresponding notion of strong learning: the learner, again only receiving strings as input, must learn a grammar that generates the correct set of structures or parse trees. We formalise this using a modification of Gold’s identification in the limit model, requiring convergence to a grammar that is isomorphic to the target grammar. We take as our starting point a simple learning algorithm for substitutable context-free languages, based on principles of distributional learning, and modify it so that it will converge to a canonical grammar for each language. We prove a corresponding strong learning result for a subclass of context-free grammars.
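The distributional-learning principle behind the algorithm — that two substrings appearing in a shared context should be generated by the same grammatical category — can be sketched concretely. The toy function below (a hypothetical illustration of weak substitutability, not the paper's full grammar learner) groups the substrings of a sample into classes whenever they share a context `(l, r)`:

```python
from collections import defaultdict

def substitutable_classes(sample):
    """Partition the substrings of a string sample into classes, merging
    any two substrings that occur in at least one common context (l, r).
    This is the weak-substitutability relation of distributional learning.
    """
    contexts = defaultdict(set)  # substring -> set of (left, right) contexts
    for w in sample:
        n = len(w)
        for i in range(n):
            for j in range(i + 1, n + 1):
                contexts[w[i:j]].add((w[:i], w[j:]))

    # Union-find over substrings: merge everything sharing a context.
    parent = {s: s for s in contexts}
    def find(s):
        while parent[s] != s:
            parent[s] = parent[parent[s]]  # path compression
            s = parent[s]
        return s

    by_context = defaultdict(list)
    for s, cs in contexts.items():
        for c in cs:
            by_context[c].append(s)
    for group in by_context.values():
        root = find(group[0])
        for s in group[1:]:
            parent[find(s)] = root

    classes = defaultdict(set)
    for s in contexts:
        classes[find(s)].add(s)
    return list(classes.values())

# "ab" and "aabb" both occur in the empty context ("", ""),
# so they land in the same substitutability class.
classes = substitutable_classes(["ab", "aabb"])
merged = next(c for c in classes if "ab" in c)
```

For substitutable languages these classes correspond to congruence classes of the target grammar, which is what lets the learner in the paper converge to a canonical grammar rather than merely to a weakly equivalent one.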

### A Learnability Analysis of Argument and Modifier Structure

Abstract
We present a computational learnability analysis of the argument-modifier distinction, asking whether information present in the distribution of constituents in natural language supports the distinction and its learnability. We first develop general models of those aspects of argument structure and the argument-modifier distinction which have effects on the distribution of constituents in sentences, abstracting away many of the implementational details of specific theoretical proposals. Combining these models with a theory of learning based on succinctness, we define two systems, the argument-only (PTSG) model and the argument-modifier (PSAG) model. We then show that the argument-modifier (PSAG) model is able to recover the argument-modifier status of many individual constituents when evaluated against a gold standard. This provides evidence in favor of our general account of argument-modifier structure, as well as providing a lower bound on the amount of information that natural language input can provide for appropriately equipped learners attempting to recover the argument-modifier status of individual constituents.