Results 1 - 10
of
105
Penalized Maximum-Likelihood Image Reconstruction using Space-Alternating Generalized EM Algorithms
- IEEE Tr. Im. Proc
, 1995
"... Most expectation-maximization (EM) type algorithms for penalized maximum-likelihood image reconstruction converge slowly, particularly when one incorporates additive background effects such as scatter, random coincidences, dark current, or cosmic radiation. In addition, regularizing smoothness penal ..."
Abstract
-
Cited by 62 (25 self)
- Add to MetaCart
Most expectation-maximization (EM) type algorithms for penalized maximum-likelihood image reconstruction converge slowly, particularly when one incorporates additive background effects such as scatter, random coincidences, dark current, or cosmic radiation. In addition, regularizing smoothness penalties (or priors) introduce parameter coupling, rendering intractable the M-steps of most EM-type algorithms. This paper presents space-alternating generalized EM (SAGE) algorithms for image reconstruction, which update the parameters sequentially using a sequence of small "hidden" data spaces, rather than simultaneously using one large complete-data space. The sequential update decouples the M-step, so the maximization can typically be performed analytically. We introduce new hidden-data spaces that are less informative than the conventional completedata space for Poisson data and that yield significant improvements in convergence rate. This acceleration is due to statistical considerations, not numerical overrelaxation methods, so monotonic increases in the objective function are guaranteed. We provide a general global convergence proof for SAGE methods with nonnegativity constraints.
Analysis of multivariate probit models
- BIOMETRIKA
, 1998
"... This paper provides a practical simulation-based Bayesian and non-Bayesian analysis of correlated binary data using the multivariate probit model. The posterior distribution is simulated by Markov chain Monte Carlo methods and maximum likelihood estimates are obtained by a Monte Carlo version of the ..."
Abstract
-
Cited by 57 (3 self)
- Add to MetaCart
This paper provides a practical simulation-based Bayesian and non-Bayesian analysis of correlated binary data using the multivariate probit model. The posterior distribution is simulated by Markov chain Monte Carlo methods and maximum likelihood estimates are obtained by a Monte Carlo version of the EM algorithm. A practical approach for the computation of Bayes factors from the simulation output is also developed. The methods are applied to a dataset with a bivariate binary response, to a four-year longitudinal dataset from the Six Cities study of the health effects of air pollution and to a sevenvariate binary response dataset on the labour supply of married women from the Panel Survey of Income Dynamics.
Maximum Conditional Likelihood via Bound Maximization and the CEM Algorithm
- IN ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 11
, 1998
"... We present the CEM (Conditional Expectation Maximization) algorithm as an extension of the EM (Expectation Maximization) algorithm to conditional density estimation under missing data. A bounding and maximization process is given to specifically optimize conditional likelihood instead of the usual j ..."
Abstract
-
Cited by 46 (8 self)
- Add to MetaCart
We present the CEM (Conditional Expectation Maximization) algorithm as an extension of the EM (Expectation Maximization) algorithm to conditional density estimation under missing data. A bounding and maximization process is given to specifically optimize conditional likelihood instead of the usual joint likelihood. Weapply the method to conditioned mixture models and use bounding techniques to derive the model's update rules. Monotonic convergence, computational efficiency and regression results superior to EM are demonstrated.
A tutorial on MM algorithms
- Amer. Statist
, 2004
"... Most problems in frequentist statistics involve optimization of a function such as a likelihood or a sum of squares. EM algorithms are among the most effective algorithms for maximum likelihood estimation because they consistently drive the likelihood uphill by maximizing a simple surrogate function ..."
Abstract
-
Cited by 36 (2 self)
- Add to MetaCart
Most problems in frequentist statistics involve optimization of a function such as a likelihood or a sum of squares. EM algorithms are among the most effective algorithms for maximum likelihood estimation because they consistently drive the likelihood uphill by maximizing a simple surrogate function for the loglikelihood. Iterative optimization of a surrogate function as exemplified by an EM algorithm does not necessarily require missing data. Indeed, every EM algorithm is a special case of the more general class of MM optimization algorithms, which typically exploit convexity rather than missing data in majorizing or minorizing an objective function. In our opinion, MM algorithms deserve to part of the standard toolkit of professional statisticians. The current article explains the principle behind MM algorithms, suggests some methods for constructing them, and discusses some of their attractive features. We include numerous examples throughout the article to illustrate the concepts described. In addition to surveying previous work on MM algorithms, this article introduces some new material on constrained optimization and standard error estimation. Key words and phrases: constrained optimization, EM algorithm, majorization, minorization, Newton-Raphson 1 1
Parameter expansion to accelerate EM: The PX-EM algorithm
, 1998
"... The EM algorithm and its extensions are popular tools for modal estimation but are often criticised for their slow convergence. We propose a new method that can often make EM much faster. The intuitive idea is to use a 'covariance adjustment ' to correct the analysis of the M step, capitalising on e ..."
Abstract
-
Cited by 32 (6 self)
- Add to MetaCart
The EM algorithm and its extensions are popular tools for modal estimation but are often criticised for their slow convergence. We propose a new method that can often make EM much faster. The intuitive idea is to use a 'covariance adjustment ' to correct the analysis of the M step, capitalising on extra information captured in the imputed complete data. The way we accomplish this is by parameter expansion; we expand the complete-data model while preserving the observed-data model and use the expanded complete-data model to generate EM. This parameter-expanded EM, PX-EM, algorithm shares the simplicity and stability of ordinary EM, but has a faster rate of convergence since its M step performs a more efficient analysis. The PX-EM algorithm is illustrated for the multivariate t distribution, a random effects model, factor analysis, probit regression and a Poisson imaging model.
Multiple imputation for multivariate missing-data problems: a data analyst's perspective
- Multivariate Behavioral Research
, 1998
"... Analyses of multivariate data are frequently hampered by missing values. Until re-cently, the only missing-data methods available to most data analysts have been relatively ad hoc practices such as listwise deletion. Recent dramatic advances in theoretical and com-putational statistics, however, hav ..."
Abstract
-
Cited by 30 (0 self)
- Add to MetaCart
Analyses of multivariate data are frequently hampered by missing values. Until re-cently, the only missing-data methods available to most data analysts have been relatively ad hoc practices such as listwise deletion. Recent dramatic advances in theoretical and com-putational statistics, however, have produced a new generation of flexible procedures with a sound statistical basis. These procedures involve multiple imputation (Rubin, 1987), a simu-lation technique that replaces each missing datum with a set of m>1 plausible values. The m versions of the complete data are analyzed by standard complete-data methods, and the results are combined using simple rules to yield estimates, standard errors, and p-values that formally incorporate missing-data uncertainty. New computational algorithms and software described in a recent book (Schafer, 1997) allow us to create proper multiple imputations in complex multivariate settings. This article reviews the key ideas of multiple imputation, discusses the software programs currently available, and demonstrates their use on data from
Accelerating EM for large databases
- Machine Learning
, 2001
"... The EM algorithm is a popular method for parameter estimation in a variety of problems involving missing data. However, the EM algorithm often requires signi cant computational resources and has been dismissed as impractical for large databases. We presenttwo approaches that signi cantly reduce the ..."
Abstract
-
Cited by 27 (1 self)
- Add to MetaCart
The EM algorithm is a popular method for parameter estimation in a variety of problems involving missing data. However, the EM algorithm often requires signi cant computational resources and has been dismissed as impractical for large databases. We presenttwo approaches that signi cantly reduce the computational cost of applying the EM algorithm to databases with a large number of cases, including databases with large dimensionality. Both approaches are based on partial E-steps for which we can use the results of Neal and Hinton (1998) to obtain the standard convergence guarantees of EM. The rst approach is a version of the incremental EM, described in Neal and Hinton (1998), which cycles through data cases in blocks. The number of cases in each block dramatically e ects the e ciency of the algorithm. We provide a method for selecting a near optimal block size. The second approach, which we call lazy EM, will, at scheduled iterations, evaluate the signi cance of each data case and then proceed for several iterations actively using only the signi cant cases. We demonstrate that both methods can signi cantly reduce computational costs through their application to high-dimensional real-world and synthetic mixture modeling problems for large databases. Keywords: Expectation Maximization Algorithm, incremental EM, lazy EM, online EM, data blocking, mixture models, clustering.
What is the Jeopardy Model? A Quasi-Synchronous Grammar for QA
"... This paper presents a syntax-driven approach to question answering, specifically the answer-sentence selection problem for short-answer questions. Rather than using syntactic features to augment existing statistical classifiers (as in previous work), we build on the idea that questions and their (co ..."
Abstract
-
Cited by 24 (9 self)
- Add to MetaCart
This paper presents a syntax-driven approach to question answering, specifically the answer-sentence selection problem for short-answer questions. Rather than using syntactic features to augment existing statistical classifiers (as in previous work), we build on the idea that questions and their (correct) answers relate to each other via loose but predictable syntactic transformations. We propose a probabilistic quasi-synchronous grammar, inspired by one proposed for machine translation (D. Smith and Eisner, 2006), and parameterized by mixtures of a robust nonlexical syntax/alignment model with a(n optional) lexical-semantics-driven log-linear model. Our model learns soft alignments as a hidden variable in discriminative training. Experimental results using the TREC dataset are shown to significantly outperform strong state-of-the-art baselines. 1
A Component-wise EM Algorithm for Mixtures
, 1999
"... In some situations, EM algorithm shows slow convergence problems. One possible reason is that standard procedures update the parameters simultaneously. In this paper we focus on nite mixture estimation. In this framework, we propose a component-wise EM, which updates the parameters sequentially. We ..."
Abstract
-
Cited by 23 (2 self)
- Add to MetaCart
In some situations, EM algorithm shows slow convergence problems. One possible reason is that standard procedures update the parameters simultaneously. In this paper we focus on nite mixture estimation. In this framework, we propose a component-wise EM, which updates the parameters sequentially. We give an interpretation of this procedure as a proximal point algorithm and use it to prove the convergence. Illustrative numerical experiments show how our algorithm compares to EM and a version of the SAGE algorithm.

