Results 1 - 10 of 169,791
Text Classification from Labeled and Unlabeled Documents using EM
 MACHINE LEARNING
, 1999
"... This paper shows that the accuracy of learned text classifiers can be improved by augmenting a small number of labeled training documents with a large pool of unlabeled documents. This is important because in many text classification problems obtaining training labels is expensive, while large qua ..."
Cited by 1033 (19 self)
quantities of unlabeled documents are readily available. We introduce an algorithm for learning from labeled and unlabeled documents based on the combination of Expectation-Maximization (EM) and a naive Bayes classifier. The algorithm first trains a classifier using the available labeled documents
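A minimal sketch of how EM and a naive Bayes classifier can be combined, assuming a multinomial word-count model with Laplace smoothing. Only the first step (training on the labeled documents) is stated in the snippet above; the loop that follows is the usual EM pattern and is an assumption here, as are all function and parameter names.

```python
import numpy as np

def train_nb(X, R, alpha=1.0):
    """M-step: fit multinomial naive Bayes from soft class memberships.
    X: (n_docs, n_words) word counts; R: (n_docs, n_classes) memberships."""
    class_mass = R.sum(axis=0)
    log_prior = np.log((class_mass + alpha) / (class_mass.sum() + alpha * R.shape[1]))
    word_counts = R.T @ X  # expected word counts per class
    totals = word_counts.sum(axis=1, keepdims=True)
    log_word = np.log((word_counts + alpha) / (totals + alpha * X.shape[1]))
    return log_prior, log_word

def posterior(X, log_prior, log_word):
    """E-step: P(class | document) for every row of X."""
    log_joint = X @ log_word.T + log_prior
    log_joint = log_joint - log_joint.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(log_joint)
    return p / p.sum(axis=1, keepdims=True)

def em_naive_bayes(X_lab, y_lab, X_unl, n_classes, n_iter=10):
    R_lab = np.eye(n_classes)[y_lab]   # labeled documents keep their hard labels
    params = train_nb(X_lab, R_lab)    # initial classifier: labeled data only
    X_all = np.vstack([X_lab, X_unl])
    for _ in range(n_iter):
        R_unl = posterior(X_unl, *params)                     # soft-label unlabeled docs
        params = train_nb(X_all, np.vstack([R_lab, R_unl]))   # retrain on everything
    return params
```

Calling `posterior(X_new, *params).argmax(axis=1)` on new count vectors gives predicted class indices.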
Segmentation of brain MR images through a hidden Markov random field model and the expectation-maximization algorithm
 IEEE TRANSACTIONS ON MEDICAL IMAGING
, 2001
"... The finite mixture (FM) model is the most commonly used model for statistical segmentation of brain magnetic resonance (MR) images because of its simple mathematical form and the piecewise constant nature of ideal brain MR images. However, being a histogram-based model, the FM has an intrinsic limi ..."
Cited by 619 (14 self)
methods are limited to using MRF as a general prior in an FM model-based approach. To fit the HMRF model, an EM algorithm is used. We show that by incorporating both the HMRF model and the EM algorithm into a HMRF-EM framework, an accurate and robust segmentation can be achieved. More importantly
Probabilistic Principal Component Analysis
 Journal of the Royal Statistical Society, Series B
, 1999
"... Principal component analysis (PCA) is a ubiquitous technique for data analysis and processing, but one which is not based upon a probability model. In this paper we demonstrate how the principal axes of a set of observed data vectors may be determined through maximum-likelihood estimation of paramet ..."
Cited by 703 (5 self)
of parameters in a latent variable model closely related to factor analysis. We consider the properties of the associated likelihood function, giving an EM algorithm for estimating the principal subspace iteratively, and discuss, with illustrative examples, the advantages conveyed by this probabilistic approach
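A hedged sketch of an EM iteration for a probabilistic PCA model, assuming the common formulation t = W x + mu + noise with isotropic noise variance sigma2 and a standard-normal latent x. The update equations below are the textbook form of EM for that model and are an assumption here, not quoted from the snippet; `ppca_em` and its parameters are illustrative names.

```python
import numpy as np

def ppca_em(T, q, n_iter=200, seed=0):
    """Estimate mean, loading matrix W (d x q) and noise variance by EM.
    T: (N, d) data matrix, q: latent dimension."""
    rng = np.random.default_rng(seed)
    N, d = T.shape
    mu = T.mean(axis=0)
    Tc = T - mu                       # centered data
    W = rng.normal(size=(d, q))       # random initial loadings
    sigma2 = 1.0
    for _ in range(n_iter):
        # E-step: posterior moments of the latent variables
        M_inv = np.linalg.inv(W.T @ W + sigma2 * np.eye(q))
        Ex = Tc @ W @ M_inv                       # rows are E[x_n]
        Exx = N * sigma2 * M_inv + Ex.T @ Ex      # sum over n of E[x_n x_n^T]
        # M-step: re-estimate loadings, then noise variance
        W_new = (Tc.T @ Ex) @ np.linalg.inv(Exx)
        sigma2 = (np.sum(Tc ** 2)
                  - 2.0 * np.sum((Tc @ W_new) * Ex)
                  + np.trace(Exx @ W_new.T @ W_new)) / (N * d)
        W = W_new
    return mu, W, sigma2
```

The columns of the returned `W` span the estimated principal subspace; they are not guaranteed to be orthonormal, so comparing against ordinary PCA requires comparing subspaces rather than individual vectors.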
Gravity with Gravitas: a Solution to the Border Puzzle
, 2001
"... Gravity equations have been widely used to infer trade flow effects of various institutional arrangements. We show that estimated gravity equations do not have a theoretical foundation. This implies both that estimation suffers from omitted variables bias and that comparative statics analysis is unfo ..."
Cited by 610 (3 self)
A Systematic Comparison of Various Statistical Alignment Models
 COMPUTATIONAL LINGUISTICS
, 2003
Automatic Word Sense Discrimination
 Journal of Computational Linguistics
, 1998
Cited by 530 (1 self)
This paper presents context-group discrimination, a disambiguation algorithm based on clustering. Senses are interpreted as groups (or clusters) of similar contexts of the ambiguous word. Words, contexts, and senses are represented in Word Space, a high-dimensional, real-valued space in which closeness corresponds to semantic similarity. Similarity in Word Space is based on second-order co-occurrence: two tokens (or contexts) of the ambiguous word are assigned to the same sense cluster if the words they co-occur with in turn occur with similar words in a training corpus. The algorithm is automatic and unsupervised in both training and application: senses are induced from a corpus without labeled training instances or other external knowledge sources. The paper demonstrates good performance of context-group discrimination for a sample of natural and artificial ambiguous words.
Closed-form solution of absolute orientation using unit quaternions
 J. Opt. Soc. Am. A
, 1987
Cited by 973 (4 self)
Finding the relationship between two coordinate systems using pairs of measurements of the coordinates of a number of points in both systems is a classic photogrammetric task. It finds applications in stereophotogrammetry and in robotics. I present here a closed-form solution to the least-squares problem for three or more points. Currently various empirical, graphical, and numerical iterative methods are in use. Derivation of the solution is simplified by use of unit quaternions to represent rotation. I emphasize a symmetry property that a solution to this problem ought to possess. The best translational offset is the difference between the centroid of the coordinates in one system and the rotated and scaled centroid of the coordinates in the other system. The best scale is equal to the ratio of the root-mean-square deviations of the coordinates in the two systems from their respective centroids. These exact results are to be preferred to approximate methods based on measurements of a few selected points. The unit quaternion representing the best rotation is the eigenvector associated with the most positive eigenvalue of a symmetric 4 × 4 matrix. The elements of this matrix are combinations of sums of products of corresponding coordinates of the points.
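The steps named in this abstract (centroids, scale from root-mean-square deviations, rotation from the top eigenvector of a symmetric 4 × 4 matrix) can be sketched as follows. The abstract does not give the matrix entries, so the specific layout of `N` below is an assumption based on the standard formulation of the quaternion method; the function name and argument order are likewise illustrative.

```python
import numpy as np

def absolute_orientation(P_left, P_right):
    """Return (s, R, t) such that each row of P_right is close to s * R @ p + t
    for the matching row p of P_left. Inputs are (n, 3) arrays, n >= 3."""
    c_l, c_r = P_left.mean(axis=0), P_right.mean(axis=0)
    A, B = P_left - c_l, P_right - c_r        # coordinates relative to centroids
    S = A.T @ B                               # sums of products of coordinates
    (Sxx, Sxy, Sxz), (Syx, Syy, Syz), (Szx, Szy, Szz) = S
    N = np.array([
        [Sxx + Syy + Szz, Syz - Szy,       Szx - Sxz,        Sxy - Syx],
        [Syz - Szy,       Sxx - Syy - Szz, Sxy + Syx,        Szx + Sxz],
        [Szx - Sxz,       Sxy + Syx,       -Sxx + Syy - Szz, Syz + Szy],
        [Sxy - Syx,       Szx + Sxz,       Syz + Szy,        -Sxx - Syy + Szz],
    ])
    # eigh sorts eigenvalues in ascending order, so the last column is the
    # eigenvector of the most positive eigenvalue: the unit quaternion (w, x, y, z)
    w, x, y, z = np.linalg.eigh(N)[1][:, -1]
    R = np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z),     2 * (x * z + w * y)],
        [2 * (x * y + w * z),     1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y),     2 * (y * z + w * x),     1 - 2 * (x * x + y * y)],
    ])
    s = np.sqrt((B ** 2).sum() / (A ** 2).sum())   # ratio of RMS deviations
    t = c_r - s * (R @ c_l)                        # centroid minus rotated, scaled centroid
    return s, R, t
```

The sign of the eigenvector is arbitrary, but the quaternions q and -q give the same rotation matrix, so no sign fix is needed.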
Missing data: Our view of the state of the art
 Psychological Methods
, 2002
Cited by 689 (1 self)
Statistical procedures for missing data have vastly improved, yet misconception and unsound practice still abound. The authors frame the missing-data problem, review methods, offer advice, and raise issues that remain unresolved. They clear up common misunderstandings regarding the missing at random (MAR) concept. They summarize the evidence against older procedures and, with few exceptions, discourage their use. They present, in both technical and practical language, 2 general approaches that come highly recommended: maximum likelihood (ML) and Bayesian multiple imputation (MI). Newer developments are discussed, including some for dealing with missing data that are not MAR. Although not yet in the mainstream, these procedures may eventually extend the ML and MI methods that currently represent the state of the art.
Why do missing data create such difficulty in scientific research? Because most data analysis procedures were not designed for them. Missingness is usually a nuisance, not the main focus of inquiry, but
Self-discrepancy: A theory relating self and affect
 Psychological Review
, 1987
Cited by 567 (7 self)
This article presents a theory of how different types of discrepancies between self-state representations are related to different kinds of emotional vulnerabilities. One domain of the self (actual; ideal; ought) and one standpoint on the self (own; significant other) constitute each type of self-state representation. It is proposed that different types of self-discrepancies represent different types of negative psychological situations that are associated with different kinds of discomfort. Discrepancies between the actual/own self-state (i.e., the self-concept) and ideal self-states (i.e., representations of an individual's beliefs about his or her own or a significant other's hopes, wishes, or aspirations for the individual) signify the absence of positive outcomes, which is associated with dejection-related emotions (e.g., disappointment, dissatisfaction, sadness). In contrast, discrepancies between the actual/own self-state and ought self-states (i.e., representations of an individual's beliefs about his or her own or a significant other's beliefs about the individual's duties, responsibilities, or obligations) signify the presence of negative outcomes, which is associated with agitation-related emotions (e.g., fear, threat, restlessness). Differences in both the relative magnitude and the accessibility of individuals' available types of self-discrepancies are predicted to be related to differences in the kinds of discomfort people are likely to experience. Correlational and experimental evidence supports the predictions of the model. Differences between self-discrepancy theory and (a) other theories of incompatible self-beliefs and (b) actual self negativity (e.g., low self-esteem) are discussed.
The notion that people who hold conflicting or incompatible beliefs are likely to experience discomfort has had a long history in psychology. In social psychology, for example, various early theories proposed a relation between discomfort and specific kinds of "inconsistency" among a person's beliefs (e.g., Abelson