Results 1–10 of 1,204,008
A Simple, Fast, and Accurate Algorithm to Estimate Large Phylogenies by Maximum Likelihood
, 2003
"... The increase in the number of large data sets and the complexity of current probabilistic sequence evolution models necessitates fast and reliable phylogeny reconstruction methods. We describe a new approach, based on the maximumlikelihood principle, which clearly satisfies these requirements. The ..."
Cited by 2109 (27 self)
of distance-based and parsimony approaches. The reduction of computing time is dramatic in comparison with other maximum-likelihood packages, while the likelihood maximization ability tends to be higher. For example, only 12 min were required on a standard personal computer to analyze a data set consisting
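For orientation, the quantity any such method maximizes is the phylogenetic likelihood, conventionally evaluated with Felsenstein's pruning recursion (textbook background, not this paper's contribution): for a rooted tree with substitution probabilities P_{xy}(t) along a branch of length t and stationary frequencies π,

```latex
L(T) = \sum_{x} \pi_x \, L_r(x),
\qquad
L_v(x) = \prod_{c \in \mathrm{ch}(v)} \sum_{y} P_{xy}(t_c) \, L_c(y)
```

where L_v(x) is 1 at a leaf observed in state x and 0 otherwise. The data-set likelihood is the product of this quantity over sites, and the search optimizes it over topologies and branch lengths.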
Simultaneous Multithreading: Maximizing On-Chip Parallelism
, 1995
"... This paper examines simultaneous multithreading, a technique permitting several independent threads to issue instructions to a superscalar’s multiple functional units in a single cycle. We present several models of simultaneous multithreading and compare them with alternative organizations: a wide s ..."
Cited by 805 (48 self)
superscalar, a fine-grain multithreaded processor, and single-chip, multiple-issue multiprocessing architectures. Our results show that both (single-threaded) superscalar and fine-grain multithreaded architectures are limited in their ability to utilize the resources of a wide-issue processor. Simultaneous
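As a back-of-the-envelope illustration of that limitation, here is a toy Python model (invented parameters, not the paper's simulation methodology): give each hardware context a random amount of issuable work per cycle and measure how much of an 8-wide issue stage gets filled with one context versus several.

```python
import random

ISSUE_WIDTH = 8       # issue slots per cycle on the modeled wide-issue core
THREADS = 4           # hardware contexts in the SMT configuration
P_READY = 0.4         # chance a thread has another ready instruction
CYCLES = 10_000

def ready_count():
    """Instructions one thread could issue this cycle (toy ILP model)."""
    n = 0
    while n < ISSUE_WIDTH and random.random() < P_READY:
        n += 1
    return n

def utilization(num_threads):
    used = 0
    for _ in range(CYCLES):
        available = sum(ready_count() for _ in range(num_threads))
        used += min(available, ISSUE_WIDTH)   # contexts share one issue stage
    return used / (CYCLES * ISSUE_WIDTH)

print(f"1 thread : {utilization(1):.0%} of issue slots filled")
print(f"{THREADS} threads: {utilization(THREADS):.0%}")
```

With these invented numbers, a single context fills under a tenth of the slots while four contexts roughly quadruple utilization; the paper measures the real effect with detailed simulation.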
A View of the EM Algorithm That Justifies Incremental, Sparse, and Other Variants
 Learning in Graphical Models
, 1998
"... . The EM algorithm performs maximum likelihood estimation for data in which some variables are unobserved. We present a function that resembles negative free energy and show that the M step maximizes this function with respect to the model parameters and the E step maximizes it with respect to the d ..."
Cited by 984 (18 self)
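For context, the function the abstract alludes to is the variational free energy of the EM objective: writing q for any distribution over the unobserved variables z, and using generic symbols rather than the paper's exact notation,

```latex
F(q, \theta) = \mathbb{E}_{q(z)}\left[\log p(x, z \mid \theta)\right] + H(q)
             = \log p(x \mid \theta) - \mathrm{KL}\left(q(z) \,\|\, p(z \mid x, \theta)\right)
```

The E step maximizes F over q (attained by q(z) = p(z | x, θ), closing the KL gap) and the M step maximizes F over θ. Since any partial or per-data-point update of q still increases F, this view is what licenses the incremental and sparse variants the title mentions.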
Hierarchical mixtures of experts and the EM algorithm
, 1993
"... We present a treestructured architecture for supervised learning. The statistical model underlying the architecture is a hierarchical mixture model in which both the mixture coefficients and the mixture components are generalized linear models (GLIM’s). Learning is treated as a maximum likelihood ..."
Cited by 873 (21 self)
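Schematically, a two-level instance of such a hierarchy combines expert GLIMs through softmax gating networks (the notation here is illustrative, not the paper's exact symbols):

```latex
P(y \mid x) = \sum_{i} g_i(x) \sum_{j} g_{j \mid i}(x) \, P(y \mid x, \theta_{ij}),
\qquad
g_i(x) = \frac{e^{\xi_i^{\top} x}}{\sum_{k} e^{\xi_k^{\top} x}}
```

Treating the gating decisions as hidden variables turns joint maximum-likelihood fitting of the gates and experts into an EM problem, which is the algorithmic content of the title.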
Bayesian Analysis of Stochastic Volatility Models
, 1994
"... this article is to develop new methods for inference and prediction in a simple class of stochastic volatility models in which logarithm of conditional volatility follows an autoregressive (AR) times series model. Unlike the autoregressive conditional heteroscedasticity (ARCH) and gener alized ARCH ..."
Cited by 584 (25 self)
ARCH (GARCH) models [see Bollerslev, Chou, and Kroner (1992) for a survey of ARCH modeling], both the mean and log-volatility equations have separate error terms. The ease of evaluating the ARCH likelihood function and the ability of the ARCH specification to accommodate the time-varying volatility
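The model class in question is the log-AR(1) stochastic volatility model; in its standard form (symbols follow common convention rather than the article's notation):

```latex
y_t = e^{h_t / 2} \, \epsilon_t, \quad \epsilon_t \sim \mathcal{N}(0, 1);
\qquad
h_{t+1} = \mu + \phi \, (h_t - \mu) + \sigma_\eta \, \eta_t, \quad \eta_t \sim \mathcal{N}(0, 1)
```

The two separate shocks ε_t and η_t are exactly the contrast with ARCH/GARCH drawn above: in ARCH models, conditional volatility is a deterministic function of past observations, so only the mean equation carries an error term.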
Knowledge acquisition via incremental conceptual clustering
 Machine Learning
, 1987
"... hill climbing Abstract. Conceptual clustering is an important way of summarizing and explaining data. However, the recent formulation of this paradigm has allowed little exploration of conceptual clustering as a means of improving performance. Furthermore, previous work in conceptual clustering has ..."
Cited by 757 (8 self)
not explicitly dealt with constraints imposed by real world environments. This article presents COBWEB, a conceptual clustering system that organizes data so as to maximize inference ability. Additionally, COBWEB is incremental and computationally economical, and thus can be flexibly applied in a variety
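The objective COBWEB hill-climbs when incorporating each new observation is category utility; for nominal attributes A_i with values v_ij and a partition into clusters C_1, ..., C_K it reads:

```latex
CU = \frac{1}{K} \sum_{k=1}^{K} P(C_k)
\left[ \sum_{i} \sum_{j} P(A_i = v_{ij} \mid C_k)^2
     - \sum_{i} \sum_{j} P(A_i = v_{ij})^2 \right]
```

It rewards partitions whose clusters make attribute values more predictable than they are in the data as a whole, which is the "maximize inference ability" claim in operational form.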
Accurate Methods for the Statistics of Surprise and Coincidence
 COMPUTATIONAL LINGUISTICS
, 1993
"... Much work has been done on the statistical analysis of text. In some cases reported in the literature, inappropriate statistical methods have been used, and statistical significance of results have not been addressed. In particular, asymptotic normality assumptions have often been used unjustifiably ..."
Cited by 1044 (1 self)
unjustifiably, leading to flawed results. This assumption of normal distribution limits the ability to analyze rare events. Unfortunately, rare events do make up a large fraction of real text. However, more applicable methods based on likelihood ratio tests are available that yield good results with relatively
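The more applicable method the abstract points to is the log-likelihood-ratio (G²) statistic. Here is a minimal Python sketch for the 2x2 contingency table of a candidate word pair; the function name and the counts in the usage line are made up for illustration.

```python
from math import log

def g2(k11, k12, k21, k22):
    """Log-likelihood ratio (-2 log lambda) for a 2x2 contingency table.

    k11: windows where both words occur; k12, k21: windows with only one
    of them; k22: windows with neither. G2 = 2 * sum(obs * ln(obs/exp)).
    """
    total = k11 + k12 + k21 + k22
    def term(obs, row, col):
        if obs == 0:
            return 0.0                 # x * log(x) -> 0 as x -> 0
        return obs * log(obs / (row * col / total))
    r1, r2 = k11 + k12, k21 + k22      # row totals
    c1, c2 = k11 + k21, k12 + k22      # column totals
    return 2 * (term(k11, r1, c1) + term(k12, r1, c2)
                + term(k21, r2, c1) + term(k22, r2, c2))

# Illustrative counts only: a pair that co-occurs far more than chance.
print(round(g2(110, 2442, 111, 29114), 1))   # large G2 => strong association
```

Likelihood-ratio statistics like this remain better behaved on sparse counts than tests resting on normality assumptions, which is the abstract's point about rare events.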
Maximum entropy Markov models for information extraction and segmentation
, 2000
"... Hidden Markov models (HMMs) are a powerful probabilistic tool for modeling sequential data, and have been applied with success to many textrelated tasks, such as partofspeech tagging, text segmentation and information extraction. In these cases, the observations are usually modeled as multinomial ..."
Cited by 553 (17 self)
as multinomial distributions over a discrete vocabulary, and the HMM parameters are set to maximize the likelihood of the observations. This paper presents a new Markovian sequence model, closely related to HMMs, that allows observations to be represented as arbitrary overlapping features (such as word
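The replacement the paper proposes for the HMM's generative emission is, per source state s', a conditional maximum-entropy (exponential) model over next states; schematically:

```latex
P_{s'}(s \mid o) = \frac{1}{Z(o, s')} \exp\left( \sum_{a} \lambda_a f_a(o, s) \right)
```

where the f_a are arbitrary, possibly overlapping binary features of the current observation and next state, λ_a their learned weights, and Z(o, s') the per-state normalizer. Because the model conditions on the observation instead of generating it, overlapping features cost nothing in independence assumptions.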
Training Products of Experts by Minimizing Contrastive Divergence
, 2002
"... It is possible to combine multiple latentvariable models of the same data by multiplying their probability distributions together and then renormalizing. This way of combining individual “expert ” models makes it hard to generate samples from the combined model but easy to infer the values of the l ..."
Cited by 823 (75 self)
is unnecessary. Training a PoE by maximizing the likelihood of the data is difficult because it is hard even to approximate the derivatives of the renormalization term in the combination rule. Fortunately, a PoE can be trained using a different objective function called “contrastive divergence” whose
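As one concrete instance, a restricted Boltzmann machine is a product of experts, and a minimal CD-1 training step for a binary RBM looks roughly like the following sketch (biases, real data, and all refinements omitted; sizes and hyperparameters are made up):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(W, v0, lr=0.1):
    """One contrastive-divergence (CD-1) update for a binary RBM."""
    # Positive phase: hidden activations driven by the data.
    ph0 = sigmoid(v0 @ W)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # One Gibbs step: reconstruct visibles, then re-infer hiddens.
    pv1 = sigmoid(h0 @ W.T)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(v1 @ W)
    # CD gradient: <v h> under the data minus <v h> after one step,
    # in place of the intractable exact likelihood gradient.
    return W + lr * (v0.T @ ph0 - v1.T @ ph1) / len(v0)

W = rng.normal(0.0, 0.01, size=(6, 4))              # 6 visible, 4 hidden
data = (rng.random((32, 6)) < 0.5).astype(float)    # placeholder batch
for _ in range(100):
    W = cd1_step(W, data)
```

The one-step reconstruction stands in for the equilibrium samples that the exact gradient of the renormalization term would require, which is precisely the difficulty the abstract describes.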
Object class recognition by unsupervised scale-invariant learning
 In CVPR
, 2003
"... We present a method to learn and recognize object class models from unlabeled and unsegmented cluttered scenes in a scale invariant manner. Objects are modeled as flexible constellations of parts. A probabilistic representation is used for all aspects of the object: shape, appearance, occlusion and ..."
Cited by 1114 (49 self)
and relative scale. An entropy-based feature detector is used to select regions and their scale within the image. In learning, the parameters of the scale-invariant object model are estimated. This is done using expectation-maximization in a maximum-likelihood setting. In recognition, this model is used in a
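Schematically, the likelihood such a constellation model assigns to an image's detected features sums over hypotheses h assigning model parts to features, with roughly independent appearance, shape, and scale terms (illustrative notation: X = part locations, S = scales, A = appearances):

```latex
p(X, S, A \mid \theta) = \sum_{h} p(A \mid h, \theta) \, p(X \mid h, \theta) \, p(S \mid h, \theta) \, p(h \mid \theta)
```

EM then alternates between soft assignments over h and updates to θ, and recognition thresholds the likelihood of this model against a background model.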