Direct Least Square Fitting of Ellipses
, 1998
"... This work presents a new efficient method for fitting ellipses to scattered data. Previous algorithms either fitted general conics or were computationally expensive. By minimizing the algebraic distance subject to the constraint 4ac − b² = 1 the new method incorporates the ellipticity constraint ..."
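The constrained minimization described in this abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: it uses the partitioned formulation usually attributed to Halír and Flusser, a numerically stabler variant of the same eigenproblem, and the function name `fit_ellipse` and the sample ellipse are assumptions made for the example.

```python
import numpy as np


def fit_ellipse(x, y):
    """Fit a*x^2 + b*x*y + c*y^2 + d*x + e*y + f = 0 with 4ac - b^2 = 1.

    Hypothetical helper: a sketch of direct least-squares ellipse fitting
    in its partitioned form, not the paper's reference implementation.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    d1 = np.column_stack([x * x, x * y, y * y])      # quadratic terms
    d2 = np.column_stack([x, y, np.ones_like(x)])    # linear terms
    s1 = d1.T @ d1
    s2 = d1.T @ d2
    s3 = d2.T @ d2
    t = -np.linalg.solve(s3, s2.T)                   # -inv(s3) @ s2.T
    m = s1 + s2 @ t                                  # reduced scatter matrix
    # Left-multiply by the inverse of the constraint matrix
    # [[0, 0, 2], [0, -1, 0], [2, 0, 0]], which encodes 4ac - b^2.
    m = np.array([m[2] / 2.0, -m[1], m[0] / 2.0])
    _, vecs = np.linalg.eig(m)
    vecs = np.real(vecs)
    cond = 4.0 * vecs[0] * vecs[2] - vecs[1] ** 2    # 4ac - b^2 per eigenvector
    k = int(np.argmax(cond))
    if cond[k] <= 0:
        raise ValueError("no eigenvector satisfies the ellipse constraint")
    a1 = vecs[:, k] / np.sqrt(cond[k])               # rescale so 4ac - b^2 = 1
    return np.concatenate([a1, t @ a1])


# Illustrative data: noise-free points on a rotated ellipse centred at (1, 2).
angle = np.linspace(0.0, 2.0 * np.pi, 50, endpoint=False)
theta = 0.3
px = 1.0 + 3.0 * np.cos(angle) * np.cos(theta) - 1.5 * np.sin(angle) * np.sin(theta)
py = 2.0 + 3.0 * np.cos(angle) * np.sin(theta) + 1.5 * np.sin(angle) * np.cos(theta)

a, b, c, d, e, f = fit_ellipse(px, py)
residual = a * px**2 + b * px * py + c * py**2 + d * px + e * py + f
centre_x = (b * e - 2.0 * c * d) / (4.0 * a * c - b * b)
centre_y = (b * d - 2.0 * a * e) / (4.0 * a * c - b * b)
```

Because the constraint forces 4ac − b² to be positive, the returned conic is always an ellipse, even when the input points are noisy or cover only part of the curve.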
Unsupervised Learning by Probabilistic Latent Semantic Analysis
 Machine Learning
, 2001
"... This paper presents a novel statistical method for factor analysis of binary and count data which is closely related to a technique known as Latent Semantic Analysis. In contrast to the latter method which stems from linear algebra and performs a Singular Value Decomposition of co-occurren ..."
Maximization algorithm for model fitting, which has shown excellent performance in practice. Probabilistic Latent Semantic Analysis has many applications, most prominently in information retrieval, natural language processing, machine learning from text, and in related areas. The paper presents perplexity
Typesetting Concrete Mathematics
 TUGBOAT
, 1989
"... ... tried my best to make the book mathematically interesting, but I also knew that it would be typographically interesting, because it would be the first major use of a new typeface by Hermann Zapf, commissioned by the American Mathematical Society. This typeface, called AMS Euler, had been carefull ..."
. The underlying philosophy of Zapf's Euler design was to capture the flavor of mathematics as it might be written by a mathematician with excellent handwriting. For example, one of the earmarks of AMS Euler is its zero, '0', which is slightly pointed at the top because a handwritten zero rarely
Power-law distributions in empirical data
 ISSN 0036-1445. doi: 10.1137/070710111. URL http://dx.doi.org/10.1137/070710111
, 2009
"... Power-law distributions occur in many situations of scientific interest and have significant consequences for our understanding of natural and man-made phenomena. Unfortunately, the empirical detection and characterization of power laws is made difficult by the large fluctuations that occur in the t ..."
in the tail of the distribution. In particular, standard methods such as least-squares fitting are known to produce systematically biased estimates of parameters for power-law distributions and should not be used in most circumstances. Here we describe statistical techniques for making accurate parameter
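As a hedged illustration of an alternative to least-squares fitting, the sketch below applies the standard maximum-likelihood estimator for the exponent of a continuous power law, α̂ = 1 + n / Σ ln(xᵢ / x_min). The excerpt above is truncated before it names its techniques, so treating this estimator as the relevant one is an assumption. The function name `fit_power_law_exponent`, the known `x_min`, and the synthetic sample are likewise assumptions made for the example.

```python
import math


def fit_power_law_exponent(samples, x_min):
    """Maximum-likelihood exponent of a continuous power law p(x) ~ x^(-alpha).

    Hypothetical helper: assumes x_min (the start of the power-law tail)
    is already known; values below x_min are ignored.
    """
    tail = [x for x in samples if x >= x_min]
    if not tail:
        raise ValueError("no samples at or above x_min")
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum <= 0.0:
        raise ValueError("all samples equal x_min; exponent is undefined")
    alpha = 1.0 + len(tail) / log_sum
    std_err = (alpha - 1.0) / math.sqrt(len(tail))   # leading-order standard error
    return alpha, std_err


# Illustrative data: deterministic quantiles of a power law with alpha = 2.5,
# generated through the inverse CDF so the example needs no random numbers.
true_alpha, x_min, n = 2.5, 1.0, 10000
samples = [
    x_min * (1.0 - (i + 0.5) / n) ** (-1.0 / (true_alpha - 1.0))
    for i in range(n)
]
alpha_hat, alpha_err = fit_power_law_exponent(samples, x_min)
```

The estimator works directly on the samples, so it avoids the binning and log-log regression steps that introduce the bias mentioned in the abstract.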
Mining Sequential Patterns
, 1995
"... We are given a large database of customer transactions, where each transaction consists of customer-id, transaction time, and the items bought in the transaction. We introduce the problem of mining sequential patterns over such databases. We present three algorithms to solve this problem, and empiri ..."
that both AprioriSome and AprioriAll scale linearly with the number of customer transactions. They also have excellent scale-up properties with respect to the number of transactions per customer and the number of items in a transaction.
Probabilistic Latent Semantic Analysis
 In Proc. of Uncertainty in Artificial Intelligence, UAI’99
, 1999
"... Probabilistic Latent Semantic Analysis is a novel statistical technique for the analysis of two-mode and co-occurrence data, which has applications in information retrieval and filtering, natural language processing, machine learning from text, and in related areas. Compared to standard Latent Sema ..."
to avoid overfitting, we propose a widely applicable generalization of maximum likelihood model fitting by tempered EM. Our approach yields substantial and consistent improvements over Latent Semantic Analysis in a number of experiments.
Maximum Entropy Markov Models for Information Extraction and Segmentation
, 2000
"... Hidden Markov models (HMMs) are a powerful probabilistic tool for modeling sequential data, and have been applied with success to many text-related tasks, such as part-of-speech tagging, text segmentation and information extraction. In these cases, the observations are usually modeled as multinomial ..."
, capitalization, formatting, part-of-speech), and defines the conditional probability of state sequences given observation sequences. It does this by using the maximum entropy framework to fit a set of exponential models that represent the probability of a state given an observation and the previous state. We
Probabilistic Latent Semantic Indexing
, 1999
"... Probabilistic Latent Semantic Indexing is a novel approach to automated document indexing which is based on a statistical latent class model for factor analysis of count data. Fitted from a training corpus of text documents by a generalization of the Expectation Maximization algorithm, the utilized ..."
Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews
, 2002
"... This paper presents a simple unsupervised learning algorithm for classifying reviews as recommended (thumbs up) or not recommended (thumbs down). The classification of a review is predicted by the average semantic orientation of the phrases in the review that contain adjectives or adverbs. A ..."
information between the given phrase and the word "excellent" minus the mutual information between the given phrase and the word "poor". A review is classified as recommended if the average semantic orientation of its phrases is positive. The algorithm achieves an average accuracy
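The scoring rule in this abstract, the mutual information with "excellent" minus the mutual information with "poor", reduces to a single log ratio of co-occurrence counts. The following is a minimal sketch under stated assumptions: the counts are hypothetical inputs (for instance, search-engine hit counts), and the function names and the small `smoothing` constant that guards against zero counts are choices made for this example, not details taken from the excerpt.

```python
import math


def semantic_orientation(near_excellent, near_poor,
                         hits_excellent, hits_poor, smoothing=0.01):
    """PMI(phrase, "excellent") - PMI(phrase, "poor") from raw counts.

    near_excellent / near_poor: co-occurrence counts of the phrase with
    each reference word; hits_excellent / hits_poor: counts of the
    reference words alone. The phrase's own count cancels in the difference.
    """
    ratio = ((near_excellent + smoothing) * hits_poor) / (
        (near_poor + smoothing) * hits_excellent
    )
    return math.log2(ratio)


def classify_review(phrase_counts, hits_excellent, hits_poor):
    """Label a review by the average semantic orientation of its phrases.

    phrase_counts is a list of (near_excellent, near_poor) pairs, one per
    extracted phrase (hypothetical input format).
    """
    if not phrase_counts:
        raise ValueError("review has no scored phrases")
    scores = [
        semantic_orientation(ne, npoor, hits_excellent, hits_poor)
        for ne, npoor in phrase_counts
    ]
    average = sum(scores) / len(scores)
    label = "recommended" if average > 0 else "not recommended"
    return label, average


# Illustrative counts only; they are not taken from the paper.
positive_label, positive_avg = classify_review([(8, 2), (6, 3)], 100, 100)
negative_label, negative_avg = classify_review([(1, 4), (2, 8)], 100, 100)
```

A phrase that co-occurs with "excellent" more often than with "poor", after correcting for how common each reference word is, gets a positive score, and the review label follows the sign of the average.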
Exokernel: An Operating System Architecture for Application-Level Resource Management
, 1995
"... We describe an operating system architecture that securely multiplexes machine resources while permitting an unprecedented degree of application-specific customization of traditional operating system abstractions. By abstracting physical hardware resources, traditional operating systems have signifi ..."
–50 times faster than Ultrix’s primitives. These results demonstrate that the exokernel operating system design is practical and offers an excellent combination of performance and flexibility.
