Results 1 - 10 of 129,382

Algorithms for Non-negative Matrix Factorization

by Daniel D. Lee, H. Sebastian Seung - In NIPS, 2001
"... Non-negative matrix factorization (NMF) has previously been shown to be a useful decomposition for multivariate data. Two different multiplicative algorithms for NMF are analyzed. They differ only slightly in the multiplicative factor used in the update rules. One algorithm can be shown to minim ..."
Cited by 1230 (5 self)
... The algorithms can also be interpreted as diagonally rescaled gradient descent, where the rescaling factor is optimally chosen to ensure convergence.
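A minimal sketch of a multiplicative update scheme of the kind this abstract describes. The snippet above is truncated before it names the objectives, so the least-squares objective ||V - WH||^2 is an assumption here, as are the function names, the random initialisation, and the small `eps` guard against division by zero. Each update multiplies the current factor element-wise by a non-negative ratio, which keeps W and H non-negative throughout.

```python
import random


def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def squared_error(V, W, H):
    """Squared Frobenius norm of V - WH."""
    WH = matmul(W, H)
    return sum((v - x) ** 2 for rv, rx in zip(V, WH) for v, x in zip(rv, rx))


def nmf_multiplicative(V, rank, iterations=200, seed=0, eps=1e-12):
    """Factor a non-negative matrix V (n x m) as W (n x rank) times H (rank x m).

    Hypothetical helper for illustration; returns W, H and the error history.
    """
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    # Strictly positive start so that no entry is stuck at zero.
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    errors = [squared_error(V, W, H)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num = matmul(Wt, V)
        den = matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(rh, ra, rb)]
             for rh, ra, rb in zip(H, num, den)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num = matmul(V, Ht)
        den = matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(rw, ra, rb)]
             for rw, ra, rb in zip(W, num, den)]
        errors.append(squared_error(V, W, H))
    return W, H, errors


if __name__ == "__main__":
    V = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 6.0, 9.0]]  # rank-1 example
    W, H, errors = nmf_multiplicative(V, rank=1, iterations=20)
    print("initial error:", errors[0])
    print("final error:", errors[-1])
```

The ratio form is what makes the "rescaled gradient descent" reading concrete: each update equals a gradient step whose per-element step size is the current entry divided by the denominator term.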

Rescaled weighted random balls models and stable self-similar random fields

by Jean-Christophe Breton, Clément Dombry
We consider weighted random balls in R^d distributed according to a random Poisson measure with heavy tailed intensity and study the asymptotic behavior of the total weight of some configurations in R^d while we perform a zooming operation. The resulting procedure is very rich and several regimes appear in the limit, depending on the intensity of the balls, the zooming factor, the tail parameters of the radii and the weights. Statistical properties of the limit fields are also evidenced

Determining the Number of Factors in Approximate Factor Models

by Jushan Bai, Serena Ng, 2000
"... In this paper we develop some statistical theory for factor models of large dimensions. The focus is the determination of the number of factors, which is an unresolved issue in the rapidly growing literature on multifactor models. We propose a panel Cp criterion and show that the number of factors c ..."
Cited by 538 (29 self)

hep-ph/0406044 New uncertainties in QCD-QED rescaling factors using Quadrature Method

by Mahadev Patgiri, N. Nimai Singh, 2004
"... In this paper we briefly outline the quadrature method for estimating uncertainties in a function which depends on several variables, and apply it to estimate the numerical uncertainties in QCD-QED rescaling factors. We employ here the one-loop order in QED and three-loop order in QCD evolution equa ..."

Probabilistic Principal Component Analysis

by Michael E. Tipping, Chris M. Bishop - Journal of the Royal Statistical Society, Series B, 1999
Cited by 703 (5 self)
Principal component analysis (PCA) is a ubiquitous technique for data analysis and processing, but one which is not based upon a probability model. In this paper we demonstrate how the principal axes of a set of observed data vectors may be determined through maximum-likelihood estimation of parameters in a latent variable model closely related to factor analysis. We consider the properties of the associated likelihood function, giving an EM algorithm for estimating the principal subspace iteratively, and discuss, with illustrative examples, the advantages conveyed by this probabilistic approach

An extended set of Haar-like features for rapid object detection

by Rainer Lienhart, Jochen Maydt - IEEE ICIP
Cited by 567 (4 self)
Recently Viola et al. [5] have introduced a rapid object detection scheme based on a boosted cascade of simple feature classifiers. In this paper we introduce a novel set of rotated Haar-like features. These novel features significantly enrich the simple features of [5] and can also be calculated efficiently. With these new rotated features our sample face detector shows off on average a 10% lower false alarm rate at a given hit rate. We also present a novel post optimization procedure for a given boosted cascade improving on average the false alarm rate further by 12.5%.

Mixtures of Probabilistic Principal Component Analysers

by Michael E. Tipping, Christopher M. Bishop, 1998
Cited by 537 (6 self)
Principal component analysis (PCA) is one of the most popular techniques for processing, compressing and visualising data, although its effectiveness is limited by its global linearity. While nonlinear variants of PCA have been proposed, an alternative paradigm is to capture data complexity by a combination of local linear PCA projections. However, conventional PCA does not correspond to a probability density, and so there is no unique way to combine PCA models. Previous attempts to formulate mixture models for PCA have therefore to some extent been ad hoc. In this paper, PCA is formulated within a maximum-likelihood framework, based on a specific form of Gaussian latent variable model. This leads to a well-defined mixture model for probabilistic principal component analysers, whose parameters can be determined using an EM algorithm. We discuss the advantages of this model in the context of clustering, density modelling and local dimensionality reduction, and we demonstrate its applicat...

Estimation and Inference in Econometrics

by James G. MacKinnon, 1993
Cited by 1151 (3 self)
The astonishing increase in computer performance over the past two decades has made it possible for economists to base many statistical inferences on simulated, or bootstrap, distributions rather than on distributions obtained from asymptotic theory. In this paper, I review some of the basic ideas of bootstrap inference. The paper discusses Monte Carlo tests, several types of bootstrap test, and bootstrap confidence intervals. Although bootstrapping often works well, it does not do so in every case.

Large margin methods for structured and interdependent output variables

by Ioannis Tsochantaridis, Thorsten Joachims, Thomas Hofmann, Yasemin Altun - Journal of Machine Learning Research, 2005
Cited by 612 (12 self)
Learning general functional dependencies between arbitrary input and output spaces is one of the key challenges in computational intelligence. While recent progress in machine learning has mainly focused on designing flexible and powerful input representations, this paper addresses the complementary issue of designing classification algorithms that can deal with more complex outputs, such as trees, sequences, or sets. More generally, we consider problems involving multiple dependent output variables, structured output spaces, and classification problems with class attributes. In order to accomplish this, we propose to appropriately generalize the well-known notion of a separation margin and derive a corresponding maximum-margin formulation. While this leads to a quadratic program with a potentially prohibitive, i.e. exponential, number of constraints, we present a cutting plane algorithm that solves the optimization problem in polynomial time for a large class of problems. The proposed method has important applications in areas such as computational biology, natural language processing, information retrieval/extraction, and optical character recognition. Experiments from various domains involving different types of output spaces emphasize the breadth and generality of our approach.

Support Vector Machine Classification and Validation of Cancer Tissue Samples Using Microarray Expression Data

by Terrence S. Furey, Nello Cristianini, Nigel Duffy, David W. Bednarski, Michèl Schummer, David Haussler, 2000
Cited by 566 (1 self)
Motivation: DNA microarray experiments generating thousands of gene expression measurements are being used to gather information from tissue and cell samples regarding gene expression differences that will be useful in diagnosing disease. We have developed a new method to analyse this kind of data using support vector machines (SVMs). This analysis consists of both classification of the tissue samples, and an exploration of the data for mis-labeled or questionable tissue results. Results: We demonstrate the method in detail on samples consisting of ovarian cancer tissues, normal ovarian tissues, and other normal tissues. The dataset consists of expression experiment results for 97 802 cDNAs for each tissue. As a result of computational analysis, a tissue sample is discovered and confirmed to be wrongly labeled. Upon correction of this mistake and the removal of an outlier, perfect classification of tissues is achieved, but not with high confidence. We identify and analyse a subset of genes from the ovarian dataset whose expression is highly differentiated between the types of tissues. To show robustness of the SVM method, two previously published datasets from other types of tissues or cells are analysed. The results are comparable to those previously obtained. We show that other machine learning methods also perform comparably to the SVM on many of those datasets. Availability: The SVM software is available at http://www.cs.columbia.edu/~bgrundy/svm. Contact: booch@cse.ucsc.edu