Results 1  10
of
49,855
Estimating the Support of a HighDimensional Distribution
, 1999
"... Suppose you are given some dataset drawn from an underlying probability distribution P and you want to estimate a "simple" subset S of input space such that the probability that a test point drawn from P lies outside of S is bounded by some a priori specified between 0 and 1. We propo ..."
Abstract

Cited by 769 (29 self)
 Add to MetaCart
of the weight vector in an associated feature space. The expansion coefficients are found by solving a quadratic programming problem, which we do by carrying out sequential optimization over pairs of input patterns. We also provide a preliminary theoretical analysis of the statistical performance of our
The SRtree: An Index Structure for HighDimensional Nearest Neighbor Queries
, 1997
"... Recently, similarity queries on feature vectors have been widely used to perform contentbased retrieval of images. To apply this technique to large databases, it is required to develop multidimensional index structures supporting nearest neighbor queries e ciently. The SStree had been proposed for ..."
Abstract

Cited by 438 (3 self)
 Add to MetaCart
Recently, similarity queries on feature vectors have been widely used to perform contentbased retrieval of images. To apply this technique to large databases, it is required to develop multidimensional index structures supporting nearest neighbor queries e ciently. The SStree had been proposed
Computing semantic relatedness using Wikipediabased explicit semantic analysis
 In Proceedings of the 20th International Joint Conference on Artificial Intelligence
, 2007
"... Computing semantic relatedness of natural language texts requires access to vast amounts of commonsense and domainspecific world knowledge. We propose Explicit Semantic Analysis (ESA), a novel method that represents the meaning of texts in a highdimensional space of concepts derived from Wikipedi ..."
Abstract

Cited by 547 (9 self)
 Add to MetaCart
Computing semantic relatedness of natural language texts requires access to vast amounts of commonsense and domainspecific world knowledge. We propose Explicit Semantic Analysis (ESA), a novel method that represents the meaning of texts in a highdimensional space of concepts derived from
Training Linear SVMs in Linear Time
, 2006
"... Linear Support Vector Machines (SVMs) have become one of the most prominent machine learning techniques for highdimensional sparse data commonly encountered in applications like text classification, wordsense disambiguation, and drug design. These applications involve a large number of examples n ..."
Abstract

Cited by 539 (6 self)
 Add to MetaCart
Linear Support Vector Machines (SVMs) have become one of the most prominent machine learning techniques for highdimensional sparse data commonly encountered in applications like text classification, wordsense disambiguation, and drug design. These applications involve a large number of examples n
Maxmargin Markov networks
, 2003
"... In typical classification tasks, we seek a function which assigns a label to a single object. Kernelbased approaches, such as support vector machines (SVMs), which maximize the margin of confidence of the classifier, are the method of choice for many such tasks. Their popularity stems both from the ..."
Abstract

Cited by 592 (15 self)
 Add to MetaCart
In typical classification tasks, we seek a function which assigns a label to a single object. Kernelbased approaches, such as support vector machines (SVMs), which maximize the margin of confidence of the classifier, are the method of choice for many such tasks. Their popularity stems both from
Efficient Estimation of Average Treatment Effects Using the Estimated Propensity Score
, 2000
"... We are interested in estimating the average effect of a binary treatment on a scalar outcome. If assignment to the treatment is independent of the potential outcomes given pretreatment variables, biases associated with simple treatmentcontrol average comparisons can be removed by adjusting for diff ..."
Abstract

Cited by 398 (34 self)
 Add to MetaCart
treatment variables. Thus it is possible to obtain unbiased estimates of the treatment effect without conditioning on a possibly highdimensional vector of pretreatment variables. Although adjusting for the propensity score removes all the bias, this can come at the expense of efficiency. We show that weighting
Concept Decompositions for Large Sparse Text Data using Clustering
 Machine Learning
, 2000
"... . Unlabeled document collections are becoming increasingly common and available; mining such data sets represents a major contemporary challenge. Using words as features, text documents are often represented as highdimensional and sparse vectorsa few thousand dimensions and a sparsity of 95 to 99 ..."
Abstract

Cited by 401 (27 self)
 Add to MetaCart
. Unlabeled document collections are becoming increasingly common and available; mining such data sets represents a major contemporary challenge. Using words as features, text documents are often represented as highdimensional and sparse vectorsa few thousand dimensions and a sparsity of 95
The TVtree  an index structure for highdimensional data
 VLDB Journal
, 1994
"... We propose a file structure to index highdimensionality data, typically, points in some feature space. The idea is to use only a few of the features, utilizing additional features whenever the additional discriminatory power is absolutely necessary. We present in detail the design of our tree struc ..."
Abstract

Cited by 216 (7 self)
 Add to MetaCart
We propose a file structure to index highdimensionality data, typically, points in some feature space. The idea is to use only a few of the features, utilizing additional features whenever the additional discriminatory power is absolutely necessary. We present in detail the design of our tree
Lassotype recovery of sparse representations for highdimensional data
 ANNALS OF STATISTICS
, 2009
"... The Lasso is an attractive technique for regularization and variable selection for highdimensional data, where the number of predictor variables pn is potentially much larger than the number of samples n. However, it was recently discovered that the sparsity pattern of the Lasso estimator can only ..."
Abstract

Cited by 246 (14 self)
 Add to MetaCart
The Lasso is an attractive technique for regularization and variable selection for highdimensional data, where the number of predictor variables pn is potentially much larger than the number of samples n. However, it was recently discovered that the sparsity pattern of the Lasso estimator can only
HighDimensional Data Analysis: The Curses and Blessings of Dimensionality
, 2000
"... The coming century is surely the century of data. A combination of blind faith and serious purpose makes our society invest massively in the collection and processing of data of all kinds, on scales unimaginable until recently. Hyperspectral Imagery, Internet Portals, Financial tickbytick data, an ..."
Abstract

Cited by 167 (0 self)
 Add to MetaCart
being a vector of values we measured on several variables (e.g. blood pressure, weight, height,...). In traditional statistical methodology, we assumed many observations and a few, wellchosen variables. The trend today is towards more observations but even more so, to radically larger numbers
Results 1  10
of
49,855