Results 1–10 of 52
Dequantizing compressed sensing: When oversampling and non-Gaussian constraints combine
, 2009
Using the Delta Test for Variable Selection
Abstract

Cited by 20 (6 self)
Abstract. Input selection is an important consideration in all large-scale modelling problems. We propose that using an established noise variance estimator known as the Delta test as the target to minimise can provide an effective input selection methodology. Theoretical justifications and experimental results are presented.
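The Delta test referred to above has a simple closed form: half the mean squared difference between each output and the output of its nearest neighbour in input space. A minimal sketch of using it for input selection (NumPy assumed; `delta_test` and `select_inputs` are illustrative names, with brute-force neighbour search and exhaustive subset search for clarity, not efficiency):

```python
import numpy as np
from itertools import combinations

def delta_test(X, y):
    """Delta test noise-variance estimate: half the mean squared
    difference between each output and the output of its
    input-space nearest neighbour."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Pairwise squared distances in input space (brute force).
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)   # exclude self-matches
    nn = d2.argmin(axis=1)         # nearest neighbour of each point
    return 0.5 * np.mean((y - y[nn]) ** 2)

def select_inputs(X, y):
    """Exhaustive input selection: keep the variable subset whose
    Delta test value is smallest."""
    best = None
    for r in range(1, X.shape[1] + 1):
        for subset in combinations(range(X.shape[1]), r):
            dt = delta_test(X[:, subset], y)
            if best is None or dt < best[0]:
                best = (dt, subset)
    return best
```

On data where only one input carries signal, the Delta test of that input alone approaches the noise variance, while irrelevant inputs score near the output variance.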
Nearest Neighbors in High-Dimensional Data: The Emergence and Influence of Hubs
Abstract

Cited by 16 (4 self)
High dimensionality can pose severe difficulties, widely recognized as different aspects of the curse of dimensionality. In this paper we study a new aspect of the curse pertaining to the distribution of k-occurrences, i.e., the number of times a point appears among the k nearest neighbors of other points in a data set. We show that, as dimensionality increases, this distribution becomes considerably skewed and hub points emerge (points with very high k-occurrences). We examine the origin of this phenomenon, showing that it is an inherent property of high-dimensional vector space, and explore its influence on applications based on measuring distances in vector spaces, notably classification, clustering, and information retrieval.
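The skew of the k-occurrence distribution described above is easy to observe on synthetic data. A small sketch (NumPy assumed; N_k computed by brute force, skewness as the standardized third moment):

```python
import numpy as np

def k_occurrences(X, k):
    """N_k(x): number of times each point appears among the k nearest
    neighbours of the other points (Euclidean distance, brute force)."""
    sq = (X ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T   # pairwise squared distances
    np.fill_diagonal(d2, np.inf)                     # exclude self-matches
    knn = np.argsort(d2, axis=1)[:, :k]              # each row: its k NN indices
    return np.bincount(knn.ravel(), minlength=len(X))

def skewness(v):
    """Standardized third moment of a sample."""
    v = np.asarray(v, dtype=float)
    return ((v - v.mean()) ** 3).mean() / v.std() ** 3

# Hubness emerges as dimensionality grows: for i.i.d. Gaussian points the
# N_k distribution becomes markedly right-skewed in high dimensions.
rng = np.random.default_rng(0)
for dim in (3, 100):
    X = rng.normal(size=(1000, dim))
    print(dim, skewness(k_occurrences(X, 10)))
```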
Faster Retrieval with a Two-Pass Dynamic-Time-Warping Lower Bound
, 2009
Abstract

Cited by 13 (0 self)
The Dynamic Time Warping (DTW) is a popular similarity measure between time series. The DTW fails to satisfy the triangle inequality and its computation requires quadratic time. Hence, to find closest neighbors quickly, we use bounding techniques. We can avoid most DTW computations with an inexpensive lower bound (LB Keogh). We compare LB Keogh with a tighter lower bound (LB Improved). We find that LB Improved-based search is faster. As an example, our approach is 2–3 times faster on random-walk and shape time series.
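The LB Keogh bound mentioned above envelopes one series inside a warping window and charges only the parts of the other series that escape the envelope, so it never exceeds the true DTW cost. A sketch under the usual Sakoe-Chiba window, with a reference quadratic-time DTW for comparison (NumPy assumed; squared point cost):

```python
import numpy as np

def lb_keogh(q, c, w):
    """LB Keogh lower bound on DTW(q, c) with window w: sum of squared
    amounts by which q escapes the warping envelope of c."""
    n = len(c)
    U = np.array([c[max(0, i - w):i + w + 1].max() for i in range(n)])
    L = np.array([c[max(0, i - w):i + w + 1].min() for i in range(n)])
    over = np.clip(q - U, 0, None)    # parts of q above the upper envelope
    under = np.clip(L - q, 0, None)   # parts of q below the lower envelope
    return (over ** 2 + under ** 2).sum()

def dtw(q, c, w):
    """Reference quadratic-time DTW with squared point cost, window w."""
    n, m = len(q), len(c)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - w), min(m, i + w) + 1):
            cost = (q[i - 1] - c[j - 1]) ** 2
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```

In a search loop, the cheap `lb_keogh` is computed first and the expensive `dtw` only when the bound fails to prune the candidate.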
“Dequantized compressed sensing with non-Gaussian constraints,” arXiv:0902.2367v2 [math.OC]
, 2009
Abstract

Cited by 10 (3 self)
Abstract. In this paper, following the Compressed Sensing paradigm, we study the problem of recovering sparse or compressible signals from uniformly quantized measurements. We present a new class of convex optimization programs, or decoders, coined Basis Pursuit DeQuantizer of moment p (BPDQ_p), that model the quantization distortion more faithfully than the commonly used Basis Pursuit DeNoise (BPDN) program. Our decoders proceed by minimizing the sparsity of the signal to be reconstructed while enforcing a data-fidelity term of bounded ℓ_p norm, for 2 ≤ p ≤ ∞. We show that in oversampled situations the performance of the BPDQ_p decoders is significantly better than that of BPDN, with the reconstruction error due to quantization divided by √(p + 1). This reduction relies on a modified Restricted Isometry Property of the sensing matrix expressed in the ℓ_p norm (RIP_p), a property satisfied by Gaussian random matrices with high probability. We conclude with numerical experiments comparing BPDQ_p and BPDN for signal and image reconstruction problems.
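Spelled out from the abstract, the BPDQ_p decoder is the convex program below; the notation is reconstructed here, and the exact choice of the bound ε_p follows the paper, not this sketch:

```latex
\mathrm{BPDQ}_p:\qquad
\hat{x} \;=\; \operatorname*{arg\,min}_{x \in \mathbb{R}^N} \;\|x\|_1
\quad \text{s.t.} \quad \|y - \Phi x\|_p \,\le\, \epsilon_p ,
```

where y holds the quantized measurements, Φ is the sensing matrix, and ε_p bounds the ℓ_p norm of the quantization distortion; the choice p = 2 recovers the usual BPDN decoder.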
“On the difficulty of nearest neighbor search,” Int. Conf.
, 2012
Abstract

Cited by 8 (1 self)
Fast approximate nearest neighbor (NN) search in large databases is becoming popular and several powerful learning-based formulations have been proposed recently. However, not much attention has been paid to a more fundamental question: how difficult is (approximate) nearest-neighbor search in a given data set? And which data properties affect the difficulty of nearest neighbor search, and how? This paper introduces the first concrete measure, called Relative Contrast, that can be used to evaluate the influence of several crucial data characteristics such as dimensionality, sparsity, and database size simultaneously in arbitrary normed metric spaces. Moreover, we present a theoretical analysis to show how relative contrast affects the complexity of Locality Sensitive Hashing, a popular approximate NN search method. Relative contrast also provides an explanation for a family of heuristic hashing algorithms with good practical performance based on PCA. Finally, we show that most of the previous works measuring the meaningfulness or difficulty of NN search can be derived as special asymptotic cases of the proposed measure for dense vectors.
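For a single query, relative contrast as described above is the ratio of the mean distance to the database to the nearest-neighbour distance; values near 1 mean the nearest neighbour barely stands out from the rest. A single-query sketch (the paper's definition averages over queries; NumPy assumed):

```python
import numpy as np

def relative_contrast(X, q):
    """Relative contrast of query q against database X: mean distance
    to the database points divided by the nearest-neighbour distance."""
    d = np.linalg.norm(X - q, axis=1)
    return d.mean() / d.min()

# Contrast collapses toward 1 as dimensionality grows, which is one way
# to see why high-dimensional NN search is hard.
rng = np.random.default_rng(0)
for dim in (2, 500):
    X = rng.uniform(size=(2000, dim))
    q = rng.uniform(size=dim)
    print(dim, relative_contrast(X, q))
```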
A probabilistic approach to nearest neighbor classification: Naive hubness Bayesian k-nearest neighbor
 In: Proceedings of the CIKM conference (2011)
Abstract

Cited by 6 (0 self)
Most machine-learning tasks, including classification, involve dealing with high-dimensional data. It was recently shown that the phenomenon of hubness, inherent to high-dimensional data, can be exploited to improve methods based on nearest neighbors (NNs). Hubness refers to the emergence of points (hubs) that appear among the k NNs of many other points in the data, and constitute influential points for kNN classification. In this paper, we present a new probabilistic approach to kNN classification, naive hubness Bayesian k-nearest neighbor (NHBNN), which employs hubness for computing class likelihood estimates. Experiments show that NHBNN compares favorably to different variants of the kNN classifier, including probabilistic kNN (PNN), which is often used as an underlying probabilistic framework for NN classification, signifying that NHBNN is a promising alternative framework for developing probabilistic NN algorithms.
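A simplified sketch of the idea (not the paper's exact NHBNN formulation): record, for each training point, how often it occurs among the k NNs of each class, then combine those Laplace-smoothed occurrence rates naively, Bayes-style, over a query's neighbours (NumPy assumed; function names are illustrative):

```python
import numpy as np

def knn_indices(X, k):
    """Indices of the k nearest neighbours of every training point."""
    sq = (X ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    np.fill_diagonal(d2, np.inf)
    return np.argsort(d2, axis=1)[:, :k]

def fit_hubness_counts(X, y, k, n_classes):
    """N_{k,c}(x): how often x occurs among the k NNs of class-c points."""
    counts = np.zeros((len(X), n_classes))
    for i, neighbours in enumerate(knn_indices(X, k)):
        counts[neighbours, y[i]] += 1
    return counts

def predict(Xtr, ytr, counts, q, k, n_classes, lam=1.0):
    """Naive-Bayes-style combination of the neighbours' hubness rates."""
    nn = np.argsort(np.linalg.norm(Xtr - q, axis=1))[:k]
    class_counts = np.bincount(ytr, minlength=n_classes)
    log_post = np.log(class_counts / len(ytr))   # class priors
    for c in range(n_classes):
        # Laplace-smoothed estimate of P(x_i in a class-c neighbourhood).
        p = (counts[nn, c] + lam) / (class_counts[c] + n_classes * lam)
        log_post[c] += np.log(p).sum()
    return int(np.argmax(log_post))
```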
A MIREX META-ANALYSIS OF HUBNESS IN AUDIO MUSIC SIMILARITY
Abstract

Cited by 6 (1 self)
We use results from the 2011 MIREX “Audio Music Similarity and Retrieval” task for a meta-analysis of the hub phenomenon. Hub songs appear similar to an undesirably high number of other songs due to a problem of measuring distances in high-dimensional spaces. Comparing 17 algorithms, we are able to confirm that different algorithms produce very different degrees of hubness. We also show that hub songs exhibit less perceptual similarity to the songs they are close to, according to an audio similarity function, than non-hub songs. Application of the recently introduced method of “mutual proximity” is able to decisively improve this situation.
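The “mutual proximity” rescaling credited above replaces a distance d(i, j) by the probability that each endpoint considers the other at least that close, estimated from each point's own distance distribution. A minimal empirical sketch (NumPy assumed; the published method models each distance distribution with a Gaussian rather than raw empirical fractions):

```python
import numpy as np

def mutual_proximity(D):
    """Empirical mutual-proximity transform of a distance matrix D:
    MP(i, j) = P(d(i, .) > D[i, j]) * P(d(j, .) > D[i, j]),
    returned as the dissimilarity 1 - MP."""
    n = len(D)
    MP = np.empty_like(D, dtype=float)
    for i in range(n):
        for j in range(n):
            p_i = (D[i] > D[i, j]).mean()   # fraction of i's distances larger
            p_j = (D[j] > D[i, j]).mean()   # fraction of j's distances larger
            MP[i, j] = p_i * p_j
    return 1.0 - MP
```

The transform is symmetric by construction, which is part of why it counteracts hubs: a point can no longer be “close” to many others without those points also ranking it close.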
Facebrowsing: Search and navigation through comparisons
 In ITA workshop
, 2010
Abstract

Cited by 4 (0 self)
Abstract—This paper addresses the problem of finding the nearest neighbor (or one of the R nearest neighbors) of a query object in a database which is only accessible through a comparison oracle. The comparison oracle, given two reference objects and a query object, returns the reference object closest to the query object. The oracle attempts to model the behavior of human users, capable of making statements about similarity, but not of assigning meaningful numerical values to distances between objects. We develop nearest-neighbor search algorithms and analyze their performance for such oracles. Using such a comparison oracle, the best we can hope for is to obtain, for every object in the database, a ranking of the other objects according to their distance to it. The difficulty of searching using such an oracle depends on the non-homogeneities of the underlying space. We introduce the new idea of a rank-sensitive hash (RSH) function which gives the same hash value for “similar” objects based on the rank value of the objects obtained from the similarity oracle. As one application of RSH, we demonstrate that we can retrieve one of the (1 + ε)R nearest neighbors of a query point in time complexity depending on an underlying property (termed rank distortion) of the search space. We use this idea to implement a navigation system for an image database of human faces. In particular, we design a database of images that is organized adaptively based on both baseline comparisons using eigenfaces and refined using selected human input. We present a preliminary implementation of this system which seeks to minimize the number of questions asked to a (human) oracle.
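The comparison-oracle access model above is easy to mimic: the search procedure never sees distances, only answers to "which of these two is closer to the query?". A toy sketch of that model (a single elimination pass, not the paper's RSH scheme; all names are illustrative):

```python
def make_oracle(dist, q):
    """Comparison oracle: given two reference objects, reveal which is
    closer to the hidden query q; distances themselves stay hidden."""
    def oracle(a, b):
        return a if dist(a, q) < dist(b, q) else b
    return oracle

def nearest_by_comparisons(objects, oracle):
    """Find the nearest object using only pairwise comparisons
    (one elimination pass, n - 1 oracle calls)."""
    best = objects[0]
    for x in objects[1:]:
        best = oracle(best, x)
    return best

# Toy usage: scalar objects, absolute difference as the hidden metric.
objects = [0.0, 1.5, 2.9, 4.2, 7.0]
oracle = make_oracle(lambda a, b: abs(a - b), 3.2)
print(nearest_by_comparisons(objects, oracle))   # prints 2.9
```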