Results 1–10 of 12
Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions, 2008
Cited by 457 (7 self)
Abstract: In this article, we give an overview of efficient algorithms for the approximate and exact nearest neighbor problem. The goal is to preprocess a dataset of objects (e.g., images) so that later, given a new query object, one can quickly return the dataset object that is most similar to the query. The problem is of significant interest in a wide variety of areas.
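The exact version of the problem described above has a simple linear-scan baseline, which is the per-query cost that preprocessing-based algorithms aim to beat. A minimal sketch (function and variable names are illustrative, not from the article):

```python
import numpy as np

def nearest_neighbor(dataset, query):
    """Exact nearest neighbor by brute-force linear scan.

    dataset: (n, d) array of n points in d dimensions.
    query:   (d,) array.
    Returns the index of the dataset point closest to the query in
    Euclidean distance. The O(n*d) per-query cost of this scan is the
    baseline that the surveyed (approximate) algorithms improve on.
    """
    dists = np.linalg.norm(dataset - query, axis=1)
    return int(np.argmin(dists))
```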
Mesh Connected Computers with Fixed and Reconfigurable Buses: Packet Routing, Sorting, and Selection
Proc. First Annual European Symposium on Algorithms, Springer-Verlag Lecture Notes in Computer Science 726, 1993
Cited by 26 (9 self)
Abstract: Mesh connected computers have become attractive models of computing because of their varied special features. In this paper we consider two variations of the mesh model: 1) a mesh with fixed buses, and 2) a mesh with reconfigurable buses. Both these models have been the subject matter of extensive previous research. We solve numerous important problems related to packet routing and sorting on these models. In particular, we provide lower bounds and very nearly matching upper bounds for the following problems on both these models: 1) routing on a linear array; and 2) k-k routing and k-k sorting on a 2D mesh for any k ≥ 12. We provide an improved algorithm for 1-1 routing and a matching sorting algorithm. In addition we present greedy algorithms for 1-1 routing, k-k routing, and k-k sorting that are better on average and supply matching lower bounds. We also show that sorting can be performed …
Approximate Nearest Neighbor: Towards Removing the Curse of Dimensionality, 2012
Cited by 16 (5 self)
Abstract: We present two algorithms for the approximate nearest neighbor problem in high-dimensional spaces. For data sets of size n living in R^d, the algorithms require space that is only polynomial in n and d, while achieving query times that are sublinear in n and polynomial in d. We also show applications to other high-dimensional geometric problems, such as the approximate minimum spanning tree. The article is based on the material from the authors' STOC'98 and FOCS'01 papers. It unifies, generalizes and simplifies the results from those papers.
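The general idea behind such sublinear-query data structures is locality-sensitive hashing: hash points so that nearby points are likely to collide, then search only the query's bucket. Below is a generic random-hyperplane LSH sketch for illustration; it is not the specific construction of this paper, and all names are illustrative:

```python
import numpy as np

class HyperplaneLSH:
    """Random-hyperplane locality-sensitive hashing index.

    A generic illustration of the LSH idea (hash so that nearby points
    tend to collide, then scan only the query's bucket); this is NOT the
    specific construction of the paper above.
    """

    def __init__(self, points, num_bits=16, seed=0):
        rng = np.random.default_rng(seed)
        self.points = points
        # each random hyperplane contributes one bit of the hash key
        self.planes = rng.standard_normal((num_bits, points.shape[1]))
        self.buckets = {}
        for idx, p in enumerate(points):
            self.buckets.setdefault(self._key(p), []).append(idx)

    def _key(self, v):
        return tuple((self.planes @ v > 0).astype(int))

    def query(self, q):
        # candidates = the query's bucket; fall back to a full scan if empty
        cand = self.buckets.get(self._key(q), range(len(self.points)))
        return min(cand, key=lambda i: np.linalg.norm(self.points[i] - q))
```

More hash bits make buckets smaller (faster queries, lower recall); real LSH schemes boost recall by querying several independent tables.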
Beyond Locality-Sensitive Hashing
Cited by 14 (2 self)
Abstract: We present a new data structure for the c-approximate near neighbor problem (ANN) in the Euclidean space. For n points in R^d, our algorithm achieves O_c(d n^ρ) query time and O_c(n^{1+ρ} + nd) space, where ρ ≤ 7/(8c^2) + O(1/c^3) + o_c(1). This is the first improvement over the result by Andoni and Indyk (FOCS 2006) and the first data structure that bypasses a locality-sensitive hashing lower bound proved by O'Donnell, Wu and Zhou (ITCS 2011). By a standard reduction we obtain a data structure for the Hamming space and ℓ1 norm with ρ ≤ 7/(8c) + O(1/c^{3/2}) + o_c(1), which is the first improvement over the result of Indyk and Motwani (STOC 1998).
The Light Bulb Problem (running title)
Abstract: In this paper, we consider the problem of correlational learning and present algorithms to determine correlated objects.
Computational Animal Theory: a Correlation-Detection Task
Abstract: Introduction. Through the process of unsupervised inductive learning, humans acquire information about the world without the assistance of a teacher. In order for learning of this type to take place, we must have an ability to detect correlations, i.e. to notice sets of attributes that frequently occur together. Clearly, humans possess this ability to some degree. For instance, we would probably be quick to notice that some friend is always late; we might also notice that it rains on every Wednesday. Of course, there must be some limits to this ability: in a roomful of people, we might not notice that two people were both wearing purple socks. This hypothetical ability creates certain problems for L.G. Valiant's theory of cognition [6]: in particular, he requires the existence of an "attentional mechanism" (AM) which filters the world around us and only allows us to "notice" a small number of attributes of any given scene. If something functionally equivalent to the AM exists, …
The Light Bulb Problem, 1989
Abstract: In this paper, we consider the problem of correlational learning and present algorithms to determine correlated objects. 1. INTRODUCTION. Correlational learning, a subclass of unsupervised learning, aims to identify statistically correlated groups of attributes. In this paper, we consider the following correlational learning problem due to L. G. Valiant (1985, 1988): we have a sequence of n random light bulbs, each of which is either on or off with equal probability at each time step. Further, we know that a certain pair of bulbs is positively correlated. The problem is to find efficient algorithms for recognizing the unique pair of light bulbs with the maximum correlation. Some preliminary results in this direction are reported in Paturi, 1988. In this paper, we consider a more general version of the basic light bulb problem. In the general version, we assume that the behavior of the bulbs is governed by some unknown probability distribution, except that the pair with the largest …
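For concreteness, the basic problem can be solved over T observed time steps by computing all pairwise empirical correlations, at O(n^2 T) cost; the algorithms in this line of work aim to beat that quadratic scan. A minimal brute-force sketch (illustrative baseline, not the paper's algorithm):

```python
import numpy as np

def most_correlated_pair(samples):
    """Brute-force baseline: find the most correlated pair of bulbs.

    samples: (T, n) array of +/-1 bulb states over T time steps.
    Returns the pair (i, j), i < j, with the largest empirical
    correlation. Cost is O(n^2 T); the light bulb literature seeks
    algorithms that are subquadratic in n.
    """
    T, _ = samples.shape
    corr = samples.T @ samples / T      # empirical pairwise correlations
    np.fill_diagonal(corr, -np.inf)     # ignore self-correlation
    i, j = np.unravel_index(np.argmax(corr), corr.shape)
    return (min(i, j), max(i, j))
```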
Compressed Matrix Multiplication
Abstract: We present a simple algorithm that approximates the product of n-by-n real matrices A and B. Let ‖AB‖_F denote the Frobenius norm of AB, and let b be a parameter determining the time/accuracy trade-off. Given 2-wise independent hash functions h1, h2: [n] → [b] and sign functions s1, s2: [n] → {−1, +1}, the algorithm works by first "compressing" the matrix product into the polynomial

  p(x) = Σ_{k=1}^{n} ( Σ_{i=1}^{n} A_{ik} s1(i) x^{h1(i)} ) ( Σ_{j=1}^{n} B_{kj} s2(j) x^{h2(j)} ).

Using the fast Fourier transform to compute polynomial multiplication, we can compute c_0, …, c_{b−1} such that Σ_i c_i x^i = (p(x) mod x^b) + (p(x) div x^b) in time Õ(n^2 + nb). An unbiased estimator of (AB)_{ij} with variance at most ‖AB‖_F^2 / b can then be computed as C_{ij} = s1(i) s2(j) c_{(h1(i)+h2(j)) mod b}. Our approach also leads to an algorithm for computing AB exactly, with high probability, in time Õ(N + nb) in the case where A and B have at most N nonzero entries, and AB has at most b nonzero entries.
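The construction in the abstract can be sketched directly. In this illustrative version, `np.convolve` stands in for the FFT-based polynomial multiplication, and the hash and sign functions are passed in explicitly rather than drawn from a 2-wise independent family:

```python
import numpy as np

def compressed_matmul(A, B, b, h1, h2, s1, s2):
    """Count-sketch approximation of AB, following the abstract above.

    h1, h2: integer arrays with values in [0, b).
    s1, s2: arrays of +/-1 signs.
    Returns an estimator est(i, j) for (AB)[i, j].
    """
    n = A.shape[0]
    coeffs = np.zeros(2 * b - 1)
    for k in range(n):
        # pa(x) = sum_i A[i,k] s1(i) x^{h1(i)};  pb(x) = sum_j B[k,j] s2(j) x^{h2(j)}
        pa = np.zeros(b)
        pb = np.zeros(b)
        np.add.at(pa, h1, s1 * A[:, k])
        np.add.at(pb, h2, s2 * B[k, :])
        # polynomial multiplication; the paper uses the FFT for O~(n^2 + nb) total
        coeffs += np.convolve(pa, pb)
    # fold: (p(x) mod x^b) + (p(x) div x^b)
    folded = coeffs[:b].copy()
    folded[: b - 1] += coeffs[b:]

    def est(i, j):
        return s1[i] * s2[j] * folded[(h1[i] + h2[j]) % b]

    return est
```

With random 2-wise independent hashes the estimator is unbiased with variance at most ‖AB‖_F^2 / b; if the buckets (h1(i)+h2(j)) mod b happen to be collision-free, it recovers AB exactly.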
MPI for Intelligent Systems
Abstract: Genome-wide association studies (GWAS) have not been able to discover strong associations between many complex human diseases and single genetic loci. Mapping these phenotypes to pairs of genetic loci is hindered by the huge number of candidates, leading to enormous computational and statistical problems. In GWAS on single nucleotide polymorphisms (SNPs), one has to consider on the order of 10^10 to 10^14 pairs, which is infeasible in practice. In this article, we give the first algorithm for 2-locus genome-wide association studies that is subquadratic in the number, n, of SNPs. The running time of our algorithm is data-dependent, but large experiments over real genomic data suggest that it scales empirically as n^{3/2}. As a result, our algorithm can easily cope with n ~ 10^7, i.e., it can efficiently search all pairs of SNPs in the human genome.