Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions
, 2008
Cited by 457 (7 self)
In this article, we give an overview of efficient algorithms for the approximate and exact nearest neighbor problem. The goal is to preprocess a dataset of objects (e.g., images) so that later, given a new query object, one can quickly return the dataset object that is most similar to the query. The problem is of significant interest in a wide variety of areas.
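As a point of reference for the problem statement above, the exact nearest neighbor can always be found by a linear scan over the dataset, and a returned point counts as a c-approximate answer if it is at most c times farther from the query than the true nearest neighbor. The sketch below is only a baseline illustration under those assumptions, not one of the algorithms surveyed in the article; the function names and the toy data are hypothetical.

```python
import math


def nearest_neighbor(dataset, query):
    """Exact nearest neighbor by linear scan: one distance per dataset point."""
    return min(dataset, key=lambda point: math.dist(point, query))


def is_c_approximate(dataset, query, candidate, c):
    """True if `candidate` is within c times the true nearest-neighbor distance."""
    best = math.dist(nearest_neighbor(dataset, query), query)
    return math.dist(candidate, query) <= c * best


# Hypothetical toy dataset of 2-dimensional points.
dataset = [(0.0, 0.0), (1.0, 1.0), (5.0, 2.0)]
query = (0.9, 1.2)

exact = nearest_neighbor(dataset, query)
print(exact)
print(is_c_approximate(dataset, query, (0.0, 0.0), c=2.0))
```

The scan costs time proportional to the dataset size times the dimension for every query, which is the cost that preprocessing-based methods try to avoid.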
Detecting code clones in binary executables
 In ISSTA09 submitted
, 2009
Cited by 23 (0 self)
Large software projects contain significant code duplication, mainly due to copying and pasting code. Many techniques have been developed to identify duplicated code to enable applications such as refactoring, detecting bugs, and protecting intellectual property. Because source code is often unavailable, especially for third-party software, finding duplicated code in binaries becomes particularly important. However, existing techniques operate primarily on source code, and no effective tool exists for binaries. In this paper, we describe the first practical clone detection algorithm for binary executables. Our algorithm extends an existing tree similarity framework based on clustering of characteristic vectors of labeled trees with novel techniques to normalize assembly instructions and to accurately and compactly model their structural information. We have implemented our technique and evaluated it on Windows XP system binaries totaling over 50 million assembly instructions. Results show that it is both scalable and precise: it analyzed Windows XP system binaries in a few hours and produced few false positives. We believe our technique is a practical, enabling technology for many applications dealing with binary code.
Accelerating feature based registration using the Johnson-Lindenstrauss lemma
 in: Medical Image Computing and Computer-Assisted Intervention (MICCAI 2009), volume 5761 of Lecture Notes in Computer Science
, 2009
Cited by 5 (2 self)
We introduce an efficient search strategy to substantially accelerate feature based registration. Previous feature based registration algorithms often use truncated search strategies in order to achieve small computation times. Our new accelerated search strategy is based on the realization that the search for corresponding features can be dramatically accelerated by utilizing Johnson-Lindenstrauss dimension reduction. Order of magnitude calculations for the search strategy we propose here indicate that the algorithm proposed is more than a million times faster than previously utilized naive search strategies, and this advantage in speed is directly translated into an advantage in accuracy as the fast speed enables more comparisons to be made in the same amount of time. We describe the accelerated scheme together with a full complexity analysis. The registration algorithm was applied to large transmission electron microscopy (TEM) images of neural ultrastructure. Our experiments demonstrate that our algorithm enables alignment of TEM images with increased accuracy and efficiency compared to previous algorithms.
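The dimension-reduction idea mentioned above can be illustrated with a plain Gaussian random projection, the standard construction behind the Johnson-Lindenstrauss lemma. This is a minimal sketch and not the registration algorithm from the paper: the dimensions `d` and `k`, the number of points, and the synthetic feature vectors are all assumptions chosen for illustration.

```python
import math
import random


def random_projection_matrix(d, k, rng):
    """Return a k x d matrix with i.i.d. Gaussian entries of variance 1/k."""
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]


def project(matrix, x):
    """Map a d-dimensional vector to k dimensions (matrix-vector product)."""
    return [sum(m * xj for m, xj in zip(row, x)) for row in matrix]


rng = random.Random(0)
d, k, n = 500, 200, 15  # hypothetical sizes, not taken from the paper

# Synthetic stand-ins for high-dimensional feature descriptors.
points = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
matrix = random_projection_matrix(d, k, rng)
projected = [project(matrix, p) for p in points]

# Ratio of projected distance to original distance for every pair of points.
ratios = [
    math.dist(projected[i], projected[j]) / math.dist(points[i], points[j])
    for i in range(n)
    for j in range(i + 1, n)
]
print(min(ratios), max(ratios))
```

The ratios stay close to 1, so feature comparisons can be carried out on the shorter k-dimensional vectors at a fraction of the cost per comparison.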
New LSH-based algorithm for approximate nearest neighbor
, 2005
Cited by 1 (0 self)
We present an algorithm for the c-approximate nearest neighbor problem in a d-dimensional Euclidean space, achieving query time of $O(dn^{1/c^2+o(1)})$ and space $O(dn+n^{1+1/c^2+o(1)})$.
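To make the exponents concrete, here is a worked instance for approximation factor c = 2. The value n = 10^6 is a hypothetical dataset size, and the final estimate drops the o(1) term and all constant factors, so it is an order-of-magnitude reading only.

```latex
\[
  c = 2 \;\Longrightarrow\; \frac{1}{c^2} = \frac{1}{4},
  \qquad
  \text{query time } O\!\left(d\,n^{1/4 + o(1)}\right),
  \qquad
  \text{space } O\!\left(dn + n^{5/4 + o(1)}\right).
\]
\[
  n = 10^{6}: \quad n^{1/4} = 10^{1.5} \approx 31.6,
  \qquad
  n^{5/4} = 10^{7.5} \approx 3.2 \times 10^{7}.
\]
```

Under these simplifications, the query-time bound grows like the fourth root of n for c = 2, compared with n itself for a linear scan.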
unknown title
4 Locality-sensitive hashing using stable distributions

4.1 The LSH scheme based on s-stable distributions

In this chapter, we introduce and analyze a novel locality-sensitive hashing family. The family is defined for the case where the distances are measured according to the $\ell_s$ norm, for any s ∈ [0, 2]. The hash functions are particularly simple for the case s = 2, i.e., the Euclidean norm. The new family provides an efficient solution to (approximate or exact) randomized near neighbor problem. Part of this work appeared earlier in [DIIM04].

4.1.1 s-stable distributions

Stable distributions [Zol86] are defined as limits of normalized sums of independent identically distributed variables (an alternate definition follows). The most well-known example of a stable distribution is Gaussian (or normal) distribution. However, the class is much wider; for example, it includes heavy-tailed distributions.

Definition 4.1 A distribution D over ℜ is called s-stable, if there exists p ≥ 0 such that for any n real numbers $v_1, \ldots, v_n$ and i.i.d. variables $X_1, \ldots, X_n$ with distribution D, the random variable $\sum_i v_i X_i$ has the same distribution as the variable $\left(\sum_i |v_i|^p\right)^{1/p} X$, where X is a random variable with distribution D.

It is known [Zol86] that stable distributions exist for any p ∈ (0, 2]. In particular:
- a Cauchy distribution $D_C$, defined by the density function $c(x) = \frac{1}{\pi} \frac{1}{1+x^2}$, is 1-stable
- a Gaussian (normal) distribution $D_G$, defined by the density function g(x) =