Results 1  10
of
15,980
Efficient exact setsimilarity joins
 in Proc. of the 32nd Intl. Conf. on Very Large Data Bases
, 2006
"... Given two input collections of sets, a setsimilarity join (SSJoin) identifies all pairs of sets, one from each collection, that have high similarity. Recent work has identified SSJoin as a useful primitive operator in data cleaning. In this paper, we propose new algorithms for SSJoin. Our algorithm ..."
Abstract

Cited by 142 (7 self)
 Add to MetaCart
Given two input collections of sets, a setsimilarity join (SSJoin) identifies all pairs of sets, one from each collection, that have high similarity. Recent work has identified SSJoin as a useful primitive operator in data cleaning. In this paper, we propose new algorithms for SSJoin. Our
Exact Matrix Completion via Convex Optimization
, 2008
"... We consider a problem of considerable practical interest: the recovery of a data matrix from a sampling of its entries. Suppose that we observe m entries selected uniformly at random from a matrix M. Can we complete the matrix and recover the entries that we have not seen? We show that one can perfe ..."
Abstract

Cited by 873 (26 self)
 Add to MetaCart
perfectly recover most lowrank matrices from what appears to be an incomplete set of entries. We prove that if the number m of sampled entries obeys m ≥ C n 1.2 r log n for some positive numerical constant C, then with very high probability, most n × n matrices of rank r can be perfectly recovered
Exact Sampling with Coupled Markov Chains and Applications to Statistical Mechanics
, 1996
"... For many applications it is useful to sample from a finite set of objects in accordance with some particular distribution. One approach is to run an ergodic (i.e., irreducible aperiodic) Markov chain whose stationary distribution is the desired distribution on this set; after the Markov chain has ..."
Abstract

Cited by 543 (13 self)
 Add to MetaCart
For many applications it is useful to sample from a finite set of objects in accordance with some particular distribution. One approach is to run an ergodic (i.e., irreducible aperiodic) Markov chain whose stationary distribution is the desired distribution on this set; after the Markov chain
Robust Uncertainty Principles: Exact Signal Reconstruction From Highly Incomplete Frequency Information
, 2006
"... This paper considers the model problem of reconstructing an object from incomplete frequency samples. Consider a discretetime signal and a randomly chosen set of frequencies. Is it possible to reconstruct from the partial knowledge of its Fourier coefficients on the set? A typical result of this pa ..."
Abstract

Cited by 2632 (50 self)
 Add to MetaCart
This paper considers the model problem of reconstructing an object from incomplete frequency samples. Consider a discretetime signal and a randomly chosen set of frequencies. Is it possible to reconstruct from the partial knowledge of its Fourier coefficients on the set? A typical result
Leveraging Set Relations in Exact Set Similarity Join
"... ABSTRACT Exact set similarity join, which finds all the similar set pairs from two collections of sets, is a fundamental problem with a wide range of applications. The existing solutions for set similarity join follow a filteringverification framework, which generates a list of candidate pairs thr ..."
Abstract
 Add to MetaCart
ABSTRACT Exact set similarity join, which finds all the similar set pairs from two collections of sets, is a fundamental problem with a wide range of applications. The existing solutions for set similarity join follow a filteringverification framework, which generates a list of candidate pairs
The information bottleneck method
, 1999
"... We define the relevant information in a signal x ∈ X as being the information that this signal provides about another signal y ∈ Y. Examples include the information that face images provide about the names of the people portrayed, or the information that speech sounds provide about the words spoken. ..."
Abstract

Cited by 540 (35 self)
 Add to MetaCart
about Y through a ‘bottleneck ’ formed by a limited set of codewords ˜X. This constrained optimization problem can be seen as a generalization of rate distortion theory in which the distortion measure d(x, ˜x) emerges from the joint statistics of X and Y. This approach yields an exact set of self
The Quickhull algorithm for convex hulls
 ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE
, 1996
"... The convex hull of a set of points is the smallest convex set that contains the points. This article presents a practical convex hull algorithm that combines the twodimensional Quickhull Algorithm with the generaldimension BeneathBeyond Algorithm. It is similar to the randomized, incremental algo ..."
Abstract

Cited by 713 (0 self)
 Add to MetaCart
is implemented with floatingpoint arithmetic, this assumption can lead to serious errors. We briefly describe a solution to this problem when computing the convex hull in two, three, or four dimensions. The output is a set of “thick ” facets that contain all possible exact convex hulls of the input. A variation
Construction of abstract state graphs with PVS
, 1997
"... We describe in this paper a method based on abstract interpretation which, from a theoretical point of view, is similar to the splitting methods proposed in [DGG93, Dam96] but the weaker abstract transition relation we use, allows us to construct automatically abstract state graphs paying a reasonab ..."
Abstract

Cited by 742 (10 self)
 Add to MetaCart
reasonable price. We consider a particular set of abstract states: the set of the monomials on a set of state predicates ' 1 ; :::; ' ` . The successor of an abstract state m for a transition ø of the program is the least monomial satisfied by all successors via ø of concrete states satisfying m
Localitysensitive hashing scheme based on pstable distributions
 In SCG ’04: Proceedings of the twentieth annual symposium on Computational geometry
, 2004
"... inÇÐÓ�Ò We present a novel LocalitySensitive Hashing scheme for the Approximate Nearest Neighbor Problem underÐÔnorm, based onÔstable distributions. Our scheme improves the running time of the earlier algorithm for the case of theÐnorm. It also yields the first known provably efficient approximate ..."
Abstract

Cited by 521 (8 self)
 Add to MetaCart
NN algorithm for the caseÔ�. We also show that the algorithm finds the exact near neigbhor time for data satisfying certain “bounded growth ” condition. Unlike earlier schemes, our LSH scheme works directly on points in the Euclidean space without embeddings. Consequently, the resulting query time
Stable signal recovery from incomplete and inaccurate measurements,”
 Comm. Pure Appl. Math.,
, 2006
"... Abstract Suppose we wish to recover a vector x 0 ∈ R m (e.g., a digital signal or image) from incomplete and contaminated observations y = Ax 0 + e; A is an n × m matrix with far fewer rows than columns (n m) and e is an error term. Is it possible to recover x 0 accurately based on the data y? To r ..."
Abstract

Cited by 1397 (38 self)
 Add to MetaCart
for almost any set of n coefficients provided that the number of nonzeros is of the order of n/(log m) 6 . In the case where the error term vanishes, the recovery is of course exact, and this work actually provides novel insights into the exact recovery phenomenon discussed in earlier papers. The methodology
Results 1  10
of
15,980