Results 1-10 of 2,279,466
Efficient exact set-similarity joins
 in Proc. of the 32nd Intl. Conf. on Very Large Data Bases
, 2006
"... Given two input collections of sets, a set-similarity join (SSJoin) identifies all pairs of sets, one from each collection, that have high similarity. Recent work has identified SSJoin as a useful primitive operator in data cleaning. In this paper, we propose new algorithms for SSJoin. Our algorithm ..."
Abstract

Cited by 133 (7 self)
Exact Matrix Completion via Convex Optimization
, 2008
"... We consider a problem of considerable practical interest: the recovery of a data matrix from a sampling of its entries. Suppose that we observe m entries selected uniformly at random from a matrix M. Can we complete the matrix and recover the entries that we have not seen? We show that one can perfe ..."
Abstract

Cited by 860 (27 self)
perfectly recover most low-rank matrices from what appears to be an incomplete set of entries. We prove that if the number m of sampled entries obeys m ≥ C n^{1.2} r log n for some positive numerical constant C, then with very high probability, most n × n matrices of rank r can be perfectly recovered
Exact Sampling with Coupled Markov Chains and Applications to Statistical Mechanics
, 1996
"... For many applications it is useful to sample from a finite set of objects in accordance with some particular distribution. One approach is to run an ergodic (i.e., irreducible aperiodic) Markov chain whose stationary distribution is the desired distribution on this set; after the Markov chain has ..."
Abstract

Cited by 548 (13 self)
Kodaira-Spencer theory of gravity and exact results for quantum string amplitudes
 Commun. Math. Phys
, 1994
Abstract

Cited by 545 (60 self)
We develop techniques to compute higher loop string amplitudes for twisted N = 2 theories with ĉ = 3 (i.e. the critical case). An important ingredient is the discovery of an anomaly at every genus in decoupling of BRST trivial states, captured to all orders by a master anomaly equation. In a particular realization of the N = 2 theories, the resulting string field theory is equivalent to a topological theory in six dimensions, the Kodaira–Spencer theory, which may be viewed as the closed string analog of the Chern–Simons theory. Using the mirror map this leads to computation of the ‘number’ of holomorphic curves of higher genus curves in Calabi–Yau manifolds. It is shown that topological amplitudes can also be reinterpreted as computing corrections to superpotential terms appearing in the effective 4d theory resulting from compactification of standard 10d superstrings on the corresponding N = 2 theory. Relations with c = 1 strings are also pointed out.
Robust Uncertainty Principles: Exact Signal Reconstruction From Highly Incomplete Frequency Information
, 2006
"... This paper considers the model problem of reconstructing an object from incomplete frequency samples. Consider a discrete-time signal and a randomly chosen set of frequencies. Is it possible to reconstruct the signal from the partial knowledge of its Fourier coefficients on that set? A typical result of this pa ..."
Abstract

Cited by 2599 (51 self)
A Threshold of ln n for Approximating Set Cover
 JOURNAL OF THE ACM
, 1998
"... Given a collection F of subsets of S = {1, ..., n}, set cover is the problem of selecting as few as possible subsets from F such that their union covers S, and max k-cover is the problem of selecting k subsets from F such that their union has maximum cardinality. Both these problems are NP-har ..."
Abstract

Cited by 778 (5 self)
Mining Association Rules between Sets of Items in Large Databases
 In: Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, Washington DC (USA)
, 1993
Abstract

Cited by 3260 (17 self)
We are given a large database of customer transactions. Each transaction consists of items purchased by a customer in a visit. We present an efficient algorithm that generates all significant association rules between items in the database. The algorithm incorporates buffer management and novel estimation and pruning techniques. We also present results of applying this algorithm to sales data obtained from a large retailing company, which shows the effectiveness of the algorithm.
The information bottleneck method
 University of Illinois
, 1999
"... We define the relevant information in a signal x ∈ X as being the information that this signal provides about another signal y ∈ Y. Examples include the information that face images provide about the names of the people portrayed, or the information that speech sounds provide about the words spoken. ..."
Abstract

Cited by 545 (38 self)
about Y through a ‘bottleneck’ formed by a limited set of codewords X̃. This constrained optimization problem can be seen as a generalization of rate distortion theory in which the distortion measure d(x, x̃) emerges from the joint statistics of X and Y. This approach yields an exact set of self
A standardized set of 260 pictures: Norms for name agreement, image agreement, familiarity, and visual complexity
 JOURNAL OF EXPERIMENTAL PSYCHOLOGY: HUMAN LEARNING AND MEMORY
, 1980
"... In this article we present a standardized set of 260 pictures for use in experiments investigating differences and similarities in the processing of pictures and words. The pictures are black-and-white line drawings executed according to a set of rules that provide consistency of pictorial represent ..."
Abstract

Cited by 615 (1 self)