Results 1  10
of
1,348
Composable memory transactions
 In Symposium on Principles and Practice of Parallel Programming (PPoPP
, 2005
"... Atomic blocks allow programmers to delimit sections of code as ‘atomic’, leaving the language’s implementation to enforce atomicity. Existing work has shown how to implement atomic blocks over wordbased transactional memory that provides scalable multiprocessor performance without requiring changes ..."
Abstract

Cited by 509 (43 self)
 Add to MetaCart
repeatedly in an atomic block), (3) we use runtime filtering to detect duplicate log entries that are missed statically, and (4) we present a series of GCtime techniques to compact the logs generated by longrunning atomic blocks. Our implementation supports shortrunning scalable concurrent benchmarks
Duplicate Record Detection: A Survey
, 2007
"... Often, in the real world, entities have two or more representations in databases. Duplicate records do not share a common key and/or they contain errors that make duplicate matching a difficult task. Errors are introduced as the result of transcription errors, incomplete information, lack of standa ..."
Abstract

Cited by 448 (11 self)
 Add to MetaCart
of standard formats, or any combination of these factors. In this paper, we present a thorough analysis of the literature on duplicate record detection. We cover similarity metrics that are commonly used to detect similar field entries, and we present an extensive set of duplicate detection algorithms
Exact Matrix Completion via Convex Optimization
, 2008
"... We consider a problem of considerable practical interest: the recovery of a data matrix from a sampling of its entries. Suppose that we observe m entries selected uniformly at random from a matrix M. Can we complete the matrix and recover the entries that we have not seen? We show that one can perfe ..."
Abstract

Cited by 873 (26 self)
 Add to MetaCart
perfectly recover most lowrank matrices from what appears to be an incomplete set of entries. We prove that if the number m of sampled entries obeys m ≥ C n 1.2 r log n for some positive numerical constant C, then with very high probability, most n × n matrices of rank r can be perfectly recovered
The Power of Convex Relaxation: NearOptimal Matrix Completion
, 2009
"... This paper is concerned with the problem of recovering an unknown matrix from a small fraction of its entries. This is known as the matrix completion problem, and comes up in a great number of applications, including the famous Netflix Prize and other similar questions in collaborative filtering. In ..."
Abstract

Cited by 359 (7 self)
 Add to MetaCart
is on the order of the information theoretic limit (up to logarithmic factors). This convex program simply finds, among all matrices consistent with the observed entries, that with minimum nuclear norm. As an example, we show that on the order of nr log(n) samples are needed to recover a random n × n matrix
Analysis of a very large AltaVista query log
, 1998
"... In this paper we present an analysis of a 280 GB AltaVista Search Engine query log consisting of approximately 1 billion entries for search requests over a period of six weeks. This represents approximately 285 million user sessions, each an attempt to fill a single information need. We present an a ..."
Abstract

Cited by 195 (2 self)
 Add to MetaCart
an analysis of individual queries, query duplication, and query sessions. Furthermore we present results of a correlation analysis of the log entries, studying the interaction of terms within queries. Our data supports the conjecture that web users differ significantly from the user assumed in the standard
An Efficient DomainIndependent Algorithm for Detecting Approximately Duplicate Database Records
, 1997
"... Detecting database records that are approximate duplicates, but not exact duplicates, is an important task. Databases may contain duplicate records concerning the same realworld entity because of data entry errors, because of unstandardized abbreviations, or because of differences in the detailed sc ..."
Abstract

Cited by 215 (2 self)
 Add to MetaCart
Detecting database records that are approximate duplicates, but not exact duplicates, is an important task. Databases may contain duplicate records concerning the same realworld entity because of data entry errors, because of unstandardized abbreviations, or because of differences in the detailed
Matrix Completion with Noise
"... On the heels of compressed sensing, a remarkable new field has very recently emerged. This field addresses a broad range of problems of significant practical interest, namely, the recovery of a data matrix from what appears to be incomplete, and perhaps even corrupted, information. In its simplest ..."
Abstract

Cited by 255 (13 self)
 Add to MetaCart
that matrix completion is provably accurate even when the few observed entries are corrupted with a small amount of noise. A typical result is that one can recover an unknown n × n matrix of low rank r from just about nr log 2 n noisy samples with an error which is proportional to the noise level. We present
Realworld Data is Dirty: Data Cleansing and The Merge/Purge Problem
 DATA MINING AND KNOWLEDGE DISCOVERY
, 1998
"... The problem of merging multiple databases of information about common entities is frequently encountered in KDD and decision support applications in large commercial and government organizations. The problem we study is often called the Merge/Purge problem and is difficult to solve both in scale and ..."
Abstract

Cited by 250 (0 self)
 Add to MetaCart
and accuracy. Large repositories of data typically have numerous duplicate information entries about the same entities that are difficult to cull together without an intelligent "equational theory" that identifies equivalent items by a complex, domaindependent matching process. We have developed a
Backtracking intrusions
, 2003
"... Analyzing intrusions today is an arduous, largely manual task because system administrators lack the information and tools needed to understand easily the sequence of steps that occurred in an attack. The goal of BackTracker is to identify automatically potential sequences of steps that occurred in ..."
Abstract

Cited by 240 (11 self)
 Add to MetaCart
as honeypots. In each case, BackTracker is able to highlight effectively the entry point used to gain access to the system and the sequence of steps from that entry point to the point at which we noticed the intrusion. The logging required to support BackTracker added 9 % overhead in running time and generated
Sparsity and Incoherence in Compressive Sampling
, 2006
"... We consider the problem of reconstructing a sparse signal x 0 ∈ R n from a limited number of linear measurements. Given m randomly selected samples of Ux 0, where U is an orthonormal matrix, we show that ℓ1 minimization recovers x 0 exactly when the number of measurements exceeds m ≥ Const · µ 2 (U) ..."
Abstract

Cited by 238 (13 self)
 Add to MetaCart
) · S · log n, where S is the number of nonzero components in x 0, and µ is the largest entry in U properly normalized: µ(U) = √ n · maxk,j Uk,j. The smaller µ, the fewer samples needed. The result holds for “most ” sparse signals x 0 supported on a fixed (but arbitrary) set T. Given T, if the sign of x 0
Results 1  10
of
1,348