Results 1 - 10 of 1,222
Nonnegative Sparse PCA with Provable Guarantees
"... We introduce a novel algorithm to compute non-negative sparse principal components of positive semidefinite (PSD) matrices. Our algorithm comes with approximation guarantees contingent on the spectral profile of the input matrix A: the sharper the eigenvalue decay, the better the quality of the ap ..."
Cited by 1 (0 self)
We introduce a novel algorithm to compute non-negative sparse principal components of positive semidefinite (PSD) matrices. Our algorithm comes with approximation guarantees contingent on the spectral profile of the input matrix A: the sharper the eigenvalue decay, the better the qual
Efficient Distributed Topic Modeling with Provable Guarantees
"... Topic modeling for large-scale distributed web-collections requires distributed techniques that account for both computational and communication costs. We consider topic modeling under the separability assumption and develop novel computationally efficient methods that provably achieve the statisti ..."
Cited by 5 (4 self)
Topic modeling for large-scale distributed web-collections requires distributed techniques that account for both computational and communication costs. We consider topic modeling under the separability assumption and develop novel computationally efficient methods that provably achieve
Feature selection for linear SVM with provable guarantees
- Journal of Machine Learning Research
, 2015
"... We give two provably accurate feature selection techniques for the linear SVM. The algorithms run in deterministic and randomized time respectively. Our algorithms can be used in an unsupervised or supervised setting. The supervised approach is based on sampling features from support vector ..."
Cited by 2 (0 self)
We give two provably accurate feature selection techniques for the linear SVM. The algorithms run in deterministic and randomized time respectively. Our algorithms can be used in an unsupervised or supervised setting. The supervised approach is based on sampling features from support
Gapped Local Similarity Search with Provable Guarantees
"... A gapped local similarity between two sequences is a pair of fixed-length substrings, one from each sequence, that align with few mismatches and indels (insertions/deletions). We address the problem of finding all such similarities between two sequences. This i ..."
A gapped local similarity between two sequences is a pair of fixed-length substrings, one from each sequence, that align with few mismatches and indels (insertions/deletions). We address the problem of finding all such similarities between two sequences. This is a core problem in bio-sequence similarity search, as it is a variant of the basic local alignment problem [29] with edit distance as the scoring function. Edit distance is simpler than a general scoring function as it treats mismatches and indels via unit costs; nevertheless, it is important and very relevant for comparing genomic DNA sequences as discussed next.
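As a rough illustration of the scoring function this abstract refers to (edit distance with unit costs for mismatches and indels), the sketch below computes the global edit distance between two strings. It is the textbook dynamic program, not the paper's gapped local similarity search algorithm, and the function name `edit_distance` is a hypothetical choice made for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Unit-cost edit distance: each mismatch, insertion, or deletion costs 1."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca from a
                curr[j - 1] + 1,           # insert cb into a
                prev[j - 1] + (ca != cb),  # match (cost 0) or mismatch (cost 1)
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(edit_distance("ACGT", "AGT"))
```

A local similarity search would apply a score of this kind to pairs of substrings rather than to whole sequences; that step is not shown here.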
High-dimensional regression with noisy and missing data: Provable guarantees with non-convexity
, 2011
"... Although the standard formulations of prediction problems involve fully-observed and noiseless data drawn in an i.i.d. manner, many applications involve noisy and/or missing data, possibly involving dependencies. We study these issues in the context of high-dimensional sparse linear regression, and ..."
Cited by 75 (10 self)
, and propose novel estimators for the cases of noisy, missing, and/or dependent data. Many standard approaches to noisy or missing data, such as those using the EM algorithm, lead to optimization problems that are inherently non-convex, and it is difficult to establish theoretical guarantees on practical
An Approach for Selecting Tests on Extended Finite State Machines with Provable Guarantees
"... Building high confidence regression test suites to validate the changes performed during system evolution and maintenance is a challenging problem. This paper describes a formal approach that selects every test from a given test suite guaranteed to exercise a given change and discards othe ..."
Building high confidence regression test suites to validate the changes performed during system evolution and maintenance is a challenging problem. This paper describes a formal approach that selects every test from a given test suite guaranteed to exercise a given change and discards
Test Selection on Extended Finite State Machines with Provable Guarantees
"... Building high confidence regression test suites to validate new system versions is a challenging problem. A model-based approach to build a regression test suite from a ..."
Building high confidence regression test suites to validate new system versions is a challenging problem. A model-based approach to build a regression test suite from a given test suite is described. The generated test suite includes every test that will traverse a change performed to produce the new version, and consists of only such tests to reduce the testing costs. Finite state machines extended with typed variables (EFSMs) are used to model systems, and system changes are mapped to EFSM transition changes that add, delete, or replace EFSM transitions and states. Each test is a sequence of input and expected output messages with concrete parameter values over the supported data types. An invariant is formulated to characterize tests whose runtime behavior can be accurately predicted by analyzing their descriptions along with the model. Incremental procedures to efficiently evaluate the invariant and to select tests for regression are developed. Overlaps among the test descriptions are exploited to extend the approach to simultaneously select multiple tests to reduce the test selection costs. Effectiveness of the approach is demonstrated by applying it to several protocols, Web services, and model programs extracted from a popular testing benchmark. Our experimental results show that the proposed approach is economical for regression test selection in all these examples. For all these examples, the proposed approach is able to identify all tests exercising changes more efficiently than brute-force symbolic evaluation.
Wide-area cooperative storage with CFS
, 2001
"... The Cooperative File System (CFS) is a new peer-to-peer read-only storage system that provides provable guarantees for the efficiency, robustness, and load-balance of file storage and retrieval. CFS does this with a completely decentralized architecture that can scale to large systems. CFS servers pr ..."
Cited by 999 (53 self)
The Cooperative File System (CFS) is a new peer-to-peer read-only storage system that provides provable guarantees for the efficiency, robustness, and load-balance of file storage and retrieval. CFS does this with a completely decentralized architecture that can scale to large systems. CFS servers
Maximizing the Spread of Influence Through a Social Network
- In KDD
, 2003
"... Models for the processes by which ideas and influence propagate through a social network have been studied in a number of domains, including the diffusion of medical and technological innovations, the sudden and widespread adoption of various strategies in game-theoretic settings, and the effects of ..."
Cited by 990 (7 self)
the first provable approximation guarantees for efficient algorithms. Using an analysis framework based on submodular functions, we show that a natural greedy strategy obtains a solution that is provably within 63% of optimal for several classes of models; our framework suggests a general approach
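The greedy strategy mentioned in this abstract can be illustrated on a toy problem. The sketch below greedily maximizes set coverage, which is a monotone submodular objective; it is a stand-in chosen for brevity, not the paper's influence-propagation model, and the names `greedy_max_coverage`, `sets`, and `k` are hypothetical. The 63% figure matches the classical bound 1 - 1/e ≈ 0.632 for greedy maximization of a monotone submodular function under a cardinality constraint.

```python
def greedy_max_coverage(sets, k):
    """Pick up to k named sets, each time taking the largest marginal gain."""
    chosen = []       # names selected so far, in selection order
    covered = set()   # union of the selected sets
    for _ in range(min(k, len(sets))):
        candidates = [name for name in sets if name not in chosen]
        # Marginal gain = number of newly covered elements.
        best = max(candidates, key=lambda name: len(sets[name] - covered))
        if not sets[best] - covered:
            break  # no remaining set adds anything new
        chosen.append(best)
        covered |= sets[best]
    return chosen, covered


if __name__ == "__main__":
    toy = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1}}
    print(greedy_max_coverage(toy, 2))
```

In an influence-maximization setting the marginal gain would be an estimate of expected spread rather than a set-size difference; only the selection loop carries over.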
Provable Data Possession at Untrusted Stores
, 2007
"... We introduce a model for provable data possession (PDP) that allows a client that has stored data at an untrusted server to verify that the server possesses the original data without retrieving it. The model generates probabilistic proofs of possession by sampling random sets of blocks from the serv ..."
Cited by 302 (9 self)
in widely-distributed storage systems. We present two provably-secure PDP schemes that are more efficient than previous solutions, even when compared with schemes that achieve weaker guarantees. In particular, the overhead at the server is low (or even constant), as opposed to linear in the size of the data