Results 1–6 of 6
Sketching as a tool for numerical linear algebra
 Foundations and Trends in Theoretical Computer Science
"... ar ..."
(Show Context)
Provably Correct Active Sampling Algorithms for Matrix Column Subset Selection with Missing Data ∗
, 2015
"... We consider the problem of matrix column subset selection, which selects a subset of columns from an input matrix such that the input can be well approximated by the span of the selected columns. Column subset selection has been applied to numerous realworld data applications such as population gen ..."
Abstract

Cited by 1 (1 self)
We consider the problem of matrix column subset selection, which selects a subset of columns from an input matrix such that the input can be well approximated by the span of the selected columns. Column subset selection has been applied to numerous real-world data applications such as population genetics summarization, electronic circuit testing, and recommendation systems. In many applications the complete data matrix is unavailable and one needs to select representative columns by inspecting only a small portion of the input matrix. In this paper we propose the first provably correct column subset selection algorithms for partially observed data matrices. Our proposed algorithms exhibit different merits and drawbacks in terms of statistical accuracy, computational efficiency, sample complexity, and sampling schemes, which provides a useful exploration of the trade-offs between these desired properties for column subset selection. The proposed methods employ the idea of feedback-driven sampling and are inspired by several sampling schemes previously introduced for low-rank matrix approximation tasks [DMM08, FKV04, DV06, KS14]. Our analysis shows that two of the proposed algorithms enjoy a relative error bound, which is preferred for column subset selection and matrix approximation purposes. We also demonstrate, through both theoretical and empirical analysis, the power of feedback-driven sampling compared to uniform random sampling on input matrices with highly correlated columns.
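To make the sampling idea concrete, here is a minimal sketch of norm-based column sampling in the style of the [FKV04] scheme the abstract cites. This is a generic illustration on a fully observed matrix, not the paper's partially-observed, feedback-driven algorithm; the matrix sizes and seed are arbitrary.

```python
import numpy as np

def norm_sample_columns(A, k, seed=None):
    # Sample k distinct columns with probability proportional to squared
    # column norms -- the [FKV04]-style scheme the abstract cites; a
    # generic sketch, not the paper's missing-data algorithm.
    rng = np.random.default_rng(seed)
    p = np.sum(A**2, axis=0)
    p = p / p.sum()
    return rng.choice(A.shape[1], size=k, replace=False, p=p)

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 5)) @ rng.standard_normal((5, 100))  # rank-5 input
idx = norm_sample_columns(A, k=10, seed=0)
C = A[:, idx]
# Project A onto the span of the selected columns and measure relative error.
err = np.linalg.norm(A - C @ np.linalg.pinv(C) @ A) / np.linalg.norm(A)
```

Because the test matrix has exact rank 5, ten generically chosen columns span its full column space and the projection error is essentially zero; on noisy data the error instead tracks the best rank-k approximation.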
Reduced model for the crash simulation
"... ABSTRACT The car crash simulation contains of many complexities: the geometrical and material nonlinearities, the contact management and the numerical dispersion. In order to represent well all local physical phenomenon (plasticity, spotweld crack…), a crash model needs a fine mesh (from 5 to 20 M ..."
Abstract
Car crash simulation involves many complexities: geometrical and material nonlinearities, contact management, and numerical dispersion. In order to represent all local physical phenomena well (plasticity, spot-weld cracking, ...), a crash model needs a fine mesh (from 5 to 20 million finite elements for a whole vehicle). The crash solver uses an explicit algorithm and must take small time steps because of the Courant condition. Consequently, a single car crash simulation is expensive (up to 24 hours), which matters particularly in the context of automotive engineering: we are interested in a reduced model that cuts the total cost of an optimization study. Our aim is not the reduction of a single simulation (whose dimension is only the product of time and space) but that of the parametric domain (the product of time, space, and design parameters). More precisely, we want a reduced model not only to reconstruct an already-completed simulation but also to estimate a new simulation interpolated between existing simulations in a Design of Experiments. Recently, many teams have studied reduced models via Reduced Basis methods, POD, and CUR decomposition; a new algorithm [4
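The POD reduction the abstract mentions can be sketched in a few lines: the reduced basis is the set of top left singular vectors of a snapshot matrix whose columns are simulation states. This is a minimal illustration of the general technique, not the authors' reduced crash model; the snapshot dimensions and rank are invented for the example.

```python
import numpy as np

def pod_basis(snapshots, r):
    # Proper Orthogonal Decomposition: the reduced basis consists of the
    # top-r left singular vectors of the snapshot matrix (each column is
    # one simulation state). A generic sketch, not the paper's method.
    U, _, _ = np.linalg.svd(snapshots, full_matrices=False)
    return U[:, :r]

rng = np.random.default_rng(1)
S = rng.standard_normal((200, 3)) @ rng.standard_normal((3, 40))  # 3 underlying modes
Phi = pod_basis(S, r=3)
S_red = Phi @ (Phi.T @ S)   # reconstruct the snapshots from reduced coordinates
err = np.linalg.norm(S - S_red) / np.linalg.norm(S)
```

With three true modes and a rank-3 basis the reconstruction is exact up to round-off; in practice the truncation rank trades reconstruction error against the size of the reduced model.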
The Nyström Method
, 2014
"... Many data mining and machine learning algorithms involve matrix decomposition, matrix inverse and matrix determinant; and some methods are based on lowrank matrix approximation. The Big Data phenomenon brings new challenges and opportunities to machine learning and data mining. ..."
Abstract
Many data mining and machine learning algorithms involve matrix decomposition, matrix inversion, and matrix determinants, and some methods are based on low-rank matrix approximation. The Big Data phenomenon brings new challenges and opportunities to machine learning and data mining.
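For context, the Nyström method this survey covers approximates a positive semidefinite matrix K from a few of its columns as K ≈ C W⁺ Cᵀ, where C holds the sampled ("landmark") columns and W is the corresponding principal submatrix. A minimal sketch, with an invented rank-4 Gram matrix as test data:

```python
import numpy as np

def nystrom(K, idx):
    # Nystrom approximation of a PSD matrix: K ~= C @ pinv(W) @ C.T,
    # with C = K[:, idx] the landmark columns and W the landmark block.
    C = K[:, idx]
    W = C[idx, :]
    return C @ np.linalg.pinv(W) @ C.T

rng = np.random.default_rng(2)
X = rng.standard_normal((100, 4))
K = X @ X.T                        # rank-4 PSD Gram matrix
K_hat = nystrom(K, np.arange(8))   # use the first 8 columns as landmarks
err = np.linalg.norm(K - K_hat) / np.linalg.norm(K)
```

Since K has exact rank 4 and the eight landmarks generically capture its range, the approximation here is exact up to round-off; for full-rank kernels the quality depends on how the landmarks are chosen, which is a central topic of the survey.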
Provable Deterministic Leverage Score Sampling ∗
"... We explain theoretically a curious empirical phenomenon: “Approximating a matrix by deterministically selecting a subset of its columns with the corresponding largest leverage scores results in a good lowrank matrix surrogate”. To obtain provable guarantees, previous work requires randomized sampli ..."
Abstract
We explain theoretically a curious empirical phenomenon: “Approximating a matrix by deterministically selecting a subset of its columns with the corresponding largest leverage scores results in a good low-rank matrix surrogate.” To obtain provable guarantees, previous work requires randomized sampling of the columns with probabilities proportional to their leverage scores. In this work, we provide a novel theoretical analysis of deterministic leverage score sampling. We show that such deterministic sampling can be provably as accurate as its randomized counterparts if the leverage scores follow a moderately steep power-law decay. We support this power-law assumption by providing empirical evidence that such decay laws are abundant in real-world data sets. We then demonstrate empirically the performance of deterministic leverage score sampling, which often matches or outperforms state-of-the-art techniques.
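The deterministic rule the abstract describes can be sketched directly: the rank-r leverage score of column j is the squared norm of the j-th column of the top-r right singular vectors, and one simply keeps the k largest. A minimal illustration on an invented low-rank matrix, not the paper's full analysis:

```python
import numpy as np

def top_leverage_columns(A, k, r):
    # Rank-r leverage score of column j: squared norm of the j-th column
    # of the top-r right singular vectors. Keep the k largest scores
    # deterministically, as the abstract describes.
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    scores = np.sum(Vt[:r, :]**2, axis=0)
    return np.argsort(scores)[::-1][:k]

rng = np.random.default_rng(3)
A = rng.standard_normal((60, 4)) @ rng.standard_normal((4, 80))   # rank-4 input
idx = top_leverage_columns(A, k=6, r=4)
C = A[:, idx]
err = np.linalg.norm(A - C @ np.linalg.pinv(C) @ A) / np.linalg.norm(A)
```

On this exactly rank-4 example any six generic columns reconstruct A; the paper's contribution is showing when the deterministic top-k choice provably works on real data, namely under a moderately steep power-law decay of the scores.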