Results 1–7 of 7
Accelerated Large Scale Optimization by Concomitant Hashing
Abstract

Cited by 8 (1 self)
Abstract. Traditional locality-sensitive hashing (LSH) techniques aim to tackle the curse of explosive data scale by guaranteeing that similar samples are projected onto proximal hash buckets. Despite the success of LSH on numerous vision tasks like image retrieval and object matching, its potential in large-scale optimization has only recently been realized. In this paper we further advance this nascent area. We first identify two common operations known as the computational bottleneck of numerous optimization algorithms in a large-scale setting, i.e., min/max inner product. We propose a hashing scheme for accelerating min/max inner product, which exploits properties of order statistics of statistically correlated random vectors. Compared with other schemes, our algorithm exhibits improved recall at a lower computational cost. The effectiveness and efficiency of the proposed method are corroborated by theoretical analysis and several important applications. In particular, we use the proposed hashing scheme to perform approximate ℓ1-regularized least squares with dictionaries with millions of elements, a scale beyond the capability of currently known exact solvers. Nonetheless, the focus of this paper is not a new hashing scheme for the approximate nearest neighbor problem. Rather, it explores a new application of hashing techniques and proposes a general framework for accelerating a large variety of optimization procedures in computer vision.
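The bucketing pattern underlying this line of work can be illustrated with a minimal sketch. Note this is not the paper's order-statistics scheme: the code below uses plain sign random projection (SimHash), which hashes vectors by the signs of random projections so that directionally similar vectors tend to collide; exact inner products are then scored only within the colliding bucket rather than over all points. All names and parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def lsh_codes(X, planes):
    """Sign random projection: one bit per hyperplane; similar directions collide."""
    return (X @ planes.T > 0).astype(np.uint8)

def to_buckets(codes):
    # Pack each row's bit vector into a hashable tuple used as the bucket key.
    return [tuple(row) for row in codes]

d, n, bits = 64, 10_000, 8
planes = rng.normal(size=(bits, d))
data = rng.normal(size=(n, d))
query = rng.normal(size=(1, d))

data_codes = lsh_codes(data, planes)
buckets = {}
for i, key in enumerate(to_buckets(data_codes)):
    buckets.setdefault(key, []).append(i)

# Probe only the query's bucket, then score those candidates exactly.
qkey = to_buckets(lsh_codes(query, planes))[0]
candidates = buckets.get(qkey, [])
best = max(candidates, key=lambda i: float(data[i] @ query[0]))
print(f"scored {len(candidates)} candidates instead of {n}")
```

With 8 bits there are at most 256 buckets, so the exact inner product is computed over roughly 1/256 of the data; trading more bits for fewer collisions is the usual recall/cost knob.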
A novel greedy algorithm for Nyström approximation
Abstract

Cited by 7 (0 self)
The Nyström method is an efficient technique for obtaining a low-rank approximation of a large kernel matrix based on a subset of its columns. The quality of the Nyström approximation highly depends on the subset of columns used, which are usually selected using random sampling. This paper presents a novel recursive algorithm for calculating the Nyström approximation, and an effective greedy criterion for column selection. Further, a very efficient variant is proposed for greedy sampling, which works on random partitions of data instances. Experiments on benchmark data sets show that the proposed greedy algorithms achieve significant improvements in approximating kernel matrices, with minimum overhead in run time.
Large-scale SVD and Manifold Learning
Abstract

Cited by 3 (0 self)
This paper examines the efficacy of sampling-based low-rank approximation techniques when applied to large dense kernel matrices. We analyze two common approximate singular value decomposition techniques, namely the Nyström and Column sampling methods. We present a theoretical comparison between these two methods, provide novel insights regarding their suitability for various tasks and present experimental results that support our theory. Our results illustrate the relative strengths of each method. We next examine the performance of these two techniques on the large-scale task of extracting low-dimensional manifold structure given millions of high-dimensional face images. We address the computational challenges of nonlinear dimensionality reduction via Isomap and Laplacian Eigenmaps, using a graph containing about 18 million nodes and 65 million edges. We present extensive experiments on learning low-dimensional embeddings for two large face data sets: CMU-PIE (35 thousand faces) and a web data set (18 million faces). Our comparisons show that the Nyström approximation is superior to the Column sampling method for this task. Furthermore, approximate Isomap tends to perform better than Laplacian Eigenmaps on both clustering and classification with the labeled CMU-PIE data set.
Chapter 6: Large-scale Manifold Learning
Abstract
The problem of dimensionality reduction arises in many computer vision applications, where it is natural to represent images as vectors in a high-dimensional space. Manifold learning techniques extract low-dimensional structure from high-dimensional data in an unsupervised manner. These techniques typically …
Bounding the Difference Between RankRC and RankSVM and Application to Multi-Level Rare Class Kernel Ranking
Abstract
The rapid explosion in data accumulation has yielded larger and larger data mining problems. Many practical problems have intrinsically unbalanced or rare class distributions. Standard classification algorithms, which focus on overall classification accuracy, often perform poorly in these cases. Recently, Tayal et al. (2013) proposed a kernel method called RankRC for large-scale unbalanced learning. RankRC uses a ranking loss to overcome biases inherent in standard classification-based loss functions, while achieving computational efficiency by enforcing a rare class hypothesis representation. In this paper we establish a theoretical bound for RankRC by showing the equivalence between instantiating a hypothesis using a subset of training points and instantiating a hypothesis using the full training set but with the feature mapping equal to the orthogonal projection of the original mapping. This bound suggests that it is optimal to select points from the rare class first when choosing the subset of data points for a hypothesis representation. In addition, we show that for an arbitrary loss function, the Nyström kernel matrix approximation is equivalent to instantiating a hypothesis using a subset of data points. Consequently, a theoretical bound for the Nyström kernel SVM can be established based on the perturbation analysis of the orthogonal projection in …
Accelerating Chemical Similarity Search Using GPUs and Metric Embeddings, 2011
Abstract
Fifteen years ago, the advent of modern high-throughput sequencing revolutionized computational genetics with a flood of data. Today, high-throughput biochemical assays promise to make biochemistry the next data-rich domain for machine learning. However, existing computational methods, built for small analyses of about 1,000 molecules, do not scale to emerging multimillion-molecule datasets. For many algorithms, pairwise similarity comparisons between molecules are a critical bottleneck, presenting a 1,000×–1,000,000× scaling barrier. In this dissertation, I describe the design of SIML and PAPER, our GPU implementations of 2D and 3D chemical similarities, as well as SCISSORS, our metric embedding algorithm. On a model problem of interest, combining these techniques allows up to 274,000× speedup in time and up to 2.8-million-fold reduction in space while retaining excellent accuracy. I further discuss how these high-speed techniques have allowed insight into chemical shape similarity and the behavior of machine learning kernel methods in the presence of noise.
SOFTWARE ARTICLE: Interpretation and approximation tools
Abstract
Background: Markov chains are a common framework for individual-based, state- and time-discrete models in evolution. Though they played an important role in the development of basic population genetic theory, the analysis of more complex evolutionary scenarios typically involves approximation with other types of models. As the number of states increases, the big, dense transition matrices involved become increasingly unwieldy. However, advances in computational technology continue to reduce the challenges of "big data", thus giving new potential to state-rich Markov chains in theoretical population genetics. Results: Using a population genetic model based on genotype frequencies as an example, we propose a set of methods to assist in the computation and interpretation of big, dense Markov chain transition matrices. With the help of network analysis, we demonstrate how they can be transformed into clear and easily interpretable graphs, providing a new perspective even on the classic case of a randomly mating, finite population with mutation. Moreover, we describe an algorithm to save computer memory by substituting the original matrix with a sparse approximation while preserving its mathematically important properties, including a closely corresponding dominant (normalized) eigenvector. A global sensitivity analysis of the approximation results in our example shows that size reduction of more than 90% is possible without significantly affecting the basic model results. Sample implementations of our methods …
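The sparsification idea described above can be illustrated with a minimal sketch: zero out negligible transition probabilities, renormalize so the matrix stays row-stochastic, and check that the dominant (stationary) eigenvector barely moves. This is a bare thresholding scheme for illustration, not the paper's algorithm; the threshold, the Dirichlet-generated toy matrix, and the renormalization step are all assumptions.

```python
import numpy as np

def sparsify_stochastic(P, tol=1e-3):
    """Zero entries below tol, then renormalize rows so the matrix stays stochastic."""
    Q = np.where(P >= tol, P, 0.0)
    Q /= Q.sum(axis=1, keepdims=True)
    return Q

def stationary(P, iters=2000):
    """Dominant left eigenvector (stationary distribution) via power iteration."""
    v = np.full(P.shape[0], 1.0 / P.shape[0])
    for _ in range(iters):
        v = v @ P
        v /= v.sum()
    return v

rng = np.random.default_rng(2)
n = 100
P = rng.dirichlet(np.full(n, 0.1), size=n)   # row-stochastic, many near-zero entries
Q = sparsify_stochastic(P, tol=1e-3)

sparsity = 1.0 - np.count_nonzero(Q) / Q.size
diff = np.abs(stationary(P) - stationary(Q)).sum()
print(f"zeroed fraction: {sparsity:.0%}, L1 change in stationary vector: {diff:.4f}")
```

For a genuinely large state space one would store Q in a sparse format (e.g. `scipy.sparse.csr_matrix`) so memory scales with the retained entries rather than n².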