Results 1–10 of 10
The fastest deformable part model for object detection
In CVPR, 2014
Abstract

Cited by 12 (2 self)
This paper solves the speed bottleneck of the deformable part model (DPM) while maintaining detection accuracy on challenging datasets. Three prohibitive steps in the cascade version of DPM are accelerated: 2D correlation between the root filter and the feature map, cascade part pruning, and HOG feature extraction. For 2D correlation, the root filter is constrained to be low rank, so that the 2D correlation can be calculated by a more efficient linear combination of 1D correlations. A proximal gradient algorithm is adopted to progressively learn the low-rank filter in a discriminative manner. For cascade part pruning, a neighborhood-aware cascade is proposed to capture the dependence in neighborhood regions for aggressive pruning. Instead of explicit computation of part scores, hypotheses can be pruned by scores of neighborhoods under a first-order approximation. For HOG feature extraction, lookup tables are constructed to replace expensive calculations of orientation partition and magnitude with simpler matrix index operations. Extensive experiments show that (a) the proposed method is 4 times faster than the current fastest DPM method with similar accuracy on Pascal VOC, and (b) the proposed method achieves state-of-the-art accuracy on pedestrian and face detection tasks at frame-rate speed.
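The 2D-correlation trick in this abstract can be sketched in a few lines: a rank-1 root filter factors into an outer product, so its 2D correlation with the feature map reduces to two 1D passes. This is a minimal check of that equivalence, with made-up names and a random array standing in for a HOG feature map; the paper learns a rank-R filter discriminatively, which corresponds to summing R such pairs.

```python
import numpy as np

def corr2d(X, f):
    """Naive 'valid' 2D cross-correlation, for checking only."""
    H, W = X.shape
    h, w = f.shape
    out = np.empty((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(X[i:i + h, j:j + w] * f)
    return out

rng = np.random.default_rng(0)
X = rng.standard_normal((32, 32))   # stand-in for a HOG feature map
u = rng.standard_normal(7)          # vertical 1D component
v = rng.standard_normal(7)          # horizontal 1D component
f = np.outer(u, v)                  # rank-1 (separable) 2D root filter

direct = corr2d(X, f)               # O(k^2) multiplies per output pixel

# Two 1D correlations give the same result in O(2k) per pixel:
rows = np.apply_along_axis(np.correlate, 1, X, v, mode="valid")
sep = np.apply_along_axis(np.correlate, 0, rows, u, mode="valid")

assert np.allclose(direct, sep)
```

A rank-R filter is handled the same way, as a sum of R separable pairs, which is still far cheaper than the dense filter when R is small.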
Predicting parameters in deep learning
In Proc. NIPS, 2013
Abstract

Cited by 4 (0 self)
We demonstrate that there is significant redundancy in the parameterization of several deep learning models. Given only a few weight values for each feature, it is possible to accurately predict the remaining values. Moreover, we show that not only can the parameter values be predicted, but many of them need not be learned at all. We train several different architectures by learning only a small number of weights and predicting the rest. In the best case we are able to predict more than 95% of the weights of a network without any drop in accuracy.
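The redundancy claim can be illustrated on a toy low-rank weight matrix. The paper predicts weights from a learned, data-driven representation; this sketch cheats by reading a basis off the matrix's own SVD, and every name in it is illustrative, but it shows the same phenomenon of observing a few weights and predicting the rest.

```python
import numpy as np

rng = np.random.default_rng(1)

# A synthetic "layer" with heavy redundancy: 100x200 weights, rank 5.
r, m, n = 5, 100, 200
W = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))

# Pretend a good basis for the weight columns is known (the paper
# learns one from data; here we cheat and take it from W's SVD).
U = np.linalg.svd(W, full_matrices=False)[0][:, :r]

# Observe only 10 of the 100 entries in each column...
obs = rng.choice(m, size=10, replace=False)
coef = np.linalg.lstsq(U[obs], W[obs], rcond=None)[0]

# ...and predict the remaining 90% of the weights.
W_hat = U @ coef
assert np.allclose(W_hat, W)
```

With a rank-5 column space, any 10 generic observations per column over-determine the 5 coefficients, so the reconstruction is exact here; real weight matrices are only approximately redundant, which is why the paper reports accuracy rather than exact recovery.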
Toward Fast Transform Learning
2013
Abstract

Cited by 3 (2 self)
The dictionary learning problem aims at finding a dictionary of atoms that best represents an image according to a given objective. The most usual objective consists of representing an image or a class of images sparsely. Most algorithms performing dictionary learning iteratively estimate the dictionary and a sparse representation of images using this dictionary. Dictionary learning has led to many state-of-the-art algorithms in image processing. However, its numerical complexity restricts its use to atoms with a small support, since computations using the constructed dictionaries require too many resources to be deployed in large-scale applications. To alleviate these issues, this paper introduces a new strategy to learn dictionaries composed of atoms obtained as a composition of K convolutions with S-sparse kernels. The dictionary update step associated with this strategy is a non-convex optimization problem. We reformulate the problem in order to reduce the number of its irrelevant stationary points and introduce a Gauss-Seidel type algorithm, referred to as the Alternative Least Square Algorithm, for its resolution. The search space of the considered optimization problem is of dimension KS, which is typically smaller than the size of the target atom and much smaller than the size of the image. The complexity of the algorithm is linear with regard to the size of the image. Our experiments show that we are able to approximate with very high accuracy many atoms such as modified DCT, curvelets, sinc functions, or cosines when K is large (say K = 10). We also argue empirically that, perhaps surprisingly, the algorithm generally converges to a global minimum for large values of K and S.
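Why composing K convolutions with S-sparse kernels is cheap follows from the associativity of convolution: applying the K small sparse kernels in sequence equals applying the one dense atom they compose to. A minimal numeric check under that structure (kernel sizes and counts are arbitrary choices, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(2)
K, S = 10, 3                        # K kernels, each with S nonzeros

def sparse_kernel(length=15):
    h = np.zeros(length)
    h[rng.choice(length, size=S, replace=False)] = rng.standard_normal(S)
    return h

kernels = [sparse_kernel() for _ in range(K)]

# Materialize the atom: the composition of the K convolutions.
atom = np.array([1.0])              # start from a delta
for h in kernels:
    atom = np.convolve(atom, h)     # atom support grows to 141 samples

x = rng.standard_normal(64)

# Applying the atom as K sparse convolutions costs K*S = 30 multiplies
# per sample, instead of the atom's full 141-sample support.
y_fast = x.copy()
for h in kernels:
    y_fast = np.convolve(y_fast, h)

y_direct = np.convolve(x, atom)
assert np.allclose(y_fast, y_direct)
```

The atom lives in a KS = 30-dimensional search space yet spans 141 samples, which is the size/complexity trade-off the abstract describes.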
Sparse space-time deconvolution for Calcium image analysis
Abstract

Cited by 1 (0 self)
We describe a unified formulation and algorithm to find an extremely sparse representation for Calcium image sequences in terms of cell locations, cell shapes, spike timings, and impulse responses. Solution of a single optimization problem yields cell segmentations and activity estimates that are on par with the state of the art, without the need for heuristic pre- or post-processing. Experiments on real and synthetic data demonstrate the viability of the proposed method.
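The representation the abstract names (locations, shapes, spike timings, impulse responses) can be sketched as a forward model. This single-cell toy with illustrative values shows how few numbers generate a whole movie; the paper solves the much harder inverse problem of recovering them jointly from data.

```python
import numpy as np

T, H, W = 100, 8, 8

# One cell: a spatial footprint, two spike times, and an exponentially
# decaying calcium impulse response (all values illustrative).
h = np.exp(-np.arange(30) / 10.0)            # impulse response
spikes = np.zeros(T)
spikes[[15, 60]] = 1.0                       # extremely sparse in time
trace = np.convolve(spikes, h)[:T]           # fluorescence over time

footprint = np.zeros((H, W))
footprint[3:5, 3:5] = 1.0                    # compact cell shape

# The movie is footprint x trace: a handful of parameters explain
# all T*H*W pixels, which is the sparsity the method exploits.
movie = trace[:, None, None] * footprint[None, :, :]

assert movie.shape == (T, H, W)
assert np.isclose(movie[20, 3, 3], h[5])     # decay 5 frames after a spike
```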
Dynamic Texture Recognition via Orthogonal Tensor Dictionary Learning
Abstract
Dynamic textures (DTs) are video sequences with stationary properties, which exhibit repetitive patterns over space and time. This paper investigates a sparse-coding-based approach to characterizing local DT patterns for recognition. Owing to the high dimensionality of DT sequences, existing dictionary learning algorithms are not suitable for our purpose due to their high computational costs as well as poor scalability. To overcome these obstacles, we propose a structured tensor dictionary learning method for sparse coding, which learns a dictionary structured with orthogonality and separability. The proposed method is very fast and more scalable to high-dimensional data than existing ones. In addition, based on the proposed dictionary learning method, a DT descriptor is developed, which has better adaptivity, discriminability, and scalability than existing approaches. These advantages are demonstrated by experiments on multiple datasets.
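Why orthogonality plus separability makes coding cheap can be seen on a toy 2D patch: the separable dictionary acts as two small multiplies instead of one large Kronecker-structured multiply, and orthogonality makes the best k-sparse code a simple threshold. This is an illustrative sketch under those two structural assumptions, not the paper's learning algorithm.

```python
import numpy as np

rng = np.random.default_rng(4)

def rand_orth(n):
    return np.linalg.qr(rng.standard_normal((n, n)))[0]

A, B = rand_orth(8), rand_orth(8)    # separable, orthogonal factors
Y = rng.standard_normal((8, 8))      # a 2D patch, kept as a matrix

# Equivalent unstructured dictionary on vectorized patches: 64 x 64.
D = np.kron(B, A)
c_vec = D.T @ Y.flatten(order="F")

# Separable form: two 8x8 multiplies replace one 64x64 multiply.
C = A.T @ Y @ B
assert np.allclose(C.flatten(order="F"), c_vec)

# With an orthogonal dictionary, the best k-sparse code is just the
# k largest-magnitude coefficients -- no iterative pursuit needed.
k = 6
C_sparse = np.where(np.abs(C) >= np.sort(np.abs(C), axis=None)[-k], C, 0)
assert np.count_nonzero(C_sparse) == k
```

The same identity, vec(AᵀYB) = (B ⊗ A)ᵀ vec(Y), extends mode-by-mode to higher-order tensors, which is where the scalability advantage comes from.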
TILDE: A Temporally Invariant Learned DEtector
Abstract
We introduce a learning-based approach to detect repeatable keypoints under drastic imaging changes of weather and lighting conditions, to which state-of-the-art keypoint detectors are surprisingly sensitive. We first identify good keypoint candidates in multiple training images taken from the same viewpoint. We then train a regressor to predict a score map whose maxima are those points, so that they can be found by simple non-maximum suppression. As there are no standard datasets to test the influence of these kinds of changes, we created our own, which we will make publicly available. We show that our method significantly outperforms state-of-the-art methods in such challenging conditions, while still achieving state-of-the-art performance on the untrained standard Oxford dataset.
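The final step of the pipeline, reading keypoints off the regressor's score map, is plain non-maximum suppression. A minimal dense sketch (function name, radius, and threshold are made up, and the score map here is synthetic rather than a regressor output):

```python
import numpy as np

def nms_keypoints(score, radius=2, thresh=0.5):
    """Return (row, col) positions that are local maxima of the
    score map above a threshold -- simple dense NMS."""
    H, W = score.shape
    kps = []
    for i in range(radius, H - radius):
        for j in range(radius, W - radius):
            patch = score[i - radius:i + radius + 1,
                          j - radius:j + radius + 1]
            if score[i, j] >= thresh and score[i, j] == patch.max():
                kps.append((i, j))
    return kps

# A toy score map with two clear peaks, as the regressor might predict.
score = np.zeros((16, 16))
score[4, 5] = 1.0
score[11, 10] = 0.8
assert nms_keypoints(score) == [(4, 5), (11, 10)]
```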
Speeding up Convolutional Neural Networks with Low Rank Expansions (Jaderberg, Vedaldi, and Zisserman)
Abstract
The focus of this paper is speeding up the evaluation of convolutional neural networks. While delivering impressive results across a range of computer vision and machine learning tasks, these networks are computationally demanding, limiting their deployability. Convolutional layers generally consume the bulk of the processing time, so in this work we present two simple schemes for drastically speeding up these layers. This is achieved by exploiting cross-channel or filter redundancy to construct a low-rank basis of filters that are rank-1 in the spatial domain. Our methods are architecture agnostic and can be easily applied to existing CPU and GPU convolutional frameworks for tuneable speedup performance. We demonstrate this with a real-world network designed for scene text character recognition, showing a possible 2.5× speedup with no loss in accuracy, and a 4.5× speedup with less than a 1% drop in accuracy, still achieving state-of-the-art results on standard benchmarks.
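The simplest instance of "rank-1 in the spatial domain" is approximating one spatial filter by a truncated SVD, whose rank-1 terms are exactly separable filters. The paper additionally exploits cross-channel redundancy and learns the basis by optimization; this sketch only shows the spatial part, on a random filter.

```python
import numpy as np

rng = np.random.default_rng(5)
f = rng.standard_normal((5, 5))          # a learned, full-rank 5x5 filter

# Truncated SVD gives the best rank-r approximation (Eckart-Young);
# each term s[k] * outer(U[:, k], Vt[k]) is a separable filter.
U, s, Vt = np.linalg.svd(f)
r = 2
f_r = sum(s[k] * np.outer(U[:, k], Vt[k]) for k in range(r))

# Applying f_r as r pairs of 1D convolutions costs 2*r*5 = 20 multiplies
# per pixel instead of 25 for the dense filter; larger filters gain more.
assert np.linalg.matrix_rank(f_r) == r

# The residual is exactly the energy in the discarded singular values.
assert np.isclose(np.linalg.norm(f - f_r),
                  np.sqrt(np.sum(s[r:] ** 2)))
```

Learned CNN filters are typically far closer to low rank than this random example, which is why the paper's speedups come with little accuracy loss.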
Separable Cosparse Analysis Operator Learning (published by EURASIP)
Abstract
Having a sparse representation for a certain class of signals has many applications in data analysis, image processing, and other research fields. Among sparse representations, the cosparse analysis model has recently gained increasing interest. Many signals exhibit a multi-dimensional structure, e.g. images or three-dimensional MRI scans. Most data analysis and learning algorithms use vectorized signals and thereby do not account for this underlying structure; the drawback of ignoring the inherent structure is a dramatic increase in computational cost. We propose an algorithm for learning a cosparse analysis operator that adheres to the pre-existing structure of the data and thus allows for a very efficient implementation. This is achieved by enforcing a separable structure on the learned operator. Our learning algorithm is able to deal with multi-dimensional data of arbitrary order. We evaluate our method on volumetric data, using three-dimensional MRI scans as an example.
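The cost of ignoring structure can be checked directly: applying a per-dimension operator to the unvectorized signal gives the same analysis coefficients as the Kronecker-structured operator on the vectorized signal, at a fraction of the work and memory. All names in this 2D sketch are illustrative; the same identity applies per mode for higher-order data.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 16
O1 = rng.standard_normal((n, n))   # analysis operator for dimension 1
O2 = rng.standard_normal((n, n))   # analysis operator for dimension 2
X = rng.standard_normal((n, n))    # 2D signal, kept in its natural shape

# Separable analysis: two n x n multiplies, O(n^3) work, O(n^2) memory.
Z = O1 @ X @ O2.T

# Vectorized equivalent: one n^2 x n^2 operator, O(n^4) work and
# memory -- 256 x 256 here, and prohibitive for real volumes.
Omega = np.kron(O2, O1)
z = Omega @ X.flatten(order="F")

assert np.allclose(Z.flatten(order="F"), z)
```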
Supervised by
2013
Abstract
Learned image features can provide great accuracy in many computer vision tasks. However, when the convolution filters used to learn image features are numerous and not separable, feature extraction becomes computationally demanding and impractical to use in real-world situations. In this thesis work, a method is developed for learning a small number of separable filters to approximate an arbitrary non-separable filter bank. In this approach, separable filters are learned by grouping the arbitrary filters into a tensor and optimizing a tensor decomposition problem. The separable filter learning with tensor decomposition is general and can be applied to generic filter banks to reduce the computational burden of convolutions without a loss in performance. Moreover, the proposed approach is orders of magnitude faster than the approach of a very recent paper based on ℓ1-norm minimization [34].
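The speed argument rests on linearity: if every filter in a bank is a linear combination of a few shared separable filters, the whole bank can be evaluated with just those few separable correlations followed by cheap mixing. This checks that identity under the assumed structure; the thesis obtains the shared factors by tensor decomposition, whereas here they are fixed at random.

```python
import numpy as np

def corr2d(X, f):
    """Naive 'valid' 2D cross-correlation, for checking only."""
    H, W = X.shape
    h, w = f.shape
    out = np.empty((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(X[i:i + h, j:j + w] * f)
    return out

rng = np.random.default_rng(7)
R, N, d = 4, 32, 5
a = rng.standard_normal((R, d))
b = rng.standard_normal((R, d))
c = rng.standard_normal((N, R))          # per-filter mixing weights

# A bank of N filters built from R shared separable outer products.
bank = np.einsum("nr,ri,rj->nij", c, a, b)

X = rng.standard_normal((20, 20))

# Direct evaluation: N dense 2D correlations.
direct = np.stack([corr2d(X, bank[k]) for k in range(N)])

# Shared-basis evaluation: only R separable correlations, then cheap
# linear combinations -- the saving the thesis targets.
basis = np.stack([corr2d(X, np.outer(a[r], b[r])) for r in range(R)])
shared = np.einsum("nr,rij->nij", c, basis)

assert np.allclose(direct, shared)
```

For a real learned bank the decomposition is only approximate, so R trades accuracy against speed.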
Multiscale Centerline (submitted to IEEE Transactions on Pattern Analysis and Machine Intelligence)
Abstract
Finding the centerline and estimating the radius of linear structures is a critical first step in many applications, ranging from road delineation in 2D aerial images to modeling blood vessels, lung bronchi, and dendritic arbors in 3D biomedical image stacks. Existing techniques rely either on filters designed to respond to ideal cylindrical structures or on classification techniques. The former tend to become unreliable when the linear structures are very irregular, while the latter often have difficulty distinguishing centerline locations from neighboring ones, thus losing accuracy. We solve this problem by reformulating centerline detection in terms of a regression problem. We first train regressors to return the distances to the closest centerline in scale-space, and we apply them to the input images or volumes. The centerlines and the corresponding scales then correspond to the regressors' local maxima, which can be easily identified. We show that our method outperforms state-of-the-art techniques on various 2D and 3D datasets. Moreover, our approach is very generic and also performs well on contour detection: we show an improvement over recent contour detection algorithms on the BSDS500 dataset.