
## The Importance of Encoding Versus Training with Sparse Coding and Vector Quantization (2011)

Citations: 145 (7 self)

### Citations

1858 | Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories
- Lazebnik, Schmid, et al.
- 2006
Citation Context: ... Bellevue, WA, USA, 2011. Copyright 2011 by the author(s)/owner(s). ... higher level image representations. For instance, the K-means algorithm is often used in “visual word models” (Csurka et al., 2004; Lazebnik et al., 2006) to train a dictionary of exemplar low-level descriptors that are then used to define a mapping (an “encoding”) of the descriptors into a new feature space. More recently, machine learning research h... |
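
The snippet above describes the "visual word model" recipe: K-means learns a dictionary of exemplar descriptors, and new descriptors are then encoded by hard assignment to their nearest centroid. A minimal NumPy sketch under that reading (the helper names `kmeans_dictionary` and `encode` are mine, not from the paper):

```python
import numpy as np

def kmeans_dictionary(X, k, iters=20, seed=0):
    """Learn k exemplar descriptors (centroids) with plain K-means."""
    rng = np.random.default_rng(seed)
    D = X[rng.choice(len(X), k, replace=False)]        # init from data
    for _ in range(iters):
        # assign each descriptor to its nearest centroid
        dist = ((X[:, None, :] - D[None, :, :]) ** 2).sum(-1)
        labels = dist.argmin(1)
        for j in range(k):                             # recompute centroids
            if np.any(labels == j):
                D[j] = X[labels == j].mean(0)
    return D

def encode(x, D):
    """One-hot 'visual word' encoding: nearest centroid wins."""
    code = np.zeros(len(D))
    code[((D - x) ** 2).sum(1).argmin()] = 1.0
    return code

descriptors = np.random.rand(500, 128)   # e.g. SIFT-like vectors
D = kmeans_dictionary(descriptors, k=16)
```

Soft-assignment variants (discussed in the Agarwal & Triggs and van Gemert entries below) replace the one-hot step with graded similarities.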

921 | A fast learning algorithm for deep belief networks - Hinton, Osindero, et al. |

820 | Independent component analysis: Algorithms and applications. Neural Networks
- Hyvärinen, Oja
- 2000
Citation Context: ...e patches or image descriptors harvested from unlabeled data. When learning from raw pixels, we extract 6 pixel square patches, yielding a bank of vectors that are then normalized and ZCA whitened (Hyvärinen & Oja, 2000) (retaining full variance). If we are learning from SIFT descriptors, we simply take single descriptors to form a bank of 128-dimensional vectors. Given the batch of input vectors, x^(i) ∈ R^n, an u... |
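
The preprocessing described above (center, then ZCA whiten while retaining full variance) can be sketched as follows; the function name `zca_whiten` and the small regularizer `eps` are my own choices, not from the paper:

```python
import numpy as np

def zca_whiten(X, eps=1e-5):
    """ZCA-whiten a bank of patch vectors (one patch per row):
    mean-center, rotate into the eigenbasis of the covariance,
    rescale to unit variance, and rotate back so features stay
    close to the original pixel space."""
    X = X - X.mean(axis=0)                             # center each feature
    cov = np.cov(X, rowvar=False)                      # feature covariance
    U, S, _ = np.linalg.svd(cov)                       # eigendecomposition
    W = U @ np.diag(1.0 / np.sqrt(S + eps)) @ U.T      # ZCA transform
    return X @ W

# e.g. 10000 six-pixel-square grayscale patches -> 36-dim vectors
patches = np.random.rand(10000, 36)
white = zca_whiten(patches)
```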

481 | Linear spatial pyramid matching using sparse coding for image classification
- Yang, Yu, et al.
- 2009
Citation Context: ... better features than those learned with VQ methods. One alternative to VQ that has served in this role is sparse coding, which has consistently yielded better results on benchmark recognition tasks (Yang et al., 2009; Boureau et al., 2010). A natural question is whether this higher performance is the result of learning a better dictionary for representing the structure of the data, or whether sparse codes are sim... |

430 | Locality-constrained linear coding for image classification - Wang, Yang, et al. - 2010 |

381 | Greedy layer-wise training of deep networks
- Bengio, Lamblin, et al.
- 2006
Citation Context: ...ve for the optimal D in (2). 3. Sparse RBMs and sparse auto-encoders (RBM, SAE): In some of our experiments, we train sparse RBMs (Hinton et al., 2006) and sparse auto-encoders (Ranzato et al., 2007; Bengio et al., 2006), both using a logistic sigmoid nonlinearity g(Wx + b). These algorithms yield a set of weights W and biases b. To obtain the dictionary, D, we simply discard the biases and take D = W^⊤, then norma... |
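
The dictionary-extraction step quoted above (discard the biases, take D = W^⊤, then normalize) is a one-liner; a minimal sketch, with my own helper name `weights_to_dictionary`:

```python
import numpy as np

def weights_to_dictionary(W):
    """Turn learned feature weights W (hidden units x inputs) into a
    dictionary D: discard the biases, transpose, and normalize each
    column of D to unit length."""
    D = W.T.copy()
    return D / np.linalg.norm(D, axis=0, keepdims=True)

W = np.random.randn(100, 36)   # e.g. 100 hidden units over 36-dim patches
D = weights_to_dictionary(W)
```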

251 | Learning methods for generic object recognition with invariance to pose and lighting
- LeCun, Huang, et al.
- 2004
Citation Context: ...e a dictionary with d = 6000 basis vectors, we can achieve 81.5% accuracy—the best known result on CIFAR. 4.2. Experiments on NORB: We also perform experiments on the NORB (jittered-cluttered) dataset (LeCun et al., 2004). Each 108x108 image includes 2 gray stereo channels. We resize the images to 96x96 pixels and average-pool over a 5x5 grid. We train on the first 2 folds of training data (58320 examples), and test ... |
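
The average-pooling step mentioned above (summarize a 96x96 feature map over a 5x5 grid) can be sketched as below; the function name is mine, and the sketch simply trims any remainder when the side length does not divide evenly:

```python
import numpy as np

def grid_average_pool(fmap, grid=5):
    """Average-pool a (height, width, channels) feature map over a
    grid x grid partition, trimming edge rows/columns that do not
    fit evenly. Returns a (grid, grid, channels) summary."""
    h, w, c = fmap.shape
    gh, gw = h // grid, w // grid
    trimmed = fmap[:gh * grid, :gw * grid]
    pooled = trimmed.reshape(grid, gh, grid, gw, c)
    return pooled.mean(axis=(1, 3))

fmap = np.random.rand(96, 96, 8)      # e.g. encoder outputs per pixel
summary = grid_average_pool(fmap)     # (5, 5, 8) pooled features
```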

239 | What is the best multi-stage architecture for object recognition?
- Jarrett, Kavukcuoglu, et al.
- 2009
Citation Context: ...olumns of D with normalized vectors sampled randomly from amongst the x^(i). 5. Random weights (R): It has also been shown that completely random weights can perform surprisingly well in some tasks (Jarrett et al., 2009; Saxe et al., 2010). Thus, we have also tried filling the columns of D with vectors sampled from a unit normal distribution (subsequently normalized to unit length). After running any of the above tr... |

222 | Learning midlevel features for recognition
- Boureau, Bach, et al.
- 2010
Citation Context: ...an those learned with VQ methods. One alternative to VQ that has served in this role is sparse coding, which has consistently yielded better results on benchmark recognition tasks (Yang et al., 2009; Boureau et al., 2010). A natural question is whether this higher performance is the result of learning a better dictionary for representing the structure of the data, or whether sparse codes are simply better non-linear ... |

208 | An analysis of single-layer networks in unsupervised feature learning, AISTATS - Coates, Lee, et al. - 2011 |

154 | Rectified linear units improve Restricted Boltzmann Machines
- Nair, Hinton
- 2010
Citation Context: ...l., 2008), where a feed-forward network is trained explicitly to mimic sparse coding. It has also become popular as the non-linearity in various deep learning architectures (Kavukcuoglu et al., 2010; Nair & Hinton, 2010; Krizhevsky, 2010), and is often referred to as a “shrinkage” function for its role in regularization and sparse coding algorithms (Gregor & LeCun, 2010). Thus, we are by no means the first to observ... |

124 | Sparse feature learning for deep belief networks - Ranzato, Boureau, et al. - 2007 |

122 | Kernel codebooks for scene categorization - van Gemert, Geusebroek, et al. |

105 | Coordinate descent algorithms for lasso penalized regression - Wu, Lange |

71 | Hyperfeatures: Multilevel Local Coding for Visual Recognition
- Agarwal, Triggs
- 2006
Citation Context: ...ding scheme, it has been shown that soft encodings (e.g., using Gaussian RBFs) yield better features even when hard assignment was used during training (van Gemert et al., 2008; Boureau et al., 2010; Agarwal & Triggs, 2006). In our experiments, we will exploit the ability to “mix and match” training and encoding algorithms in this way to analyze the contributions of each ... |

66 | Fast inference in sparse coding algorithms with applications to object recognition - Kavukcuoglu, Ranzato, et al. - 2008 |

50 | Learning fast approximations of sparse coding
- Gregor, LeCun
- 2010
Citation Context: ...arning architectures (Kavukcuoglu et al., 2010; Nair & Hinton, 2010; Krizhevsky, 2010), and is often referred to as a “shrinkage” function for its role in regularization and sparse coding algorithms (Gregor & LeCun, 2010). Thus, we are by no means the first to observe the usefulness of this particular activation function. In our work, however, we will show that such a nonlinearity on its own is consistently able to c... |
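
The "shrinkage" nonlinearity discussed above can be sketched as a feed-forward soft-threshold encoder: project inputs onto the dictionary and zero out responses below a threshold. This is a minimal reading of the idea (the paper also uses a polarity-split variant, omitted here; the function name and threshold value are mine):

```python
import numpy as np

def soft_threshold_encode(X, D, alpha=0.5):
    """Soft-threshold encoder: compute inner products d_j^T x for
    each input (rows of X) and clip activations below alpha to zero."""
    Z = X @ D                          # responses, one row per input
    return np.maximum(0.0, Z - alpha)

# toy example: identity dictionary over 3 dimensions
D = np.eye(3)
X = np.array([[0.2, 0.9, -1.0],
              [0.6, 0.4,  0.7]])
codes = soft_threshold_encode(X, D, alpha=0.5)
```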

43 | On random weights and unsupervised feature learning
- Saxe, Koh, et al.
Citation Context: ...lized vectors sampled randomly from amongst the x^(i). 5. Random weights (R): It has also been shown that completely random weights can perform surprisingly well in some tasks (Jarrett et al., 2009; Saxe et al., 2010). Thus, we have also tried filling the columns of D with vectors sampled from a unit normal distribution (subsequently normalized to unit length). After running any of the above training procedures, ... |
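
The random-weights baseline (R) described above is simple to reproduce; a sketch, with my own function name and a seed added for repeatability:

```python
import numpy as np

def random_dictionary(n, d, seed=0):
    """Random-weights baseline: fill the columns of D with draws from
    a unit normal distribution, then normalize each column to unit
    length."""
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((n, d))
    return D / np.linalg.norm(D, axis=0, keepdims=True)

D = random_dictionary(108, 1600)   # e.g. 1600 atoms over 108-dim patches
```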

42 | Improved local coordinate coding using local tangents
- Yu, Zhang
- 2010
Citation Context (CIFAR accuracy, Train / Encoder):

| Train / Encoder | Accuracy |
| --- | --- |
| … / SC | 78.8% |
| SC / T | 78.9% |
| OMP-1 / SC | 78.8% |
| OMP-1 / T | 79.4% |
| OMP-10 / T | 80.1% |
| OMP-1 / T (d = 6000) | 81.5% |
| (Coates et al., 2011), 1600 features | 77.9% |
| (Coates et al., 2011), 4000 features | 79.6% |
| Improved LCC (Yu & Zhang, 2010) | 74.5% |
| Conv. DBN (Krizhevsky, 2010) | 78.9% |
| Deep NN (Ciresan et al., 2011) | 80.49% |

Caption (truncated): 1600 entries from whitened, 6 by 6 pixel color image patches (108-dimensional vectors), using sparse coding (SC), orthogonal matching pursuit (OMP) with k = 1, 2, 5, 10, sparse RBMs (RBM), spars...

38 | Evaluation of pooling operations in convolutional architectures for object recognition
- Scherer, Müller, et al.
- 2010
Citation Context: ...RB jittered-cluttered dataset. All numbers are percent accuracy.

| Train \ Encoder | Natural | SC (λ = 1) | T (α = 0.5) |
| --- | --- | --- | --- |
| R | 91.9 | 93.8 | 93.1 |
| RP | 92.8 | 95.0 | 93.6 |
| SC (λ = 1) | 94.1 | 94.1 | 93.5 |
| OMP-1 | 90.9 | 94.2 | 92.6 |

Comparison: Conv.Net (Scherer et al., 2010) 94.4%; SVM-Conv.Net (Huang & LeCun, 2006) 94.1%; ReLU RBM (Nair & Hinton, 2010) 84.8%. ...tively. This suggests that the strength of sparse coding on CIFAR comes not from the learned basis functions, but ...

35 | Convolutional Deep Belief Networks on CIFAR-10
- Krizhevsky
- 2010
Citation Context: ...ed-forward network is trained explicitly to mimic sparse coding. It has also become popular as the non-linearity in various deep learning architectures (Kavukcuoglu et al., 2010; Nair & Hinton, 2010; Krizhevsky, 2010), and is often referred to as a “shrinkage” function for its role in regularization and sparse coding algorithms (Gregor & LeCun, 2010). Thus, we are by no means the first to observe the usefulness o... |

20 | On the difference between orthogonal matching pursuit and orthogonal least squares
- Blumensath, Davies
- 2007
Citation Context: ...nd ||s^(i)||_0 ≤ k, ∀i, where ||s^(i)||_0 is the number of non-zero elements in s^(i). In this case, the codes s^(i) are computed (approximately) using Orthogonal Matching Pursuit (Pati et al., 1993; Blumensath & Davies, 2007) to compute codes with at most k non-zeros (which we refer to as “OMP-k”). For a single input x^(i), OMP-k begins with s^(i) = 0 and at each iteration greedily selects an element of s^(i) to be made... |
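
The OMP-k procedure quoted above can be sketched as follows. This is a simplified reading (the function name is mine, and it assumes D has unit-norm columns): start from s = 0, greedily add the dictionary column most correlated with the residual, and refit the active coefficients by least squares at each step.

```python
import numpy as np

def omp_k(x, D, k):
    """Orthogonal Matching Pursuit with at most k non-zeros: pick the
    column of D most correlated with the residual, refit the active
    coefficients by least squares (the 'orthogonal' step), repeat."""
    s = np.zeros(D.shape[1])
    support = []
    residual = x.astype(float).copy()
    for _ in range(k):
        j = int(np.argmax(np.abs(D.T @ residual)))   # most correlated atom
        if j not in support:
            support.append(j)
        coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        s[:] = 0.0
        s[support] = coef
        residual = x - D @ s
    return s

D = np.eye(4)                        # orthonormal toy dictionary
x = np.array([0.0, 3.0, 0.0, 0.0])
s = omp_k(x, D, k=1)                 # at most one non-zero code entry
```

With k = 1 and unit-norm columns this reduces to picking the single best-matching atom, which is why OMP-1 behaves like a soft form of vector quantization.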

14 | Sparse coding of sensory inputs. Current opinion in neurobiology
- Olshausen, Field
- 2004
Citation Context: ...ut data. This increases the chances that a few basis vectors will be near to an input, yielding a large activation that is useful for identifying the location of the input on the data manifold later (Olshausen & Field, 2004; Yu et al., 2009). This explains why vector quantization is quite capable of competing with more complex algorithms: it simply ensures that there is at least one dictionary entry near any densely pop... |

1 | High-performance neural networks for visual object classification
- Ciresan, Meier, et al.
- 2011
Citation Context: ... 80.1%; OMP-1 / T (d = 6000): 81.5%; (Coates et al., 2011), 1600 features: 77.9%; (Coates et al., 2011), 4000 features: 79.6%; Improved LCC (Yu & Zhang, 2010): 74.5%; Conv. DBN (Krizhevsky, 2010): 78.9%; Deep NN (Ciresan et al., 2011): 80.49%. 1600 entries from whitened, 6 by 6 pixel color image patches (108-dimensional vectors), using sparse coding (SC), orthogonal matching pursuit (OMP) with k = 1, 2, 5, 10, sparse RBMs (RBM), spars... |