Some Greedy Learning Algorithms for Sparse Regression and Classification with Mercer Kernels (2002)

by Prasanth B. Nair, Andy J. Keane, Carla Brodley, Andrea Danyluk
Journal of Machine Learning Research
http://www.jmlr.org/papers/volume3/nair02a/nair02a.ps.gz

Abstract:

We present some greedy learning algorithms for building sparse nonlinear regression and classification models from observational data using Mercer kernels. Our objective is to develop efficient numerical schemes for reducing the training and runtime complexities of kernel-based algorithms applied to large datasets. In the spirit of Natarajan's greedy algorithm (Natarajan, 1995), we iteratively minimize the L2 loss function subject to a specified constraint on the degree of sparsity required of the final model until a specified stopping criterion is reached. We discuss various greedy criteria for basis selection and numerical schemes for improving the robustness and computational efficiency. Subsequently, algorithms based on residual minimization and thin QR factorization are presented for constructing sparse regression and classification models. During the course of the incremental model construction, the algorithms are terminated using model selection principles such as the minimum description length (MDL) and Akaike's information criterion (AIC). Finally, experimental results on benchmark data are presented to demonstrate the competitiveness of the algorithms developed in this paper.
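
The general idea in the abstract (greedy basis selection under an L2 loss, stopped by an information criterion) can be illustrated with a minimal sketch. It is not the paper's algorithm: the function names, the Gaussian kernel, the residual-correlation selection rule, and the particular AIC formula are all assumptions made for illustration, and the weights are refit with a dense least-squares solve rather than the incremental thin QR updates the abstract mentions.

```python
import numpy as np

def gaussian_kernel(X, Z, width=1.0):
    """Gram matrix of a Gaussian (RBF) kernel between the rows of X and Z."""
    sq_dist = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * width ** 2))

def greedy_sparse_fit(K, y, max_basis=20):
    """Greedily pick columns of K to fit y in the least-squares sense.

    Hypothetical helper: at each step the column most correlated with the
    current residual is added, all weights are refit by least squares, and
    the loop stops when an AIC-style score no longer improves.
    """
    n = len(y)
    selected, weights = [], np.zeros(0)
    residual = y.copy()
    best_aic = np.inf
    col_norms = np.linalg.norm(K, axis=0)
    for _ in range(max_basis):
        # Greedy criterion (assumed): normalized correlation with the residual.
        scores = np.abs(K.T @ residual) / col_norms
        if selected:
            scores[selected] = -np.inf  # never pick the same column twice
        j = int(np.argmax(scores))
        trial = selected + [j]
        # Refit all weights on the trial basis set (dense solve, not QR updating).
        w, *_ = np.linalg.lstsq(K[:, trial], y, rcond=None)
        r = y - K[:, trial] @ w
        rss = float(r @ r)
        # AIC for Gaussian noise, up to an additive constant (assumed form).
        aic = n * np.log(rss / n + 1e-12) + 2 * len(trial)
        if aic >= best_aic:
            break  # adding another basis function no longer pays for itself
        selected, weights, residual, best_aic = trial, w, r, aic
    return selected, weights

# Usage on synthetic data: a noisy sinc curve sampled at 80 points.
rng = np.random.default_rng(0)
X = np.linspace(-5.0, 5.0, 80).reshape(-1, 1)
y = np.sinc(X[:, 0]) + 0.05 * rng.standard_normal(80)
K = gaussian_kernel(X, X, width=1.0)
selected, weights = greedy_sparse_fit(K, y, max_basis=20)
```

The resulting model predicts at new points `Xnew` with `gaussian_kernel(Xnew, X[selected]) @ weights`, so its runtime cost scales with the number of selected basis functions rather than with the full training set.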

Citations

1004 Experiments with a new boosting algorithm – Schapire - 1996
718 Pattern recognition and neural networks – Ripley - 1996
543 Additive logistic regression: a statistical view of boosting – Friedman, Hastie, et al.
441 Atomic decomposition by basis pursuit – Chen, Donoho, et al. - 1999
204 Sparse bayesian learning and the relevance vector machine – Tipping
176 Greedy function approximation: A gradient boosting machine – Friedman
172 New Support Vector Algorithms – Schölkopf, Smola, et al.
155 Regularization networks and support vector machines – Evgeniou, Pontil, et al. - 2000
149 An equivalence between sparse approximation and Support Vector Machines – Girosi - 1998
132 Parallel preconditioning with sparse approximate inverses – Grote, Huckle - 1997
123 Using the Nyström Method to Speed Up Kernel Machines – Williams, Seeger - 2001
94 Sparse greedy matrix approximation for machine learning – Smola, Schölkopf - 2000
83 Reorthogonalization and stable algorithms for updating the Gram-Schmidt QR factorization – Daniel, Gragg, et al. - 1976
75 Matching Pursuit in a Time-Frequency Dictionary – Mallat, Zhang - 1993
71 Approximate inverse preconditioners via sparse-sparse iterations – Chow - 1998
63 The Nature of Statistical Learning – Vapnik - 1996
59 Sparse approximate solutions to linear systems – Natarajan - 1995
54 Sparse greedy gaussian process regression – Smola, Bartlett - 2001
37 Approximate inverse techniques for block-partitioned matrices – Chow, Saad - 1997
31 On selecting models for nonlinear time series – Judd, Mees - 1995
22 Further analysis of the data by Akaike’s information criterion and the finite corrections. Commun. Stat. Theory Methods A7:13–26 – Sugiura - 1978
17 Adaptive greedy techniques for approximate solution of large RBF systems – Schaback, Wendland
14 On the optimality of the Backward Greedy Algorithm for the subset selection problem – Couvreur, Bresler - 2000
13 Orthogonal least squares learning for radial basis function networks – Chen, Cowan, et al. - 1991
12 Comparison of basis selection methods – Adler, Rao, et al. - 1996
5 Boosting with the L2 loss: regression and classification – Bühlmann, Yu - 2001
5 Local regularization assisted orthogonal least squares regression – Chen - 2001
5 Algorithm 686: FORTRAN Subroutines for Updating the QR Decomposition – Reichel, Gragg - 1990
2 On learning functions from noise-free and noisy examples via Occam's razor – Natarajan - 1999
1 Model selection and the principle of minimum description length – H - 2001
1 Boosting with the L2 loss: regression and classification. Research Report No. 98, Seminar für Statistik – Bühlmann, Yu - 2001