66 citations found.
Kearns, M., Mansour, Y., Ng, A., and Ron, D. (1997). An experimental and theoretical comparison of model selection methods. Machine Learning, 27(1):7-50.

This paper is cited in the following contexts:

Discriminative, Generative and Imitative Learning - Jebara (2002)

....estimates that are more likely to agree with future data and are based on a measure of the model capacity. This form of regularized ERM has been called structural risk minimization (SRM), where the capacity is measured in terms of the so-called Vapnik-Chervonenkis dimension [196, 195, 197, 35, 109]. SRM will not perform as well on the training data as ERM but should generalize better to future data. The risk or expected loss (for samples outside of the training set) is then bounded above by the empirical risk plus a term that depends only on the size of training set, T and the ....

M. Kearns, Y. Mansour, A. Ng, and D. Ron. An experimental and theoretical comparison of model selection methods. Machine Learning, 27(1):7--50, 1997.


On Generalization Bounds, Projection Profile, and Margin.. - Garg, Har-Peled, Roth

....used to establish generalization bounds which seem to be more informative in high dimensional learning problems. The approach is motivated by recent works [GR01, AV99] that argue that some high dimensional learning problems are naturally constrained in ways that make them, effectively, low dimensional. In these cases, although learning is done in a high dimension, ....

(Footnote: Several statistical methods can be used to ensure the robustness of the empirical error estimate [KMNR97]. However, these typically require training on even less data, and do not contribute to understanding the domain.)

M. Kearns, Y. Mansour, A. Y. Ng, and D. Ron. An experimental and theoretical comparison of model selection methods. Machine Learning, 27:7--50, 1997.


Towards Robust Model Selection using Estimation and Approximation Error Bounds

....model, poses rather difficult computational issues (see for example [22]). We note, however, that in spite of these difficulties, the resampling based methods are probably the most widely used approaches to loss estimation and model selection. Moreover, there has been some recent theoretical work [9] indicating their superiority over methods based on distribution free upper bounds. At this point it should be clear to the reader that all the classical approaches to model selection described above suffer from some malady. This problem prompts us to introduce a new model selection criterion, ....

M. Kearns, Y. Mansour, A. Y. Ng and D. Ron. "An Experimental and Theoretical Comparison of Model Selection Methods", in Proceedings of the seventh workshop on Computational Learning Theory, ACM Press,


Using a Sample-dependent Coding Scheme for Two-Part MDL - Verbeek

....can lead to poor results in applying Rissanen's Minimum Description Length (MDL) principle [Ris89]. The MDL principle is one of the many known model selection methods in the field of machine learning, statistics or inductive inference. We analyze the experimental results presented in [KMNR97] and provide a method to avoid the overfitting. We do so by using a different coding scheme than in [KMNR97]. 1 Introduction Rissanen's Minimum Description Length (MDL) principle [Ris89] is one of the many known model selection methods in the field of machine learning, statistics or inductive ....

....The MDL principle is one of the many known model selection methods in the field of machine learning, statistics or inductive inference. We analyze the experimental results presented in [KMNR97] and provide a method to avoid the overfitting. We do so by using a different coding scheme than in [KMNR97]. 1 Introduction Rissanen's Minimum Description Length (MDL) principle [Ris89] is one of the many known model selection methods in the field of machine learning, statistics or inductive inference. The general idea of MDL can be explained as follows: We want to model some regularities or ....

[Article contains additional citation context not shown here]

M. Kearns, Y. Mansour, A. Ng, and D. Ron. An Experimental and Theoretical Comparison of Model Selection Methods. Machine Learning, 27:7--50, 1997.


Change-Point Estimation Using New Minimum Message Length .. - Fitzgibbon, Dowe.. (2002)   (1 citation)

....problem with Gaussian segments [4], [11, chapter 9], [5, 6]. They have derived MML formulas for stating the change point locations to an optimal precision independently of the segment parameters. The same method has been used [7] for the problem of finding change points in noisy binary sequences [14], where it compared favourably with Akaike's Information Criterion (AIC), Schwarz's Bayesian Information Criterion (BIC), an MDL motivated metric of Kearns et al. [14] and a more correct version of Minimum Description Length [7]. We apply the MML68 approximation to the binomial problem in this ....

....of the segment parameters. The same method has been used [7] for the problem of finding change points in noisy binary sequences [14], where it compared favourably with Akaike's Information Criterion (AIC), Schwarz's Bayesian Information Criterion (BIC), an MDL motivated metric of Kearns et al. [14] and a more correct version of Minimum Description Length [7]. We apply the MML68 approximation to the binomial problem in this paper. Assuming that the true change point is uniformly distributed in some range of width, R, we encode the data using the point estimate # at the centre of this ....

Kearns, M., Mansour, Y., Ng, A.Y., Ron, D.: An experimental and theoretical comparison of model selection methods. Machine Learning 27 (1997) 7--50


Rademacher and Gaussian Complexities: Risk Bounds and.. - Bartlett, Mendelson   (27 citations)

....a model selection scheme critically depends on how well the error bounds match the true error (see [2]). There is theoretical and experimental evidence that error bounds involving a fixed complexity penalty (that is, a penalty that does not depend on the training data) cannot be universally effective [4]. Recently, several authors have considered alternative notions of the complexity of a function class: the Rademacher and Gaussian complexities (see [2, 5, 7, 7, 13]). Definition 1. Let be a probability distribution on a set X and let X_1, ..., X_n be independent samples selected according ....

Michael J. Kearns, Yishay Mansour, Andrew Y. Ng, and Dana Ron. An experimental and theoretical comparison of model selection methods. Machine Learning, 27:7-50, 1997.


Cross Validation and MLP Architecture Selection - Andersen, Martinez (1999)

....This study, which applies CV to 14 different real world problems, utilized 74 Unix workstations running continuously over a period of approximately two and a half months. The studies in the literature which specifically examine the performance of CV and compare it with that of other methods [8], [10], [3] analyze performance using only a few (1 or 2) data sets, and so cannot be considered conclusive. A realistic evaluation of the performance of CV based MLP architecture selection on real world problems, including strengths and weaknesses, needs to be established. This paper also examines the ....

....and did not look at any real world problem domains, and so these cannot be considered conclusive. Another paper by Kearns et al. found that CV performs significantly better than Minimum Description Length (MDL) and Guaranteed Risk Minimization (GRM) [11] on the intervals model selection problem [3]. Unfortunately, the empirical results in this paper were also limited to a single type of artificial data, and did not explore any real world problem domains. Schaffer has also studied CV in [7] and [8]. CV is also employed in stopped training, weight decay, network construction algorithms, and ....

[Article contains additional citation context not shown here]

Kearns, Michael, Yishay Mansour, Andrew Ng, and Dana Ron (1997), "An Experimental and Theoretical Comparison of Model Selection Methods," Machine Learning, vol 27, pp 7-50.


Generalization Bounds for Linear Learning Algorithms - Garg, Har-Peled, Roth

....in a high dimension, generalization ought to depend on the true, lower dimensionality of the problem. We first show that when the high dimensional data can be classified using a linear classifier, the data (and the separator) can be projected down to a lower dimensional space with quantifiable penalty on the resulting performance; the performance will depend on the ....

(Footnote 1: Several statistical methods can be used to ensure the robustness of the empirical error estimate [5]. However, these typically require training on even less data, and do not contribute to understanding generalization.)

M. Kearns, Y. Mansour, A. Y. Ng, and D. Ron. An experimental and theoretical comparison of model selection methods. Machine Learning, 27:7--50, 1997.


Algorithmic Stability and Sanity-Check Bounds for Leave-One-Out .. - Kearns, Ron (1997)   (42 citations)  Self-citation (Kearns Ron)

No context found.

M. J. Kearns, Y. Mansour, A. Ng, and D. Ron. An experimental and theoretical comparison of model selection methods. In Proceedings of the Eighth Annual ACM Workshop on Computational Learning Theory, pages 21--30, 1995. To Appear in Machine Learning, COLT95 Special Issue.


A Bound on the Error of Cross Validation Using the Approximation.. - Kearns (1996)   (16 citations)  Self-citation (Kearns)

No context found.

M. Kearns, Y. Mansour, A. Ng, and D. Ron. An experimental and theoretical comparison of model selection methods. In Proceedings of the Eighth Annual ACM Conference on Computational Learning Theory, 1995.


The Maximum-Margin Approach to Learning Text Classifiers -.. - Joachims (2000)   (17 citations)

No context found.

Kearns, M., Mansour, Y., Ng, A., and Ron, D. (1997). An experimental and theoretical comparison of model selection methods. Machine Learning, 27(1):7-50.


Journal of Machine Learning Research 3 (2002) 463-482.. - Risk Bounds And

No context found.

Michael J. Kearns, Yishay Mansour, Andrew Y. Ng, and Dana Ron. An experimental and theoretical comparison of model selection methods. Machine Learning, 27:7--50, 1997.


The Evolution Of Genetic Representations And Modular Adaptation - Toussaint (2003)   (4 citations)

No context found.

Kearns, M., Y. Mansour, A. Y. Ng, & D. Ron (1995). An experimental and theoretical comparison of model selection methods. In Proceedings of the Workshop on Computational Learning Theory (COLT 1995). Morgan Kaufmann.


Nonparametric Regularization of Decision Trees - Scheffer (2000)

No context found.

M. Kearns, Y. Mansour, A. Ng, and D. Ron. An experimental and theoretical comparison of model selection methods. Machine Learning Journal, 27:7-50, 1997.


Subspace Analysis and Optimization for AAM Based Face Alignment - Zhao, Chen, Li, Bu (2004)

No context found.

M. J. Kearns, Y. Mansour, A. Y. Ng, and D. Ron. An experimental and theoretical comparison of model selection methods. Machine Learning, 27:7--50, 1997.


Expected Error Analysis for Model Selection - Scheffer, Joachims (1999)   (8 citations)

No context found.

M. Kearns, Y. Mansour, A. Ng, and D. Ron. An experimental and theoretical comparison of model selection methods. Machine Learning Journal, 27:7--50, 1997.


Estimating the Expected Error of Empirical Minimizers for.. - Scheffer, Joachims (1998)   (7 citations)

No context found.

M. Kearns, Y. Mansour, A. Ng, and D. Ron. An experimental and theoretical comparison of model selection methods. In Machine Learning 27: 7--50. 1997.


Subspace Analysis and Optimization for AAM Based Face Alignment - Ming Zhao Chun (2004)

No context found.

M. J. Kearns, Y. Mansour, A. Y. Ng, and D. Ron. An experimental and theoretical comparison of model selection methods. Machine Learning, 27:7--50, 1997.


Learning Structure and Concepts in Data Through Data Clustering - Hamerly (2003)

No context found.

Michael J. Kearns, Yishay Mansour, Andrew Y. Ng, and Dana Ron. An experimental and theoretical comparison of model selection methods. In Computational Learning Theory (COLT), pages 21-30, 1995.


Suboptimal Behavior of Bayes and MDL in Classification.. - Grünwald, Langford

No context found.

M. Kearns, Y. Mansour, A.Y. Ng, and D. Ron. An experimental and theoretical comparison of model selection methods. Machine Learning, 27:7-50, 1997.


On Generalization Bounds, Projection Profile, and Margin.. - Garg, Har-Peled, Roth (2002)

No context found.

M. Kearns, Y. Mansour, A. Y. Ng, and D. Ron. An experimental and theoretical comparison of model selection methods. Machine Learning, 27:7-50, 1997.


Learning the K in K-Means - Hamerly, Elkan (2003)   (4 citations)

No context found.

Michael J. Kearns, Yishay Mansour, Andrew Y. Ng, and Dana Ron. An experimental and theoretical comparison of model selection methods. In Computational Learning Theory (COLT), pages 21--30, 1995.


Discriminative, Generative and Imitative Learning - Jebara (2002)

No context found.

M. Kearns, Y. Mansour, A. Ng, and D. Ron. An experimental and theoretical comparison of model selection methods. Machine Learning, 27(1):7--50, 1997.


Learning with Kernel Machine Architectures - Evgeniou (2000)   (1 citation)

No context found.

M. Kearns, Y. Mansour, A. Ng, and D. Ron. An experimental and theoretical comparison of model selection methods. In Proceedings of the Eighth Annual ACM Conference on Computational Learning Theory, 1995.

CiteSeer.IST - Copyright Penn State and NEC