Kearns, M., Mansour, Y., Ng, A., and Ron, D. (1997). An experimental and theoretical comparison of model selection methods. Machine Learning, 27(1):7-50.
....estimates that are more likely to agree with future data and are based on a measure of the model capacity. This form of regularized ERM has been called structural risk minimization (SRM), where the capacity is measured in terms of the so-called Vapnik-Chervonenkis dimension [196, 195, 197, 35, 109]. SRM will not perform as well on the training data as ERM but should generalize better to future data. The risk or expected loss (for samples outside of the training set) is then bounded above by the empirical risk plus a term that depends only on the size of training set, T and the ....
M. Kearns, Y. Mansour, A. Ng, and D. Ron. An experimental and theoretical comparison of model selection methods. Machine Learning, 27(1):7--50, 1997.
....used to establish generalization bounds which seem to be more informative in high dimensional learning problems. The approach is motivated by recent works [GR01, AV99] that argue that some high dimensional learning problems are naturally constrained in ways that make them, effectively, low dimensional. In these cases, although learning is done in a high dimension, .... (Footnote: Several statistical methods can be used to ensure the robustness of the empirical error estimate [KMNR97]. However, these typically require training on even less data, and do not contribute to understanding the domain.)
M. Kearns, Y. Mansour, A. Y. Ng, and D. Ron. An experimental and theoretical comparison of model selection methods. Machine Learning, 27:7--50, 1997.
....model, poses rather difficult computational issues (see for example [22]). We note, however, that in spite of these difficulties, the resampling based methods are probably the most widely used approaches to loss estimation and model selection. Moreover, there has been some recent theoretical work [9] indicating their superiority over methods based on distribution free upper bounds. At this point it should be clear to the reader that all the classical approaches to model selection described above suffer from some malady. This problem prompts us to introduce a new model selection criterion, ....
M. Kearns, Y. Mansour, A. Y. Ng and D. Ron. "An Experimental and Theoretical Comparison of Model Selection Methods", in Proceedings of the seventh workshop on Computational Learning Theory, ACM Press,
....can lead to poor results in applying Rissanen's Minimum Description Length (MDL) principle [Ris89]. The MDL principle is one of the many known model selection methods in the field of machine learning, statistics or inductive inference. We analyze the experimental results presented in [KMNR97] and provide a method to avoid the overfitting. We do so by using a different coding scheme than in [KMNR97]. 1 Introduction Rissanen's Minimum Description Length (MDL) principle [Ris89] is one of the many known model selection methods in the field of machine learning, statistics or inductive ....
....The MDL principle is one of the many known model selection methods in the field of machine learning, statistics or inductive inference. We analyze the experimental results presented in [KMNR97] and provide a method to avoid the overfitting. We do so by using a different coding scheme than in [KMNR97]. 1 Introduction Rissanen's Minimum Description Length (MDL) principle [Ris89] is one of the many known model selection methods in the field of machine learning, statistics or inductive inference. The general idea of MDL can be explained as follows: We want to model some regularities or ....
[Article contains additional citation context not shown here]
M. Kearns, Y. Mansour, A. Ng, and D. Ron. An Experimental and Theoretical Comparison of Model Selection Methods. Machine Learning, 27:7--50, 1997.
....problem with Gaussian segments [4], [11, chapter 9], [5, 6]. They have derived MML formulas for stating the change point locations to an optimal precision independently of the segment parameters. The same method has been used [7] for the problem of finding change points in noisy binary sequences [14], where it compared favourably with Akaike's Information Criterion (AIC), Schwarz's Bayesian Information Criterion (BIC), an MDL motivated metric of Kearns et al. [14] and a more correct version of Minimum Description Length [7]. We apply the MML68 approximation to the binomial problem in this ....
....of the segment parameters. The same method has been used [7] for the problem of finding change points in noisy binary sequences [14], where it compared favourably with Akaike's Information Criterion (AIC), Schwarz's Bayesian Information Criterion (BIC), an MDL motivated metric of Kearns et al. [14] and a more correct version of Minimum Description Length [7]. We apply the MML68 approximation to the binomial problem in this paper. Assuming that the true change point is uniformly distributed in some range of width, R, we encode the data using the point estimate # at the centre of this ....
Kearns, M., Mansour, Y., Ng, A.Y., Ron, D.: An experimental and theoretical comparison of model selection methods. Machine Learning 27 (1997) 7--50
....a model selection scheme critically depends on how well the error bounds match the true error (see [2]). There is theoretical and experimental evidence that error bounds involving a fixed complexity penalty (that is, a penalty that does not depend on the training data) cannot be universally effective [4]. Recently, several authors have considered alternative notions of the complexity of a function class: the Rademacher and Gaussian complexities (see [2, 5, 7, 7, 13]). Definition 1. Let be a probability distribution on a set X and let X_1, ..., X_n be independent samples selected according ....
Michael J. Kearns, Yishay Mansour, Andrew Y. Ng, and Dana Ron. An experimental and theoretical comparison of model selection methods. Machine Learning, 27:7-50, 1997.
....This study, which applies CV to 14 different real world problems, utilized 74 Unix workstations running continuously over a period of approximately two and a half months. The studies in the literature which specifically examine the performance of CV and compare it with that of other methods [8], [10], [3] analyze performance using only a few (1 or 2) data sets, and so cannot be considered conclusive. A realistic evaluation of the performance of CV based MLP architecture selection on real world problems, including strengths and weaknesses, needs to be established. This paper also examines the ....
....and did not look at any real world problem domains, and so these cannot be considered conclusive. Another paper by Kearns et al. found that CV performs significantly better than Minimum Description Length (MDL) and Guaranteed Risk Minimization (GRM) [11] on the intervals model selection problem [3]. Unfortunately, the empirical results in this paper were also limited to a single type of artificial data, and did not explore any real world problem domains. Schaffer has also studied CV in [7] and [8]. CV is also employed in stopped training, weight decay, network construction algorithms, and ....
[Article contains additional citation context not shown here]
Kearns, Michael, Yishay Mansour, Andrew Ng, and Dana Ron (1997), "An Experimental and Theoretical Comparison of Model Selection Methods," Machine Learning, vol 27, pp 7-50.
....in a high dimension, generalization ought to depend on the true, lower dimensionality of the problem. We first show that when the high dimensional data can be classified using a linear classifier, the data (and the separator) can be projected down to a lower dimensional space with quantifiable penalty on the resulting performance; the performance will depend on the .... (Footnote 1: Several statistical methods can be used to ensure the robustness of the empirical error estimate [5]. However, these typically require training on even less data, and do not contribute to understanding generalization.)
M. Kearns, Y. Mansour, A. Y. Ng, and D. Ron. An experimental and theoretical comparison of model selection methods. Machine Learning, 27:7--50, 1997.
No context found.
M. J. Kearns, Y. Mansour, A. Ng, and D. Ron. An experimental and theoretical comparison of model selection methods. In Proceedings of the Eighth Annual ACM Workshop on Computational Learning Theory, pages 21--30, 1995. To appear in Machine Learning, COLT95 Special Issue.
No context found.
M. Kearns, Y. Mansour, A. Ng, and D. Ron. An experimental and theoretical comparison of model selection methods. In Proceedings of the Eighth Annual ACM Conference on Computational Learning Theory, 1995.
No context found.
Kearns, M., Mansour, Y., Ng, A., and Ron, D. (1997). An experimental and theoretical comparison of model selection methods. Machine Learning, 27(1):7-50.
No context found.
Michael J. Kearns, Yishay Mansour, Andrew Y. Ng, and Dana Ron. An experimental and theoretical comparison of model selection methods. Machine Learning, 27:7--50, 1997.
No context found.
Kearns, M., Y. Mansour, A. Y. Ng, & D. Ron (1995). An experimental and theoretical comparison of model selection methods. In Proceedings of the Workshop on Computational Learning Theory (COLT 1995). Morgan Kaufmann.
No context found.
M. Kearns, Y. Mansour, A. Ng, and D. Ron. An experimental and theoretical comparison of model selection methods. Machine Learning Journal, 27:7-50, 1997.
No context found.
M. J. Kearns, Y. Mansour, A. Y. Ng, and D. Ron. An experimental and theoretical comparison of model selection methods. Machine Learning, 27:7--50, 1997.
No context found.
M. Kearns, Y. Mansour, A. Ng, and D. Ron. An experimental and theoretical comparison of model selection methods. Machine Learning Journal, 27:7--50, 1997.
No context found.
M. Kearns, Y. Mansour, A. Ng, and D. Ron. An experimental and theoretical comparison of model selection methods. In Machine Learning 27: 7--50. 1997.
No context found.
M. J. Kearns, Y. Mansour, A. Y. Ng, and D. Ron. An experimental and theoretical comparison of model selection methods. Machine Learning, 27:7--50, 1997.
No context found.
Michael J. Kearns, Yishay Mansour, Andrew Y. Ng, and Dana Ron. An experimental and theoretical comparison of model selection methods. In Computational Learning Theory (COLT), pages 21-30, 1995.
No context found.
M. Kearns, Y. Mansour, A.Y. Ng, and D. Ron. An experimental and theoretical comparison of model selection methods. Machine Learning, 27:7-50, 1997.
No context found.
M. Kearns, Y. Mansour, A. Y. Ng, and D. Ron. An experimental and theoretical comparison of model selection methods. Machine Learning, 27:7-50, 1997.
No context found.
Michael J. Kearns, Yishay Mansour, Andrew Y. Ng, and Dana Ron. An experimental and theoretical comparison of model selection methods. In Computational Learning Theory (COLT), pages 21--30, 1995.
No context found.
Michael J. Kearns, Yishay Mansour, Andrew Y. Ng, and Dana Ron. An experimental and theoretical comparison of model selection methods. In Computational Learning Theory (COLT), pages 21--30, 1995.
No context found.
M. Kearns, Y. Mansour, A. Ng, and D. Ron. An experimental and theoretical comparison of model selection methods. Machine Learning, 27(1):7--50, 1997.
No context found.
M. Kearns, Y. Mansour, A. Ng, and D. Ron. An experimental and theoretical comparison of model selection methods. In Proceedings of the Eighth Annual ACM Conference on Computational Learning Theory, 1995.