### Citations

12893 | The Nature of Statistical Learning Theory
- Vapnik
- 1995
Citation Context ...during the past two decades. SVMs belong to the family of large margin methods [18], whereas Boosting belongs to the family of ensemble methods [22]. The former is rooted in statistical learning theory [19], exploiting the kernel trick explicitly to handle nonlinearity with linear classifiers; the latter comes from the proof construction [13] for the theoretical problem of whether weak learnability equals strong learnability...

3596 | Support-vector networks - Cortes, Vapnik - 1995

884 | Boosting the margin: A new explanation for the effectiveness of voting methods. The Annals of Statistics
- Schapire, Freund, et al.
- 1998
Citation Context ...s knowledge, and thus, understanding why AdaBoost seems resistant to overfitting is the most fascinating fundamental theoretical issue in Boosting studies. To explain this phenomenon, Schapire et al. [14] presented the margin theory for Boosting. Let ...

850 | The strength of weak learnability
- Schapire
- 1990
Citation Context ...ods [22]. The former is rooted in statistical learning theory [19], exploiting the kernel trick explicitly to handle nonlinearity with linear classifiers; the latter comes from the proof construction [13] for the theoretical problem of whether weak learnability equals strong learnability [8]. It is clear that these two approaches were born with apparent differences. The margin [19] is a fundamental i...

342 | Cryptographic limitations on learning boolean formulae and automata
- Kearns, Valiant
- 1989
Citation Context ...l trick explicitly to handle nonlinearity with linear classifiers; the latter comes from the proof construction [13] for the theoretical problem of whether weak learnability equals strong learnability [8]. It is clear that these two approaches were born with apparent differences. The margin [19] is a fundamental issue of SVMs, as an intuitive understanding of the behavior of SVMs is to search for a l...

157 | Empirical margin distributions and bounding the generalization error of combined classifiers
- Koltchinskii, Panchenko
Citation Context ...s about data-dependent margin-based generalization bounds, based on techniques such as the empirical cover number [15], the empirical fat-shattering dimension [2], and Rademacher and Gaussian complexities [9, 10]. Some of these bounds are proven to be sharper than (3), but are hard to show sharper than (4)-(6). Moreover, they fail to explain the resistance of AdaBoost to overfitting. 3 Optimizing Margin Distribut...

122 | Boosting in the limit: Maximizing the margin of learned ensembles
- Grove, Schuurmans
- 1998
Citation Context ...sing trees with a fixed number of leaves. Reyzin and Schapire found that the trees of arc-gv are generally deeper than those of AdaBoost (similar empirical evidence has been reported by other researchers, e.g., [7]), and they argued that trees with different heights may have different model complexities. Then, they repeated Breiman's experiments using decision stumps with two leaves and observed ...

55 | How boosting the margin can also boost classifier complexity
- Reyzin, Schapire
- 2006
Citation Context ... error increases drastically in almost every case. Thus, Breiman raised serious doubt about the margin theory, and almost sentenced the margin theory to death. Seven years later, Reyzin and Schapire [12] found that, amazingly, Breiman had not controlled the model complexity well in his experiments. To study the margin, one must fix the model complexity of base learners, as it is meaningless to compare the...

43 | Prediction games and arcing classifiers
- Breiman
- 1997
Citation Context ..., it attracted a lot of attention. Notice that Schapire et al.'s bound (3) depends heavily on the smallest margin, because Pr...

40 | Ensemble Methods: Foundations and Algorithms - Zhou - 2012

18 | Complexities of convex combinations and bounding the generalization error in classification
- Koltchinskii, Panchenko
Citation Context ...s about data-dependent margin-based generalization bounds, based on techniques such as the empirical cover number [15], the empirical fat-shattering dimension [2], and Rademacher and Gaussian complexities [9, 10]. Some of these bounds are proven to be sharper than (3), but are hard to show sharper than (4)-(6). Moreover, they fail to explain the resistance of AdaBoost to overfitting. 3 Optimizing Margin Distribut...

18 | Generalization performance of classifiers in terms of observed covering numbers
- Shawe-Taylor, Williamson
- 1999
Citation Context ... Notice that instead of considering the whole function space, there are some studies about data-dependent margin-based generalization bounds, based on techniques such as the empirical cover number [15], the empirical fat-shattering dimension [2], and Rademacher and Gaussian complexities [9, 10]. Some of these bounds are proven to be sharper than (3), but are hard to show sharper than (4)-(6). Moreover, they...

17 | Data-dependent margin-based generalization bounds for classification
- Antos, Kégl, et al.
- 2002
Citation Context ...e whole function space, there are some studies about data-dependent margin-based generalization bounds, based on techniques such as the empirical cover number [15], the empirical fat-shattering dimension [2], and Rademacher and Gaussian complexities [9, 10]. Some of these bounds are proven to be sharper than (3), but are hard to show sharper than (4)-(6). Moreover, they fail to explain the resistance of AdaBo...

13 | Boosting through optimization of margin distributions
- Shen, Li
- 2010
Citation Context ... i.e., to optimize the margin distribution by maximizing the margin mean and minimizing the margin variance simultaneously. This argument has been supported empirically by some recent Boosting studies [16, 17]. 4 A Simple Implementation of Large Margin Distribution Learning For a straightforward implementation of large margin distribution learning, as an example, we adapt the simple SVMs formulation (8) to...
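The excerpt above describes optimizing the margin distribution through its first two moments. As an illustrative sketch only (NumPy; the function name and toy data are hypothetical, not from the cited works), the margin mean and variance of a weighted voting ensemble can be computed as:

```python
import numpy as np

def ensemble_margins(base_preds, alphas, y):
    """Margins of a weighted voting ensemble.

    base_preds: (T, m) array of base-learner predictions in {-1, +1}
    alphas:     (T,) nonnegative base-learner weights
    y:          (m,) true labels in {-1, +1}

    The margin of example i is y_i * sum_t alpha_t h_t(x_i) / sum_t alpha_t,
    so it always lies in [-1, 1].
    """
    alphas = np.asarray(alphas, dtype=float)
    votes = alphas @ np.asarray(base_preds, dtype=float) / alphas.sum()
    return np.asarray(y, dtype=float) * votes

# Hypothetical toy ensemble: 3 stumps on 4 examples
preds = np.array([[ 1,  1, -1, -1],
                  [ 1, -1, -1,  1],
                  [ 1,  1,  1, -1]])
alphas = [0.5, 0.3, 0.2]
y = np.array([1, 1, -1, -1])

m = ensemble_margins(preds, alphas, y)   # array([1.0, 0.4, 0.6, 0.4])
mean, var = m.mean(), m.var()            # 0.6 and 0.06
```

A minimum-margin objective would look only at `m.min()`, whereas the margin-distribution view argued for in the excerpt pushes `mean` up and `var` down simultaneously.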

12 | On the margin explanation of boosting algorithm
- Wang, Sugiyama, et al.
- 2008
Citation Context ...ion bound based on the minimum margin is quite tight. For the margin theory to be revived, it is crucial to have a sharper bound based on the margin distribution. For this purpose, Wang et al. [20] presented a sharper bound in terms of the Emargin, i.e., arg inf...

11 | Margin distribution and learning algorithm
- Garg, Roth
- 2003
Citation Context ...choice of sample ...

6 | A kernel method for the optimization of the margin distribution
- Aiolli, Martino, et al.
- 2008
Citation Context ...timize the margin distribution. Reyzin and Schapire [12] suggested maximizing the average or median margin, and there have also been efforts on maximizing the average margin or a weighted combination of margins [1, 6, 11]. These arguments, however, are all heuristics without theoretical justification. In addition to (6), Gao and Zhou [5] proved another form of their margin theorem, disclosing that the average or median...

5 | A risk minimization principle for a class of Parzen estimators
- Pelckmans, Suykens, et al.
- 2007
Citation Context ...timize the margin distribution. Reyzin and Schapire [12] suggested maximizing the average or median margin, and there have also been efforts on maximizing the average margin or a weighted combination of margins [1, 6, 11]. These arguments, however, are all heuristics without theoretical justification. In addition to (6), Gao and Zhou [5] proved another form of their margin theorem, disclosing that the average or median...

4 | Variance Penalizing AdaBoost
- Shivaswamy, Jebara
- 2011
Citation Context ... i.e., to optimize the margin distribution by maximizing the margin mean and minimizing the margin variance simultaneously. This argument has been supported empirically by some recent Boosting studies [16, 17]. 4 A Simple Implementation of Large Margin Distribution Learning For a straightforward implementation of large margin distribution learning, as an example, we adapt the simple SVMs formulation (8) to...

3 | On the doubt about margin explanation of boosting
- Gao, Zhou
Citation Context ...long history of research trying to explain Boosting with a margin theory. Though there were twists and turns in this line of studies, recently the margin theory for Boosting has finally been defended [5], establishing a connection between these two mainstream learning approaches. It is interesting that, in contrast to large margin methods that focus on the maximization of a single margin, the recent t...

1 | Large margin distribution machine
- Zhang, Zhou
- 2014
Citation Context ...argin variance simultaneously. Inspired by this recognition, we advocate large margin distribution learning, a promising research direction that has already exhibited superiority in algorithm designs [21]. In this article, we will first briefly introduce the efforts on establishing the margin theory of Boosting, and then explain the basic idea of large margin distribution learning. After that, we will...