H. Drucker, C. Cortes, L.D. Jackel, Y. LeCun, and V. Vapnik. Boosting and other ensemble methods. Neural Computation, 6:1289--1301, 1994.
....It emphasizes interaction and cooperation among the individual networks in the ensemble, and uses an unsupervised penalty term in the error function to produce biased individual networks whose errors tend to be negatively correlated. This approach is quite different from existing ones [4] [5] which train the individual networks independently or sequentially. Rosen [6] proposed an ensemble learning algorithm using decorrelated neural networks. The idea is that individual networks attempt to not only minimize the error between the target and their output, but also decorrelate their ....
....networks are very small. It has been found that the combining results are weakened if the errors of individual networks are positively correlated [3], [4]. Common approaches to dealing with this issue are to obtain unbiased estimators whose estimation errors are as weakly correlated as possible [5]. In contrast, this paper describes a new approach to create biased estimators whose estimation errors are negatively correlated. B. Simultaneous Learning of Negatively Correlated Neural Networks CELS introduces a correlation penalty term into the error function of each individual network so ....
H. Drucker, C. Cortes, L. D. Jackel, Y. LeCun, and V. Vapnik, "Boosting and other ensemble methods," Neural Computat., vol. 6, pp. 1289--1301, 1994.
....in the development of clustering, classification, prediction and parameter estimation algorithms for time series (dynamic) problems. Some remarkable efforts in this direction include partition algorithms [10, 19], mixtures of experts [5, 12, 13, 14, 15, 16, 25], ensembles of neural networks [3, 7, 26], trees of neural networks [17, 32], threshold models [35], Takagi Sugeno fuzzy models [34] and much more. For an extensive bibliographical coverage see the books [22, 31]. The predictor architecture proposed in this paper is modular in the sense that it makes concurrent use of several ....
H. Drucker et al., "Boosting and other ensemble methods", Neural Computation, 1994, vol.6, pp.1289-1301.
....Ensembles are sets of learning machines whose decisions are combined to improve the performance of the overall system. In this last decade one of the main research areas in machine learning has been represented by methods for constructing ensembles of learning machines. Although in the literature [86, 129, 130, 69, 61, 23, 33, 12, 7, 37] a plethora of terms, such as committee, classifier fusion, combination, aggregation and others are used to indicate sets of learning machines that work together to solve a machine learning problem, in this paper we shall use the term ensemble in its widest meaning, in order to include the whole ....
.... or weighted voting in classification problems [120, 121]. While in bagging the samples are drawn with replacement using a uniform probability distribution, in boosting methods the learning algorithm is called at each iteration using a different distribution or weighting over the training examples [111, 40, 112, 39, 115, 110, 32, 38, 33, 32, 16, 17, 42, 41]. This technique places the highest weight on the examples most often misclassified by the previous base learner: in this way the base learner focuses its attention on the hardest examples. Then the boosting algorithm combines the base rules taking a weighted majority vote of the base rules. ....
H. Drucker, C. Cortes, L. Jackel, Y. LeCun, and V. Vapnik. Boosting and other ensemble methods. Neural Computation, 6(6):1289--1301, 1994.
....in a committee. In addition to comparing boosting with the mixtures of experts, the next section also investigates using the experts resulting from boosting as initialisations for the mixtures of experts models. This boosting procedure is based on the work of Drucker et al. [48] but has been generalised to multiple classes by Cook [38]. Cook and Robinson [40] have also adapted the boosting procedure at a word level to the RNN acoustic model ABBOT system and achieved a 92:2n reduction in word error rate relative to a baseline model with an equivalent number of ....
....Specifically if the first two networks agree on the classification of a frame, the frame is discarded otherwise the frame was added to the third data set. This method is efficient in the amount of training data used and typically uses only of the available training data. Although Drucker et al. [48] suggest the use of a voting procedure for the boosted networks, Cook [38] reports that a simple average over the probabilities networks works more effectively. In this section I describe results obtained with mixtures of MLPs as acoustic models in the ABBOT system. The acoustic models were ....
Drucker, H., Cortes, C., Jackel, L. D., LeCun, Y. and Vapnik, V. [1994], `Boosting and other ensemble methods', Neural Computation 6, 1289--1301.
....space, they would follow different trajectories in the functional space. However, if a random initialization by chance gives a set of weights that are far from a solution, convergence can be exceedingly slow. Manipulation of training data has been the most widely investigated method. Boosting [14], bagging, disjoint input sources [15], nonlinear transformations of input [15] and noise injection [16] have all proved their worth. Manipulating the network topologies would mean having hybrid ensembles, consisting of estimators that work in different search spaces entirely. Different areas of ....
H. Drucker, C. Cortes, L. Jackel, Y. LeCun, and V. Vapnik, "Boosting and other ensemble methods," 1994.
....by any one of them acting alone [7]. Committee machines can be built in two different ways. One is to use a static structure. This is known generally as an ensemble method. Here, the input is not involved in combining committee members. Examples include ensemble averaging [8] and boosting [9]. The other method for building committees is to use a dynamic structure. This includes combining local experts such as mixtures of experts [10]. Here input is directly involved in the combining mechanism that uses an integrating unit such as a gating network adjusting the weights of committee ....
Drucker, H., Cortes, C., Jackel, L.D., LeCun, Y., Vapnik, V.: Boosting and Other Ensemble Methods. Neural Computation, 6(6) (1994) 1289-1301.
....1. Introduction Let 5506 918 be a set (pool) of classifiers such that , where # 34306 6 , assigns ) a class label , The majority vote method of combining classifier decisions, one of many methods in this important research area [2, 3, 4, 5, 6, 7, 8, 9], is to assign the class label , to ) that is supported by the majority of the classifiers . Finding independent classifiers is one aim of classifier fusion methods for the following reason. Let L be odd, 0 21 , and all classifiers have the same classification accuracy ....
H. Drucker, C. Cortes, L. Jackel, Y. LeCun, and V. Vapnik. Boosting and other ensemble methods. Neural Computation, 6:1289--1301, 1994.
....Class indifferent fusion 1. Introduction Combining classifiers to achieve higher accuracy is an important research topic with different names in the literature: combination of multiple classifiers [1-5]; classifier fusion [6-10]; mixture of experts [11-14]; committees of neural networks [15,16]; consensus aggregation [17-19]; voting pool of classifiers [20]; dynamic classifier selection [3]; composite classifier system [21]; classifier ensembles [16,22]; divide and conquer classifiers [23]; pandemonium system of reflective agents [24]; change glasses approach to ....
.... of multiple classifiers [1-5]; classifier fusion [6-10]; mixture of experts [11-14]; committees of neural networks [15,16]; consensus aggregation [17-19]; voting pool of classifiers [20]; dynamic classifier selection [3]; composite classifier system [21]; classifier ensembles [16,22]; divide and conquer classifiers [23]; pandemonium system of reflective agents [24]; change glasses approach to classifier selection [25]; etc. The paradigms of these models differ on the: assumptions about classifier dependencies; type of classifier outputs; aggregation strategy (global or ....
H. Drucker, C. Cortes, L.D. Jackel, Y. LeCun, V. Vapnik, Boosting and other ensemble methods, Neural Comput. 6 (1994) 1289-1301.
.... function may also be incorporated in each of the MLPs [153]. Given a fixed feature extraction method one can either use a common training set to design a number of different types of classifiers [76] or, alternatively, use different training sets to design several versions of one type of classifier [34, 33, 64, 146, 155]. 4.10 On Comparing Classifiers Some classification accuracies attained using the classification algorithms described in the previous sections will be presented later in this text in Section 5.2.4. Such comparisons need, however, to be considered with utmost caution. During the last years, a ....
H. Drucker, C. Cortes, L. D. Jackel, Y. LeCun, and V. Vapnik. Boosting and other ensemble methods. Neural Computation, 6(6):1289-1301, September 1994.
....a majority vote to produce a more reliable system. This idea of diversity of failure can be used to improve the performance of neural nets. Diverse nets can be combined to produce a more reliable output. The idea of combining nets to improve performance is not new (e.g. Breiman, 1992, 1994; Drucker et al., 1994; Hansen and Salamon, 1990; Perrone and Cooper, 1993; Sharkey, Sharkey and Chandroth, 1996; Tumer and Ghosh, 1996; Wolpert, 1992). See (Sharkey, 1996) for a review of this work. However, although these researchers consider the combining of nets for the purposes of improved performance, they do not ....
.... validation and bootstrapping (Raviv & Intrator, 1996; Krogh & Vedelsby, 1995); non linear transformations (Sharkey, Sharkey and Chandroth, 1996); injection of noise during training (Raviv & Intrator, 1996); data from different sensors (Sharkey, Sharkey & Chandroth, 1996); the boosting algorithm (Drucker et al., 1994); and the use of different methods of preprocessing. Cross validation and bootstrapping both involve taking overlapping subsamples of a data set. Nonlinear transformations, and injection of noise during training involve changing the inputs in a training set such that a new function is computed. ....
Drucker, H., Cortes, C., Jackel, L.D., LeCun, Y., & Vapnik, V. (1994) Boosting and other ensemble methods. Neural Computation, 6, 1289-1301.
....AF = 1 seems to be the most sensible value. We have therefore chosen to use the error function $\frac{1}{2}\bigl(y - \sum_j c_j f_j\bigr)^2$. 4 Comparison of Meta Machine Learning Methods There are other MML methods besides AdaBoost, Bagging, Simple, and XuME, for example different kinds of boosting (see e.g. [9]) or stacking (see [29] and [28]), and other ensemble methods can be found in [22] and [27]. We call XuME presented in section 3.4 and related methods (e.g. [17] and [26]) for (H)ME. (H)ME are not the only form of ME. There is an older version of ME, that can be found in [15]. Some very ....
Drucker, H., Cortes, C., Jackel, L. D., LeCun, Y., and Vapnik, V. Boosting and other ensemble methods. Neural Computation 6, 6 (1994), 1289-1301.
.... (Perrone and Cooper, 1993b). Ali and Pazzani discuss the relationship between error correlations and error reductions in the context of decision trees (Ali and Pazzani, 1995). The Boosting algorithm trains subsequent classifiers on training patterns that have been selected by earlier classifiers (Drucker et al. 1994), thus reducing the correlation among them. However, one can quickly run out of training data in practice if this approach is used. Twomey and Smith discuss combining and resampling in the context of a 1-d regression problem (Twomey and Smith, 1995). Meir discusses the effect of independence on ....
Drucker, H., Cortes, C., Jackel, L. D., LeCun, Y., and Vapnik, V. (1994). Boosting and other ensemble methods. Neural Computation, 6(6):1289--1301.
....all of which try to solve the same problem. The goal is to obtain better results and improved reliability in terms of smaller confidence intervals for the answers. Such efforts have been termed as ensemble networks, committees, boosting, combiners or hybrid networks by different authors [74, 86, 38, 70, 39, 81, 36, 88, 24]. These works are quite different in scope to the work presented here, as each network is trying to solve the same task, and is trained using a common data set or some suitable variation (e.g. bagging etc. [85]). Similar in structure is Wolpert's stacked generalization [87]. In this architecture, ....
H. Drucker, C. Cortes, L. D. Jackel, Y. LeCun, and V. Vapnik. Boosting and other ensemble methods. Neural Computation, 6(6):1289--1301, 1994.
....the subsequent samples so that a specialized classifier is designed for these. After training, the committee's weighted vote determines the class membership of an unlabeled input. Two very successful committee training methods are Boosting and Bagging with an expanding family of variants thereof [3, 6, 20]. In this group of methods the individual classifiers and the combination are trained together (coverage optimization). The other approach to designing a classifier combination is to use already trained classifiers (typically a smaller number of them compared to boosting and bagging) and combine ....
H. Drucker, C. Cortes, L.D. Jackel, Y. LeCun, and V. Vapnik. Boosting and other ensemble methods. Neural Computation, 6:1289-1301, 1994.
....Much of the work has been done with neural networks as the individual members of the team although neural networks are not necessarily the best choice. Bagging and boosting methods for team members generation, and variants thereof such as arcing and wagging, have been proven to be very successful [3, 7, 25]. These algorithms build a set of diverse classifiers that exhibit a remarkably good performance as a team. One of the simplest, most intuitive, and theoretically sound methods for combining classifier outputs is the majority vote [2, 17, 19, 26, 33]. Let $D = \{D_1, \ldots, D_L\}$ be a set (pool, ....
H. Drucker, C. Cortes, L.D. Jackel, Y. LeCun, and V. Vapnik. Boosting and other ensemble methods. Neural Computation, 6:1289-1301, 1994.
.... Combining classifiers to achieve higher accuracy is an important research topic with different names in the literature: combination of multiple classifiers ([1, 2, 3, 4, 5]); classifier fusion ([6, 7, 8, 9, 10]); mixture of experts ([11, 12, 13, 14]); committees of neural networks ([15, 16]); consensus aggregation ([17, 18, 19]); voting pool of classifiers ([20]); dynamic classifier selection ([3]) (Research supported by ONR grant N00014 96 1 0642); composite classifier system ([21]); classifier ensembles ([16, 22]); divide and conquer classifiers [23] ....
.... ([11, 12, 13, 14]); committees of neural networks ([15, 16]); consensus aggregation ([17, 18, 19]); voting pool of classifiers ([20]); dynamic classifier selection ([3]) (Research supported by ONR grant N00014 96 1 0642); composite classifier system ([21]); classifier ensembles ([16, 22]); divide and conquer classifiers [23]; pandemonium system of reflective agents [24]; change glasses approach to classifier selection [25]; etc. The paradigms of these models differ on the: assumptions about classifier dependencies; type of classifier outputs; aggregation strategy (global or ....
H. Drucker, C. Cortes, L.D. Jackel, Y. LeCun, and V. Vapnik. Boosting and other ensemble methods. Neural Computation, 6:1289-1301, 1994.
....more reliable) than the classification decision of the best individual classifier. This idea appears under a variety of names in the literature: classifier fusion [1], [2]; classifier combination [3], [4], [5], [6], [7]; mixture of experts [8], [9], [10], [11]; committees of neural networks [12], [13]; consensus aggregation [14], [15], [16]; voting pool of classifiers [17]; classifier ensembles [13], [18]; etc. As the grey lines and ellipses in Figure 1 show, multiclassifier systems differ by 1. The number L of individual classifiers used. 2. The type of the individual classifiers. Some ....
.... under a variety of names in the literature: classifier fusion [1], [2]; classifier combination [3], [4], [5], [6], [7]; mixture of experts [8], [9], [10], [11]; committees of neural networks [12], [13]; consensus aggregation [14], [15], [16]; voting pool of classifiers [17]; classifier ensembles [13], [18]; etc. As the grey lines and ellipses in Figure 1 show, multiclassifier systems differ by 1. The number L of individual classifiers used. 2. The type of the individual classifiers. Some combination schemes use classifiers of the same types, e.g. neural networks [17], [9], linear classifiers ....
[Article contains additional citation context not shown here]
H. Drucker, C. Cortes, L.D. Jackel, Y. LeCun, and V. Vapnik, "Boosting and other ensemble methods," Neural Computation, vol. 6, pp. 1289-1301, 1994.
No context found.
H. Drucker, C. Cortes, L.D. Jackel, Y. LeCun, and V. Vapnik. Boosting and other ensemble methods. Neural Computation, 6:1289--1301, 1994.
No context found.
H. Drucker, C. Cortes, L. Jackel, Y. LeCun, and V. Vapnik, "Boosting and other ensemble methods", Neural Computation 6, pp. 1289-1301, 1994.
No context found.
H. Drucker, C. Cortes, L. D. Jackel, Y. LeCun, and V. Vapnik, "Boosting and other ensemble methods," Neural Comput., vol. 6, pp. 1289--1301, 1994.
No context found.
H. Drucker, C. Cortes, L. Jackel, Y. LeCun, and V. Vapnik. Boosting and other ensemble methods. Neural Computation, 6:1289-1301, September 1994.
No context found.
H. Drucker, C. Cortes, L. D. Jackel, Y. LeCun, and V. Vapnik. Boosting and other ensemble methods. Neural Computation, 6(6):1289--1301, 1994.
No context found.
H. Drucker, C. Cortes, L. D. Jackel, Y. LeCun and V. Vapnik, "Boosting and other ensemble methods," Neural Computation, Vol. 6, pp. 1289-1301, 1994.