Toward parallel and distributed learning by meta-learning (1993)
Venue: In Working
Citations: 99 (27 self)
Citations
5962 | Classification and regression trees
- Breiman, Friedman, et al.
- 1984
Citation Context: ...application in a parallel environment can be done efficiently according to the scheme proposed in [18]. 4 Experiments Four inductive learning algorithms were used in our experiments. ID3 [15] and CART [1] were obtained from NASA Ames Research Center in the IND package [2]. They are both decision tree learning algorithms. WPEBLS is the weighted version of PEBLS [8], which is a memory-based learning alg...
4373 | Simplifying decision trees
- Quinlan
- 1987
Citation Context: ...on reliable data. On the contrary, for noisy data, windowing considerably slows down the computation. Catlett [3] demonstrates that larger amounts of data improve accuracy, but he projects that ID3 [15] on modern machines will take several months to learn from a million records in the flight data set obtained from NASA. He proposes some improvements to the ID3 algorithm, particularly for handling att...
890 | The CN2 induction algorithm
- Clark, Niblett
- 1989
Citation Context: ...rning algorithms. WPEBLS is the weighted version of PEBLS [8], which is a memory-based learning algorithm. BAYES is a simple Bayesian learner based on conditional probabilities, which is described in [7]. The latter two algorithms were reimplemented in C. Two data sets, obtained from the UCI Machine Learning Database, were used in our studies. The secondary protein structure data set (SS) [13], court...
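The context's description of BAYES as "a simple Bayesian learner based on conditional probabilities" admits a short sketch. The following is a hypothetical reconstruction, not the authors' C implementation; the class name, interface, and add-one smoothing are assumptions for illustration.

```python
import math
from collections import Counter, defaultdict

class SimpleBayes:
    """Conditional-probability learner in the spirit of the BAYES
    algorithm described above. Names and smoothing are assumptions."""

    def fit(self, X, y):
        self.classes = Counter(y)        # class -> count, for P(c)
        self.total = len(y)
        self.cond = defaultdict(int)     # (attr, value, class) -> count
        for xi, yi in zip(X, y):
            for j, v in enumerate(xi):
                self.cond[(j, v, yi)] += 1
        return self

    def predict(self, x):
        def score(c):
            # log P(c) + sum_j log P(x_j | c), with add-one smoothing
            s = math.log(self.classes[c] / self.total)
            for j, v in enumerate(x):
                s += math.log((self.cond[(j, v, c)] + 1) / (self.classes[c] + 2))
            return s
        return max(self.classes, key=score)
```

Usage would be `SimpleBayes().fit(X, y).predict(x)` on symbolic attribute vectors, matching the symbolic data sets the experiments describe.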
869 | The strength of weak learnability
- Schapire
- 1990
Citation Context: ...e the subject matter of ongoing experimentation. Schapire's hypothesis boosting Our ideas are related to using meta-learning to improve accuracy. The most notable work in this area is due to Schapire [16], which he refers to as hypothesis boosting. Based on an initial learned hypothesis for some concept derived from a random distribution of training data, Schapire's scheme iteratively generates two ad...
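The scheme summarized here builds extra hypotheses on distributions that concentrate on earlier mistakes and takes a majority vote. A heavily simplified sketch follows, substituting explicit resampling for the filtered distributions of [16]; `learn` is an assumed interface mapping labeled pairs to a classifier.

```python
import random

def boost_three(learn, data):
    """Simplified three-hypothesis boosting in the spirit of Schapire [16];
    resampling stands in for his filtered distributions."""
    h1 = learn(data)
    wrong = [d for d in data if h1(d[0]) != d[1]]
    right = [d for d in data if h1(d[0]) == d[1]]
    # Second distribution: equal mass on h1's mistakes and its successes.
    k = max(1, min(len(wrong), len(right)))
    h2 = learn(random.sample(wrong, min(k, len(wrong))) +
               random.sample(right, min(k, len(right))))
    # Third distribution: examples on which h1 and h2 disagree.
    disputed = [d for d in data if h1(d[0]) != h2(d[0])] or data
    h3 = learn(disputed)
    def vote(x):
        labels = [h1(x), h2(x), h3(x)]
        return max(set(labels), key=labels.count)
    return vote
```

Note how the second and third training sets can only be built after the earlier hypotheses exist, which is exactly the sequential dependence the context contrasts with the paper's approach.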
731 | Stacked generalization.
- Wolpert
- 1992
Citation Context: ...through meta-learning, higher accuracy can be obtained. We call this approach multistrategy hypothesis boosting. Preliminary results reported in [4] are encouraging. Zhang et al.'s [24] and Wolpert's [22] work is in this direction. Silver et al.'s [17] and Holder's [10] work also employs multiple learners, but no learning is involved at the meta level. Since the ultimate goal of this work is to improv...
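The meta-learning idea shared with Wolpert's stacked generalization fits in a few lines. A minimal sketch, assuming `learn`-style functions that map labeled pairs to classifiers; the held-out split and interfaces are assumptions, not Wolpert's exact formulation.

```python
def stacked_generalization(base_learners, meta_learner, train, held_out):
    """Minimal stacking sketch in the sense of Wolpert [22]: the base
    models' predictions on held-out data become the meta-level attributes."""
    models = [learn(train) for learn in base_learners]
    meta_data = [([m(x) for m in models], y) for x, y in held_out]
    combiner = meta_learner(meta_data)
    return lambda x: combiner([m(x) for m in models])
```

This is what distinguishes the cited work from Silver et al.'s [17] and Holder's [10]: the combiner itself is learned rather than fixed.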
515 | Boosting a weak learning algorithm by majority
- Freund
- 1995
Citation Context: ...concurrently. We use three distributions as well, but the first two are independent and are available simultaneously. The third distribution, for the arbiter, however, depends on the first two. Freund [9] has a similar approach, but with potentially many more distributions. Again, the distributions can only be generated iteratively. Work in progress In addition to applying meta-learning to combining re...
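To make the contrast with Freund's iterative distributions concrete: the first two subsets can be learned from at once, and only the arbiter's training data depends on their results. A simplified reading of the arbiter idea, not the authors' exact selection rule; `learn` is the same assumed interface as above.

```python
def arbiter_scheme(learn, subset1, subset2):
    """Sketch of the arbiter idea: two classifiers trained on independent
    subsets (hence trainable in parallel), plus an arbiter trained
    afterward on the examples they disagree about."""
    h1, h2 = learn(subset1), learn(subset2)
    pool = subset1 + subset2
    disputed = [(x, y) for x, y in pool if h1(x) != h2(x)] or pool
    referee = learn(disputed)
    # Agreement wins outright; the arbiter only breaks ties.
    return lambda x: h1(x) if h1(x) == h2(x) else referee(x)
```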
309 | A weighted nearest neighbor algorithm for learning with symbolic features.
- Cost, Salzberg
- 1993
Citation Context: ...in our experiments. ID3 [15] and CART [1] were obtained from NASA Ames Research Center in the IND package [2]. They are both decision tree learning algorithms. WPEBLS is the weighted version of PEBLS [8], which is a memory-based learning algorithm. BAYES is a simple Bayesian learner based on conditional probabilities, which is described in [7]. The latter two algorithms were reimplemented in C. Two d...
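PEBLS-style memory-based learning rests on a value-difference metric over symbolic features: two values are close when they induce similar class distributions. A sketch of that metric under assumed interfaces; WPEBLS's per-exemplar weighting is omitted.

```python
from collections import Counter, defaultdict

def make_value_difference(X, y, classes):
    """Sketch of a value-difference metric as used by PEBLS-like
    learners [8]: per-attribute class distributions are tabulated,
    then compared value against value."""
    occ = defaultdict(Counter)           # (attr, value) -> class counts
    for xi, yi in zip(X, y):
        for j, v in enumerate(xi):
            occ[(j, v)][yi] += 1
    def distance(a, b):
        d = 0.0
        for j, (va, vb) in enumerate(zip(a, b)):
            na = sum(occ[(j, va)].values()) or 1
            nb = sum(occ[(j, vb)].values()) or 1
            d += sum(abs(occ[(j, va)][c] / na - occ[(j, vb)][c] / nb)
                     for c in classes)
        return d
    return distance
```

Classification is then nearest-neighbor search under `distance` over the stored training exemplars.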
271 | Predicting the secondary structure of globular proteins using neural network models.
- Qian, Sejnowski
- 1988
Citation Context: ...ibed in [7]. The latter two algorithms were reimplemented in C. Two data sets, obtained from the UCI Machine Learning Database, were used in our studies. The secondary protein structure data set (SS) [13], courtesy of Qian and Sejnowski, contains sequences of amino acids and the secondary structures at the corresponding positions. There are three structures (three classes) and 20 amino acids (21 attri...
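Because the SS data pairs each sequence position with a structure label, training instances are naturally cut from a window of neighboring residues around each position. A hypothetical sketch of that encoding; the window width and padding symbol are assumptions, not details from the paper.

```python
def window_instances(residues, structures, width=13, pad="-"):
    """Cut one (window, label) instance per sequence position.
    width=13 and the '-' padding symbol are illustrative assumptions."""
    half = width // 2
    padded = pad * half + residues + pad * half
    return [(tuple(padded[i:i + width]), structures[i])
            for i in range(len(residues))]
```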
195 | Refinement of approximate domain theories by knowledge-based neural networks.
- Towell, Shavlik, et al.
- 1990
Citation Context: ...nto a training and test set, which are disjoint, according to the distribution described in [13]. The training set has 18105 instances and the test set has 3520. The DNA splice junction data set (SJ) [19], courtesy of Towell, Shavlik and Noordewier, contains sequences of nucleotides and the type of splice junction, if any, (three classes) at the center of each sequence. Each sequence has 60 nucleotide...
118 | Systems for knowledge discovery in databases,
- Matheus, Chan, et al.
- 1993
Citation Context: ...bases will be available for various learning problems of real world importance. The Grand Challenges of HPCC [20] are perhaps the best examples. Learning techniques are central to knowledge discovery [11] and the approach proposed here may substantially increase the amount of data a Knowledge Discovery system can handle effectively. Quinlan [14] approached the problem of efficiently applying learning ...
84 | Experiments on multistrategy learning by meta-learning. In:
- Chan, Stolfo
- 1993
Citation Context: ...the literature on this approach beyond what was first reported in [18] in the domain of speech recognition. Work on using meta-learning for combining different learning systems is reported elsewhere [4, 6] and is further discussed at the end of this paper. In the next section we will discuss our approach on how to use meta-learning for parallel learning using only one learning algorithm. 3 Parallel...
83 | Hybrid system for protein secondary structure prediction
- Zhang
- 1992
Citation Context: ...ults intelligently through meta-learning, higher accuracy can be obtained. We call this approach multistrategy hypothesis boosting. Preliminary results reported in [4] are encouraging. Zhang et al.'s [24] and Wolpert's [22] work is in this direction. Silver et al.'s [17] and Holder's [10] work also employs multiple learners, but no learning is involved at the meta level. Since the ultimate goal of thi...
67 | Megainduction: A test flight
- Catlett
- 1991
Citation Context: ...h and Catlett [21] show that the windowing technique does not significantly improve speed on reliable data. On the contrary, for noisy data, windowing considerably slows down the computation. Catlett [3] demonstrates that larger amounts of data improve accuracy, but he projects that ID3 [15] on modern machines will take several months to learn from a million records in the flight data set obtained f...
66 | Meta-learning for multistrategy and parallel learning
- Chan, Stolfo
- 1993
Citation Context: ...lel or distributed learning processes, meta-learning can also be used to coalesce the results from multiple different inductive learning algorithms applied to the same set of data to improve accuracy [5]. The premise is that different algorithms have different representations and search heuristics, different search spaces are being explored, and hence potentially diverse results can be obtained from ...
20 | ILS: A Framework for Multi-Paradigmatic Learning
- Silver, Frawley, et al.
- 1990
Citation Context: ...tained. We call this approach multistrategy hypothesis boosting. Preliminary results reported in [4] are encouraging. Zhang et al.'s [24] and Wolpert's [22] work is in this direction. Silver et al.'s [17] and Holder's [10] work also employs multiple learners, but no learning is involved at the meta level. Since the ultimate goal of this work is to improve both the accuracy and efficiency of machine le...
20 | An efficient implementation of the back-propagation algorithm on the connection machine
- Zhang, Mckenna, et al.
- 1990
Citation Context: ...em is to parallelize the learning algorithms and apply the parallelized algorithm to the entire data set (presumably utilizing multiple I/O channels to handle the I/O bottleneck). Zhang et al.'s work [23] on parallelizing the backpropagation algorithm on a Connection Machine is one example. This approach requires optimizing the code for a particular algorithm on a specific architecture. Another approa...
15 | Toward multistrategy parallel and distributed learning in sequence analysis
- Chan, Stolfo
- 1993
Citation Context: ...the literature on this approach beyond what was first reported in [18] in the domain of speech recognition. Work on using meta-learning for combining different learning systems is reported elsewhere [4, 6] and is further discussed at the end of this paper. In the next section we will discuss our approach on how to use meta-learning for parallel learning using only one learning algorithm. 3 Parallel...
14 | Induction over large data bases
- Quinlan
- 1979
Citation Context: ...Learning techniques are central to knowledge discovery [11] and the approach proposed here may substantially increase the amount of data a Knowledge Discovery system can handle effectively. Quinlan [14] approached the problem of efficiently applying learning systems to data that are substantially larger than available main memory with a windowing technique. A learning algorithm is applied to a small...
12 | Experiments on the costs and benefits of windowing in ID3
- Wirth, Catlett
- 1988
Citation Context: ...s is repeated on a new window of the same size with some of the incorrectly classified data replacing some of the data in the old window until all the data are correctly classified. Wirth and Catlett [21] show that the windowing technique does not significantly improve speed on reliable data. On the contrary, for noisy data, windowing considerably slows down the computation. Catlett [3] demonstrates t...
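The windowing loop described across these two contexts ([14], [21]) is simple to state: learn on a window, test on everything, swap misclassified examples in, repeat. A sketch under assumed parameters; the swap fraction and the round cap are illustrative, and the cap exists precisely because, per [21], noisy data may never be fully classified.

```python
import random

def windowed_induction(learn, data, window_size, swap_frac=0.2, max_rounds=100):
    """Sketch of Quinlan-style windowing [14, 21]. swap_frac and
    max_rounds are assumptions; on noisy data the error list may never
    empty, matching the slowdown reported in [21]."""
    window = random.sample(data, window_size)
    h = learn(window)
    for _ in range(max_rounds):
        errors = [(x, y) for x, y in data if h(x) != y]
        if not errors:
            break
        # Replace part of the window with misclassified examples.
        k = min(len(errors), max(1, int(swap_frac * window_size)))
        window = random.sample(window, window_size - k) + random.sample(errors, k)
        h = learn(window)
    return h
```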
8 | Speech Recognition in Parallel
- Stolfo, Galil, et al.
- 1989
Citation Context: ...ifferent subsets of the data in parallel and the use of meta-learning to combine the partial results. We are not aware of any work in the literature on this approach beyond what was first reported in [18] in the domain of speech recognition. Work on using meta-learning for combining different learning systems is reported elsewhere [4, 6] and is further discussed at the end of this paper. In the next s...
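The approach first reported in [18] and developed in the paper, one algorithm, disjoint subsets, parallel runs, meta-level combination, also fits in a few lines. A sketch assuming a picklable top-level `learn` function and an abstract `combine` step standing in for the arbiter/combiner; both interfaces are assumptions.

```python
from concurrent.futures import ProcessPoolExecutor

def parallel_meta_learn(learn, combine, data, workers=4):
    """Sketch of data-partitioned parallel learning: train one model per
    disjoint subset concurrently, then coalesce the partial results at
    the meta level. `combine` is an assumed interface, not the paper's
    exact arbiter procedure."""
    subsets = [data[i::workers] for i in range(workers)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        models = list(pool.map(learn, subsets))
    return combine(models, data)
```

The arbiter sketch given earlier is one concrete choice of `combine` for the two-subset case.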
7 | Selection of learning methods using an adaptive model of knowledge utility
- Holder
- 1991
Citation Context: ...is approach multistrategy hypothesis boosting. Preliminary results reported in [4] are encouraging. Zhang et al.'s [24] and Wolpert's [22] work is in this direction. Silver et al.'s [17] and Holder's [10] work also employs multiple learners, but no learning is involved at the meta level. Since the ultimate goal of this work is to improve both the accuracy and efficiency of machine learning, we have be...
6 | The need for biases in learning generalizations
- Mitchell
- 1980
Citation Context: ...algorithms have different representations and search heuristics, different search spaces are being explored, and hence potentially diverse results can be obtained from different algorithms. Mitchell [12] refers to this phenomenon as inductive bias. We postulate that by combining the different results intelligently through meta-learning, higher accuracy can be obtained. We call this approach multistra...
3 | Introduction to IND and Recursive Partitioning. NASA Ames Research Center
- Buntine, Caruana
- 1991
Citation Context: ...ng to the scheme proposed in [18]. 4 Experiments Four inductive learning algorithms were used in our experiments. ID3 [15] and CART [1] were obtained from NASA Ames Research Center in the IND package [2]. They are both decision tree learning algorithms. WPEBLS is the weighted version of PEBLS [8], which is a memory-based learning algorithm. BAYES is a simple Bayesian learner based on conditional prob...
2 | High performance computing and communications for grand challenge applications: Computer vision, speech and natural language processing, and artificial intelligence
- Wah, et al.
- 1993
Citation Context: ...age of very large network computing, it is likely that orders of magnitude more data in databases will be available for various learning problems of real world importance. The Grand Challenges of HPCC [20] are perhaps the best examples. Learning techniques are central to knowledge discovery [11] and the approach proposed here may substantially increase the amount of data a Knowledge Discovery system ca...