P. Chan, An Extensible Meta-Learning Approach for Scalable and Accurate Inductive Learning. PhD Dissertation, Dept of Computer Science, Columbia University, New York, 1996.
....model assumes that all the data required by any data mining algorithm is either available at or can be sent to a central site. A simple approach to data mining over multiple sources that will not share data is to run existing data mining tools at each site independently and combine the results[5, 6, 17]. However, this will often fail to give globally valid results. Issues that cause a disparity between local and global results include: Values for a single entity may be split across sources. Data mining at individual sites will be unable to detect cross site correlations. The same item ....
P. Chan. An Extensible Meta-Learning Approach for Scalable and Accurate Inductive Learning. PhD thesis, Department of Computer Science, Columbia University, New York, NY, 1996. (Technical Report CUCS-044-96).
....scan the dataset less than once from the secondary storage. Our approach is applicable not only to decision trees but also to other learners, e.g. rule and naive Bayes learners. Ensemble of classifiers has been studied as a general approach for scalable learning. Previously proposed metalearning [Chan, 1996] reduces the number of data scans to 2. However, empirical studies have shown that the accuracy of the multiple model is sometimes lower than respective single model. Bagging [Breiman, 1996] and boosting [Freund and Schapire, 1997] are not scalable since both methods scan the dataset multiple ....
....that can fit into main memory. The splitting criterion in each node of the tree is tested against multiple decision trees trained from bootstrap samples of the sampled data. It refines the tree later by scanning the complete data set, resulting in a total of two complete data read. Meta learning [Chan, 1996] builds a tree of classifiers and combine class label outputs from base classifiers. It is based on heuristics and the total number of datascan is two. The improvements by our methods are in many folds. We combine probabilities instead of class labels. The combining technique is straightforward ....
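The contrast drawn in the excerpt above, between combining class labels and combining probabilities, can be illustrated with a minimal sketch. It assumes plain majority voting as a stand-in for label-based combining and simple averaging as the probability-combining rule; neither is confirmed by the excerpt as the citing authors' exact method, and the function names and sample numbers are hypothetical.

```python
from collections import Counter


def vote_labels(probs_per_classifier):
    """Combine class labels: each classifier votes for its most likely class.

    Assumption: majority voting stands in for label-based combining.
    """
    labels = [max(p, key=p.get) for p in probs_per_classifier]
    return Counter(labels).most_common(1)[0][0]


def average_probabilities(probs_per_classifier):
    """Combine probabilities: average each class probability, then pick the max.

    Assumption: simple averaging; the excerpt does not state the exact rule.
    """
    classes = probs_per_classifier[0].keys()
    n = len(probs_per_classifier)
    avg = {c: sum(p[c] for p in probs_per_classifier) / n for c in classes}
    return max(avg, key=avg.get)


# Hypothetical outputs of three base classifiers for one instance.
outputs = [
    {"fraud": 0.45, "legit": 0.55},
    {"fraud": 0.40, "legit": 0.60},
    {"fraud": 0.95, "legit": 0.05},
]

label_decision = vote_labels(outputs)
prob_decision = average_probabilities(outputs)
```

In this made-up case two weakly confident classifiers outvote one highly confident classifier under label voting, while averaging lets the confident prediction dominate.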
P Chan. An Extensible Meta-learning Approach for Scalable and Accurate Inductive Learning. PhD thesis, Columbia University, Oct 1996.
....aggregate query such as avg(AGE) during query processing. One of the most noteworthy work is due to [7] which provides an interactive and accurate method to estimate the result of aggregation. One of the earliest work to use data reduction techniques to scale up inductive learning is due to Chan [1], in which he builds a tree of classifiers. In BOAT [6] Gehrke et al. build multiple bootstrapped trees in memory to examine the splitting conditions of a coarse tree. There has been several advances in cost sensitive learning [3] MetaCost [4] takes advantage of purposeful mis labels to maximize ....
P. Chan. An Extensible Meta-learning Approach for Scalable and Accurate Inductive Learning. PhD thesis, Columbia University, Oct 1996.
....analysis and understanding of the characteristics of that set of classifiers. 5.2.1 Diversity Brodley [ C. Brodley, 1993 ] defines diversity by measuring the classification overlap of a pair of classifiers, i.e. the percentage of the instances classified the same way by two classifiers while Chan [ Chan, 1996 ] associates it with the entropy in the set of predictions of the base classifiers. When the predictions of the classifiers are distributed evenly across the possible classes, the entropy is higher and the set of classifiers is more diverse. Other metrics studying diversity include the ....
....accuracy potential. But they fail to evaluate classifiers with respect to cost models or take into account that classes have varying significance and are associated with different costs. The class specialty metric is created to address this limitation. The term specialty was first defined by Chan [ Chan, 1996 ] to be equal to one minus the average normalized entropy over K classifiers: $\text{specialty} = 1 - \frac{1}{K} \sum_{j=1}^{K} \frac{-\sum_{i=1}^{m} p_{ji} \log(p_{ji})}{\log m}$ (5.2) where m represents the number of classes and p_{ji} denotes the normalized accuracy of the j-th base classifier on the i-th class. In essence, the larger the ....
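The specialty definition quoted above (one minus the average normalized entropy over K classifiers) can be sketched as follows. This is an illustration under assumptions: "normalized accuracy" is taken to mean each classifier's per-class accuracies rescaled to sum to one, there are at least two classes, and every classifier has nonzero accuracy on at least one class. The function names are hypothetical.

```python
import math


def normalized_entropy(p):
    """Entropy of distribution p divided by log(m), so the result lies in [0, 1]."""
    m = len(p)
    h = -sum(x * math.log(x) for x in p if x > 0)  # 0 * log(0) treated as 0
    return h / math.log(m)


def specialty(per_class_accuracy):
    """One minus the average normalized entropy over the K base classifiers.

    per_class_accuracy[j][i] is the accuracy of classifier j on class i.
    Assumption: accuracies are normalized by dividing by their row sum.
    """
    total = 0.0
    for acc in per_class_accuracy:
        s = sum(acc)
        p = [a / s for a in acc]
        total += normalized_entropy(p)
    return 1.0 - total / len(per_class_accuracy)


# Hypothetical accuracies: classifier 0 is even across classes, classifier 1 is specialized.
even = specialty([[0.8, 0.8]])
specialized = specialty([[0.9, 0.0]])
```

A classifier that is equally accurate on every class yields a specialty near 0, and one that is accurate on a single class yields a specialty near 1.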
[Article contains additional citation context not shown here]
Chan, P. 1996. An Extensible Meta-Learning Approach for Scalable and Accurate Inductive Learning. Ph.D. Dissertation, Department of Computer Science, Columbia University, New York, NY.
....model assumes that all the data required by any data mining algorithm is either available at or can be sent to a central site. A simple approach to data mining over multiple sources that will not share data is to run existing data mining tools at each site independently and combine the results[5, 6, 18]. However, this will often fail to give globally valid results. Issues that cause a disparity between local and global results include: Values for a single entity may be split across sources. Data mining at individual sites will be unable to detect cross site correlations. The same item ....
....al. proposed a method for horizontally partitioned data[8] and more recent work has addressed privacy in this model[14]. Distributed classification has also been addressed. A meta learning approach has been developed that uses classifiers trained at different sites to develop a global classifier [5, 6, 18]. This could protect the individual entities, but it remains to be shown that the individual classifiers do not disclose private information. Recent work has addressed classification using Bayesian Networks in vertically partitioned data [7] and situations where the distribution is itself ....
P. Chan. An Extensible Meta-Learning Approach for Scalable and Accurate Inductive Learning. PhD thesis, Department of Computer Science, Columbia University, New York, NY, 1996. (Technical Report CUCS-044-96).
....process. In this experiment, each site contributes a single classifier agent. During the meta learning phase of the first fold, Marmalade applies the three base classifier agents Mango.1, Marmalade.1 and Strawberry.1 on the hypo.1.bld data subset using the 2 fold meta learning scheme ([6]) to generate the meta level training set. The final ensemble meta classifier, noted as Meta Classifier.1 is computed via the stacking method using the native bay train Bayesian learning algorithm over this meta level training set. Marmalade will employ the Meta Classifier.1 to predict the classes ....
P. Chan. An Extensible Meta-Learning Approach for Scalable and Accurate Inductive Learning. PhD thesis, Department of Computer Science, Columbia University, New York, NY, 1996.
....[2] Recently, several parallel and distributed computing approaches have been proposed. The main aim behind these approaches is to search techniques that are suitable for huge amounts of data that cannot efficiently be handled by main memory-based learning algorithms. It has been shown in [2 4] that parallel and distributed processing provides the best hope of dealing with large amounts of data. In this paper, we consider an approach that uses classifiers learned on a number of data subsets in parallel and that selects for each new instance the best classifier dynamically. This ....
....chapter 4, we propose a combination of our dynamic classifier selection technique with the arbiter meta learning. In chapter 5, we present results of our experiments with the approach, and chapter 6 concludes with a brief summary and further research topics. 2 Arbiter Meta Learning Technique In [2 4] the arbiter meta learning technique was proposed for the parallel integration of multiple classifiers. Meta learning encompasses the use of learning algorithms to learn how to integrate results from multiple learning systems. The approach includes data reduction as a solution to the scaling ....
[Article contains additional citation context not shown here]
Chan, P.: An Extensible Meta-Learning Approach for Scalable and Accurate Inductive Learning. PhD Thesis, Columbia University (1996)
....of actual fraud that is caught, FP stands for False Positive, i.e. percentage of false alarms 2.1 Diversity Brodley [5] defines diversity by measuring the classification overlap of a pair of classifiers, i.e. the percentage of the instances classified the same way by two classifiers, while Chan [6] associates it with the entropy in the predictions of the base classifiers. When the predictions of the classifiers are distributed evenly across the possible classes, the entropy is higher and the set of classifiers more diverse. Kwok and Carter [13] correlate the error rates of a set of ....
P. Chan. An Extensible Meta-Learning Approach for Scalable and Accurate Inductive Learning. PhD thesis, Columbia Univ., 1996.
....of instances for which at least one of the base classifiers produces the correct prediction. 2.1 Diversity Brodley [5] defines diversity by measuring the classification overlap of a pair of classifiers, i.e. the percentage of the instances classified the same way by two classifiers while Chan [6] associates it with the entropy in the predictions of the base classifiers. When the predictions of the classifiers are distributed evenly across the possible classes, the entropy is higher and the set of classifiers more diverse. Kwok and Carter [16] correlate the error rates of a set of ....
P. Chan. An Extensible Meta-Learning Approach for Scalable and Accurate Inductive Learning. PhD thesis, Department of Computer Science, Columbia University, New York, NY, 1996.
....the amount of main memory for many learning algorithms. However, recently researchers have proposed several parallel and distributed computing approaches that are particularly suitable for massive amount of data that main memory-based learning algorithms cannot handle efficiently. For instance, in [2 4] it was shown that parallel and distributed processing provides the best hope of dealing with such large amounts of data. In this paper we consider an approach that uses classifiers trained on a number of data subsets in parallel and for each new instance to be classified selects the appropriate ....
....of multiple classifiers. In Section 4 we propose a combination of our dynamic classifier selection technique with the arbiter meta learning. Section 5 discusses various issues of the proposed technique and concludes shortly with further research directions. 2. Arbiter Meta Learning Technique In [2 4] the arbiter meta learning technique was proposed for a parallel integration of multiple classifiers. Metalearning encompasses the use of learning algorithms to learn how to integrate results from multiple learning systems. The proposed approach includes data reduction to solve the scaling problem ....
[Article contains additional citation context not shown here]
P. Chan. An Extensible Meta-Learning Approach for Scalable and Accurate Inductive Learning. PhD thesis, Columbia University, 1996.
....of actual fraud that is caught, FP stands for False Positive, i.e. percentage of false alarms 2.1 Diversity Brodley [5] defines diversity by measuring the classification overlap of a pair of classifiers, i.e. the percentage of the instances classified the same way by two classifiers, while Chan [6] associates it with the entropy in the predictions of the base classifiers. When the predictions of the classifiers are distributed evenly across the possible classes, the entropy is higher and the set of classifiers more diverse. Kwok and Carter [13] correlate the error rates of a set of ....
P. Chan. An Extensible Meta-Learning Approach for Scalable and Accurate Inductive Learning. PhD thesis, Columbia Univ., 1996.
....p ik denotes the fraction of the baseclassifiers predicting the k th class for the i th instance. According to this definition, when the value of diversity grows, the predictions from the base classifiers are more evenly distributed (higher entropy) and, therefore, more diverse. In this study [6], Chan examined several characteristics of the base classifiers (i.e. diversity, coverage, correlated error and specialty) and explored the effects of these characteristics on the accuracy of the various integrating meta learning schemes. The results strengthened the belief that larger accuracy ....
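The entropy-based diversity described in the excerpt above can be sketched as follows, with p_ik taken as the fraction of base classifiers predicting class k for instance i. The excerpt is truncated before the full formula, so two details here are assumptions: the per-instance entropy is normalized by the log of the number of classes, and the result is averaged over instances. The function name and sample predictions are hypothetical.

```python
import math
from collections import Counter


def diversity(predictions, num_classes):
    """Average normalized entropy of the base classifiers' predictions.

    predictions[i] lists the labels that the base classifiers assign to instance i.
    Assumptions: entropy is divided by log(num_classes) and averaged over instances.
    """
    total = 0.0
    for labels in predictions:
        n = len(labels)
        counts = Counter(labels)
        # p_ik = counts[k] / n; classes with zero votes contribute nothing.
        h = -sum((c / n) * math.log(c / n) for c in counts.values())
        total += h / math.log(num_classes)
    return total / len(predictions)


# Hypothetical predictions from two base classifiers on two instances.
agree = diversity([["a", "a"], ["b", "b"]], num_classes=2)
split = diversity([["a", "b"], ["b", "a"]], num_classes=2)
```

When all classifiers agree on every instance the value is 0, and when their predictions are spread evenly across the classes it reaches 1, matching the "higher entropy, more diverse" reading in the excerpt.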
....based on new metrics introduced here. Class specialty: The term class specialty defines a family of evaluation metrics that concentrate on the bias of a classifier towards certain classes. However, in this study, instead of calculating the combined specialty of the resulting meta classifiers [6], the class specialty metrics focus on the specialty of each (base ) classifier for each class. A classifier specializing in one class, should exhibit, for that class, both, a high True Positive (TP ) and a low False Positive (FP ) rate. The TP rate is a measure of how often the classifier ....
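The per-class True Positive and False Positive rates mentioned in the excerpt above can be computed with a short sketch. It uses the standard one-versus-rest definitions (TP rate as the share of actual members of a class that are predicted as that class, FP rate as the share of non-members predicted as that class); the excerpt is cut off before giving its own wording, so treat this as an assumption. The function name and labels are hypothetical.

```python
def tp_fp_rates(y_true, y_pred, cls):
    """Return (TP rate, FP rate) of a classifier for one class, one-versus-rest."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
    positives = sum(1 for t in y_true if t == cls)
    negatives = len(y_true) - positives
    # Guard against classes that are absent (or universal) in y_true.
    tp_rate = tp / positives if positives else 0.0
    fp_rate = fp / negatives if negatives else 0.0
    return tp_rate, fp_rate


# Hypothetical labels: the classifier catches all fraud but raises one false alarm.
y_true = ["fraud", "fraud", "legit", "legit", "legit", "legit"]
y_pred = ["fraud", "fraud", "fraud", "legit", "legit", "legit"]
fraud_tp, fraud_fp = tp_fp_rates(y_true, y_pred, "fraud")
```

A classifier that specializes in a class, in the sense of the excerpt, would show a high first value and a low second value for that class.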
[Article contains additional citation context not shown here]
P. Chan. An Extensible Meta-Learning Approach for Scalable and Accurate Inductive Learning. PhD thesis, Department of Computer Science, Columbia University, New York, NY, 1996.
....of instances for which at least one of the base classifiers produces the correct prediction. 3.1 Diversity Brodley [6] defines diversity by measuring the classification overlap of a pair of classifiers, i.e. the percentage of the instances classified the same way by two classifiers while Chan [7] associates it with the entropy in the set of predictions of the base classifiers. When the predictions of the classifiers are distributed evenly across the possible classes, the entropy is higher and the set of classifiers is more diverse. Krogh and Vedelsby [17] measure the diversity, called ....
P. Chan. An Extensible Meta-Learning Approach for Scalable and Accurate Inductive Learning. PhD thesis, Department of Computer Science, Columbia University, New York, NY, 1996.
....dataset is a medical database with records [11] noted by thyroid in the Data Set panel. Other parameters include the host of the CFM, the Cross Validation Fold, the Meta Learning Fold, the Meta Learning Level, the names of the local learning agent and the local meta learning agent, etc. Refer to [2] for more information on the meaning and use of these parameters. Notice that Marmalade has established that Strawberry and Mango are its peer Datasites, having acquired this information from the CFM. Then, Marmalade partitions the thyroid database (noted as thyroid.1.bld and thyroid.2.bld in ....
P. Chan. An Extensible Meta-Learning Approach for Scalable and Accurate Inductive Learning. PhD thesis, Department of Computer Science, Columbia University, New York, NY, 1996.
....of instances for which at least one of the base classifiers produces the correct prediction. 3.1 Diversity Brodley [6] defines diversity by measuring the classification overlap of a pair of classifiers, i.e. the percentage of the instances classified the same way by two classifiers while Chan [7] associates it with the entropy in the set of predictions of the base classifiers. When the predictions of the classifiers are distributed evenly across the possible classes, the entropy is higher and the set of classifiers is more diverse. Krogh and Vedelsby [17] measure the diversity, called ....
P. Chan. An Extensible Meta-Learning Approach for Scalable and Accurate Inductive Learning. PhD thesis, Department of Computer Science, Columbia University, New York, NY, 1996.
....future directions. 2 Serial Evaluation of Learning Algorithms To evaluate the ID3 [12] CART [1] BAYES [6] and CN2 [5] learning algorithms in a serial environment, we empirically investigate their speed with varying amounts of training data. Their theoretical time complexity is formulated in [3]. We performed a set of experiments using an artificial data set generated by a Boolean expression [3] with the training set size up to 10 million when all algorithms exceeded the main memory. We measure the elapsed training time of ID3, CART, BAYES, and CN2 with different numbers of examples in ....
....BAYES [6] and CN2 [5] learning algorithms in a serial environment, we empirically investigate their speed with varying amounts of training data. Their theoretical time complexity is formulated in [3] We performed a set of experiments using an artificial data set generated by a Boolean expression [3] with the training set size up to 10 million when all algorithms exceeded the main memory. We measure the elapsed training time of ID3, CART, BAYES, and CN2 with different numbers of examples in the artificial domain. The experiments were performed on HP 9000 735 workstations. The number of ....
[Article contains additional citation context not shown here]
P. Chan. An Extensible Meta-Learning Approach for Scalable and Accurate Inductive Learning. PhD thesis, Department of Computer Science, Columbia University, New York, NY, 1996.
....the dataset is a medical database with records, noted by thyroid in the Data Set panel. Other parameters include the host of the CFM, the Cross Validation Fold, the Meta Learning Fold, the Meta Learning Level, the names of the local learning agent and the local meta learning agent, etc. Refer to [2] for more information on the meaning and use of these parameters. Notice that Marmalade has established that Strawberry and Mango are its peer Datasites, having acquired this information from the CFM. Then, Marmalade partitions the thyroid database (noted as thyroid.1.bld and thyroid.2.bld in ....
P. Chan. An Extensible Meta-Learning Approach for Scalable and Accurate Inductive Learning. PhD thesis, Department of Computer Science, Columbia University, New York, NY, 1996. (forthcoming).
....data are involved. 2.2 Empirical time performance We performed two sets of experiments: the first set used the splice junction data (courtesy of Towell, Shavlik, and Noordewier [25]) with training set size up to 100,000 examples, the second set used artificial data based on a Boolean concept 1 [6] with set size up to 10 million when all algorithms exceeded the main memory. We obtained ID3 [21] and CART [2] as part of the IND package [3] from NASA Ames Research Center and was implemented in C. CN2 [11] implemented in C was obtained from Dr. Clark [1] WPEBLS [12] and BAYES [13] as ....
....We next describe hierarchical meta learning on partitioned data. 3 Hierarchical Meta learning Hierarchical meta learning is designed to speed up the training process by partitioning a large data set into smaller subsets, learning from the subsets in parallel, and integrating the learned models [6]. Since much information is lost when models are learned from smaller subsets, prediction accuracy of hierarchical meta learning is an important concern. Our results in [10] indicate that the accuracy performance of hierarchical meta learning on partitioned data Classifier 1 Classifier 2 ....
P. Chan. An Extensible Meta-Learning Approach for Scalable and Accurate Inductive Learning. PhD thesis, Department of Computer Science, Columbia University, New York, NY, 1996.
No context found.
P. Chan, An Extensible Meta-Learning Approach for Scalable and Accurate Inductive Learning. PhD Dissertation, Dept of Computer Science, Columbia University, New York, 1996.
No context found.
P. Chan, An Extensible Meta-Learning Approach for Scalable and Accurate Inductive Learning. PhD Dissertation, Dept of Computer Science, Columbia University, New York, 1996.
No context found.
P. Chan, "An extensible meta-learning approach for scalable and accurate inductive learning," Ph.D. dissertation, Department of Computer Science, Columbia University, New York, NY, 1996. [Online]. Available: http://www.cs.columbia.edu/ pkc/papers/thesis.ps
No context found.
P. Chan, An extensible meta-learning approach for scalable and accurate inductive learning. Dept. of Computer Science, Columbia University, New York, NY, PhD Thesis, 1996 (Technical Report CUCS-044-96).
No context found.
P. Chan, An Extensible Meta-Learning Approach for Scalable and Accurate Inductive Learning. PhD Dissertation, Dept of Computer Science, Columbia University, New York, 1996.
No context found.
P. Chan, An Extensible Meta-Learning Approach for Scalable and Accurate Inductive Learning. PhD Dissertation, Dept of Computer Science, Columbia University, New York, 1996.
No context found.
Chan, P.: An extensible meta-learning approach for scalable and accurate inductive learning, Ph.D. Thesis, Columbia University, 1996.