Jason Catlett. Megainduction: A test flight. In Machine Learning, pages 596--599, 1991.


This paper is cited in the following contexts:


Dynamic Discretization of Continuous Attributes - Gama, Torgo, Soares (1998)   (3 citations)  (Correct)

....Several authors (see for instance [3, 7]) note that this assumption is a very severe limitation of learning algorithms based on the Bayes formalism. The second motivation for performing attribute discretization is related to computational complexity. As it was mentioned by Catlett [1] and others, the performance of tree based learners is strongly conditioned by the sorting of continuous attributes values. This is an operation that on average takes O(n log n). Using profiling tools Catlett observed on several large domains that most of the CPU time was spent on sorting. This ....

....on the vector [V 1 ; Vn ] The ascendent search works the same way as the descendent one, only in the opposite direction. On the example of the previous section, the state [4, 2, 6] will have three descendants: [6, 2, 6], [4, 4, 6] and [4, 2, 8] if it is on the ascendent branch, or [2, 2, 6], [4, 1, 6] and [4, 2, 4] if it is on the descendent branch. The search engine can skip from the ascendent branch to the descendent branch, because the next node to expand is the one not yet expanded and with lowest value of the objective function. 4 Empirical Evaluation The method was evaluated on 5 well ....

J. Catlett. Megainduction: a test flight. In Machine Learning: Proceedings of the 8 International Conference. Morgan Kaufmann, 1991.
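The sorting cost mentioned in the excerpt above can be illustrated with a minimal sketch. It assumes a generic entropy-based threshold search for one continuous attribute; it is not taken from Catlett's paper or from the citing paper, and the names entropy and best_threshold are hypothetical. The sort is the O(n log n) step; the scan that follows is linear.

```python
import math
from collections import Counter

def entropy(counts, total):
    """Shannon entropy (bits) of a class-count mapping; zero counts are skipped."""
    return -sum((c / total) * math.log2(c / total) for c in counts.values() if c)

def best_threshold(values, labels):
    """Return (threshold, weighted_entropy) of the best binary split value <= t.

    Hypothetical helper for illustration only.
    """
    # Sorting by attribute value is the O(n log n) step the excerpt refers to.
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    n = len(pairs)
    left = Counter()
    right = Counter(label for _, label in pairs)
    best = (None, float("inf"))
    # One linear pass: move examples from the right side to the left side.
    for i in range(n - 1):
        value, label = pairs[i]
        left[label] += 1
        right[label] -= 1
        if value == pairs[i + 1][0]:
            continue  # no cut point between equal attribute values
        n_left = i + 1
        score = (n_left * entropy(left, n_left)
                 + (n - n_left) * entropy(right, n - n_left)) / n
        if score < best[1]:
            best = ((value + pairs[i + 1][0]) / 2, score)
    return best
```

A tree learner would repeat a search like this for every continuous attribute at every node, which is one plausible reading of why profiling showed sorting to dominate CPU time.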


Efficient Pruning Methods - For Separate-And-Conquer Rule   (Correct)

....of relatively modest size. We note that recent work has demonstrated that competing learning methods, in particular decision tree induction methods, scale quite well, even on noisy data; experiments have been performed in which tens or hundreds of thousands of examples were used for learning [Catlett, 1991]. Unfortunately, the rapid rate of growth shown in Figure 2 makes it impractical to use rule induction methods on large noisy datasets. 2.3 Alternative pruning methods In this paper, we will use the implementation of reduced error pruning described above as our strawman noise-correction ....

....in both c and n, as is the case in our implementation. It is useful to keep it separate, however, because, in some special cases, notably when rules for only a single class are learned, caching techniques can be used to reduce evaluation cost. It is also possible that subsampling techniques [Catlett, 1991] could be used to estimate coverage, also reducing evaluation cost. least half of these, or 0.35n examples, will be examined in adding each condition. We thus conclude that the time complexity for the fitting phase is $\mathrm{Time}(\mathrm{fit}(c, n)) \geq c \cdot 0.35n = \Omega(cn)$ (1). Recall that the ....

Jason Catlett. Megainduction: a test flight. In Proceedings of the Eighth International Workshop on Machine Learning, Ithaca, New York, 1991. Morgan Kaufmann.


Distributed Data Mining Systems - Prodromidis (1999)   (Correct)

....spread (middle) and total savings (bottom) of the Chase (left) and First Union (right) classifiers as a function of the size of the training set. among the best performers. The graphs show that larger training sets result in better classification models, thus verifying Catlett's results [Catlett, 1991; 1992] pertaining to the negative impact of sampling on the accuracy of learning algorithms. On the other hand, they also show that performance curves converge, thus indicating reduced benefits as more data is used for learning. Increasing the amount of training data beyond a certain point may ....

Catlett, J. 1991. Megainduction: A test flight. In Proc. Eighth Intl. Work. Machine Learning, 596--599.


The Effect of Subsampling Rate on S3Bagging Performance - Terabe, Washio, Motoda (2001)   (Correct)

....are required. [Figure 12. The experimental result on waveform: the processing time (sec.). Axis: subsampling rate; legend: S3Bagging, All, Bagging.] 5. Related Work The effect of the data size reduction by subsampling is investigated by Catlett [5]. In the experiments, the random sampling and the stratified sampling are adopted. They focused on the problem that the C4.5 needs much processing time for inducing decision tree when the data set includes many continuous attributes. Thus, they confirmed that the processing time could be shortened ....

Catlett, J.: Megainduction: a Test Flight, In Proc. of the Eighth International Workshop on Machine Learning, pp.596-599. (1991).
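The two subsampling schemes named in the excerpt above, random and stratified sampling, can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in either paper: examples are assumed to be tuples whose last element is the class label, and the function names random_sample and stratified_sample are hypothetical.

```python
import random
from collections import defaultdict

def random_sample(examples, rate, rng):
    """Draw round(rate * n) examples uniformly without replacement."""
    k = round(len(examples) * rate)
    return rng.sample(examples, k)

def stratified_sample(examples, rate, rng, label_of=lambda ex: ex[-1]):
    """Draw the same fraction from each class, so class proportions are kept.

    Assumes (for illustration) that the class label is the last tuple element.
    """
    by_class = defaultdict(list)
    for ex in examples:
        by_class[label_of(ex)].append(ex)
    sample = []
    for members in by_class.values():
        k = round(len(members) * rate)
        sample.extend(rng.sample(members, k))
    return sample

# Usage: an 80/20 class split sampled at a 10% rate.
data = [(i, "a") for i in range(80)] + [(i, "b") for i in range(20)]
subset = stratified_sample(data, 0.1, random.Random(0))
```

With a plain random sample the minority class can be under-represented or missing by chance; the stratified version fixes the per-class counts, at the cost of needing the labels up front.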


Fast Algorithms for Mining Association Rules - Agrawal, Srikant (1994)   (881 citations)  (Correct)

.... of finding association rules falls within the purview of database mining [AIS93a] [ABN92] [HS94] [MKKR92] [S 93] [Tsu90], also called knowledge discovery in databases [HCC92] [Lub89] [PS91b]. Related, but not directly applicable, work includes the induction of classification rules [BFOS84] [Cat91] [FWD93] [HCC92] [Qui93], discovery of causal rules [CH92] [Pea92], learning of logical definitions [MF92] [Qui90], fitting of functions to data [LSBZ87] [Sch90], and clustering [ANB92] [C 88] [Fis87]. The closest work in the machine learning literature is the KID3 algorithm presented in ....

J. Catlett. Megainduction: A test flight. In 8th Int'l Conf. on Machine Learning, June 1991.


Constructing New Attributes for Decision Tree Learning - Zheng (1996)   (3 citations)  (Correct)

....areas of machine learning, supervised learning is relatively well developed and well understood [Shavlik and Dietterich, 1990, p. 1]. Many applications of supervised classifier learning have been tried successfully on a variety of problems (e.g. Michalski and Chilausky, 1980b; Quinlan, 1983; Catlett, 1991c; Aha and Bankert, 1994; Michie et al. 1994). Langley and Simon [1995] give a review of some fielded applications. However, there are still plenty of unsolved problems in this area. 1.1.1 A fundamental problem of selective induction Conventional inductive learning algorithms usually ....

J. Catlett, Megainduction: a test flight. Proceedings of the Eighth International Workshop on Machine Learning, San Mateo, CA: Morgan Kaufmann, 596-599.


KDD-93: Progress and Challenges in Knowledge.. - Piatetsky-Shapiro, .. (1994)   (Correct)

....reduce the size of the data on which learning is performed. One way is to eliminate irrelevant data using the data dependencies. This method has been shown (Almuallim and Dietterich 1991) to increase the performance of the classifier methods. Other methods rely on various forms of data sampling. Catlett (1991) used an intelligent sampling approach to make a sublinear algorithm for decision tree induction. His method has been used to efficiently learn decision trees from databases with hundreds of thousands of records. Overall, the workshop reflected measurable progress in developing and deploying KDD ....

Catlett, J. 1991. Megainduction: A Test Flight. In Proceedings of the Eighth Machine Learning Conference, 596--599. San Mateo, Calif.: Morgan Kaufmann.


DAGGER:A New Approach to Combining Multiple Models Learned.. - Davies, Edwards (2000)   (6 citations)  (Correct)

....and combining multiple models. The earliest work known to the authors is by Gams (1989) in which he shows how several models may be combined by ordering the individual classifiers. Brazdil and Torgo (1990) is the earliest known investigation of voting as a method to combine multiple models. Catlett (1991) conducted an extensive analysis of existing approaches to sampling the available data, and concluded that sampling techniques, are not helpful in general, and concluded . advances in learning from large noisy databases may be more likely to come from more efficient processing of more ....

J. Catlett (1991). Megainduction: A test flight. In Proceedings of the Eighth International Workshop on Machine Learning, Morgan Kaufmann, pages 596-599, San Francisco.


DAGGER: Using Instance Selection to Combine Multiple Models.. - Davies, Edwards   (Correct)

....and combining multiple models. The earliest work known to the authors is by Gams (1989) in which he shows how several models may be combined by ordering the individual classifiers. Brazdil and Torgo (1990) is the earliest known investigation of voting as a method to combine multiple models. Catlett (1991) conducted an extensive analysis of existing approaches to sampling the available data, and concluded that sampling techniques, are not helpful in general, and concluded . advances in learning from large noisy databases may be more likely to come from more efficient processing of more ....

J. Catlett (1991). Megainduction: A test flight. In Proceedings of the Eighth International Workshop on Machine Learning, Morgan Kaufmann, pages 596-599, San Francisco.


Likelihood-based Data Squashing: A Modeling.. - Madigan..   (3 citations)  (Correct)

.... (see, for example, Bradley et al. (1998) or Provost and Kolluri (1999)). In this paper we focus on the alternative approach of scaling down the data. Most of the previous work in this direction has focused on sampling methods such as random sampling, stratified sampling, duplicate compaction (Catlett, 1991), and boundary sampling (Aha et al. 1991, Syed et al. 1999). Recently DuMouchel et al. (1999) [DVJCP] proposed an approach that instead constructs a reduced dataset. Specifically their data squashing algorithm seeks to compress (or squash) the data in such a way that a statistical analysis ....

Catlett, J. (1991). Megainduction: A test flight. In: Proceedings of the Eighth International Workshop on Machine Learning, 596--599.


Noise-Tolerant Windowing - Fürnkranz (1997)   (Correct)

....Candidates = Candidates ∪ Example; Examples = NewEx; Train = NewTr ∪ RandomSample(Examples, MaxIncSize); until Candidates = ∅. Figure 2: A noise tolerant version of windowing. structure that is inherent in the data. This hypothesis is consistent with the results of (Wirth & Catlett, 1988) and (Catlett, 1991), where the sensitivity of windowing to noisy data sets has been shown empirically. 5 A Noise Tolerant Version of Windowing The windowing algorithm described in (Fürnkranz, 1997), which is only applicable to noise free domains, is based on the observation that rule learning algorithms will ....

Catlett, J. (1991). Megainduction: A test flight. In Birnbaum, L., & Collins, G. (Eds.), Proceedings of the 8th International Workshop on Machine Learning (ML-91), pp. 596--599 Evanston, IL. Morgan Kaufmann.


Efficient Progressive Sampling - Provost, Jensen, Oates (1999)   (14 citations)  (Correct)

....of the model produced by an induction algorithm when given a training set of size n. Learning curves typically have a steeply sloping portion early in the curve, a more gently sloping middle portion, and a plateau late in the curve. The middle portion can be extremely large in some curves (e.g. [2, 3, 6]) and almost entirely missing in others. The plateau occurs when adding additional data instances does not improve accuracy. The plateau, and even the entire middle portion, can be missing from curves when N is not sufficiently large. Conversely, the plateau region can constitute the majority of ....

Catlett, J. Megainduction: A test flight. In Proceedings of the Eighth International Workshop on Machine Learning (1991), Morgan Kaufmann, pp. 596--599.


Learning with Non-uniform Class and Cost Distributions.. - Chan, Stolfo (1998)   (1 citation)  (Correct)

....Until recently, researchers in machine learning have been focused on small data sets. Efficiently learning from large amounts of data has been gaining attention due to the fast growing field of data mining, where data are abundant. Sampling (e.g. (Catlett 1991)) and parallelism (e.g. (Han, Karypis, & Kumar 1997; Provost & Aronis 1996)) are the two main directions in scalable learning. Much of the parallelism work focuses on parallelizing a particular algorithm on a particular parallel architecture. That is, a new algorithm or architecture requires ....

Catlett, J. 1991. Megainduction: A test flight. In Proc. Eighth Intl. Work. Machine Learning, 596--599.


An Extensible Meta-Learning Approach for Scalable and Accurate.. - Chan (1996)   (19 citations)  (Correct)

....so far are generally not scalable to large databases as envisaged by the Genome Project. The complexity of typical machine learning algorithms renders their use infeasible in problems with massive amounts of data (Chan & Stolfo, 1993d). A more concrete testimony of the efficiency problem is from Catlett (1991), who projects that ID3 (Quinlan, 1986) (a popular inductive learning algorithm) on modern machines will take several months to learn from a million records in the flight data set obtained from NASA, which is clearly unacceptable. Moreover, typical learning algorithms like ID3 rely on a monolithic ....

....data replacing some of the data in the old window until all the data are correctly classified. Wirth and Catlett (1988) show that the windowing technique does not significantly improve speed on reliable data. On the contrary, for noisy data, windowing considerably slows down the computation. Catlett (1991) demonstrates that larger amounts of data improves accuracy, but he projects that ID3 (Quinlan, 1986) on modern machines will take several months to learn from a million records in the flight data set obtained from NASA. Using data reduction techniques, Domingos (1996) significantly improves the ....

[Article contains additional citation context not shown here]

Catlett, J. (1991). Megainduction: A test flight. Proc. Eighth Intl. Work. Machine Learning (pp. 596--599).


Toward Scalable Learning with Non-uniform Class and Cost.. - Chan (1998)   (26 citations)  (Correct)

....learning process was not cost sensitive. Until recently, researchers in machine learning have been focused on small data sets. Efficiently learning from large amounts of data has been gaining attention due to the fast growing field of data mining, where data are abundant. Sampling (e.g. [4]) and parallelism (e.g. [13, 17]) are the two main directions in scalable learning. Much of the parallelism work focuses on parallelizing a particular algorithm on a particular parallel architecture. That is, a new algorithm or architecture requires substantial amount of parallel programming ....

J. Catlett. Megainduction: A test flight. In Proc. Eighth Intl. Work. Machine Learning, pages 596--599, 1991.


Machine Learning for the Detection of Oil Spills in.. - Kubat, Holte, Matwin (1998)   (22 citations)  (Correct)

....limited time of the expert, the data available is restricted by financial considerations: images cost hundreds, sometimes thousands of dollars each. We currently have 9 carefully selected images containing a total of 41 oil slicks. While many applications work with large amounts of available data (Catlett, 1991), our domain application is certainly not unique in its data scarcity. For example, in the drug activity application reported by Dietterich, Lathrop and Lozano Perez (1997) the two datasets contain 47 and 39 positive examples respectively. The second critical feature of the oil spill domain can be ....

....As early as the late sixties, Hart (1968) presented a mechanism that removes redundant examples and, somewhat later, Tomek (1976) introduced a simple method to detect borderline and noisy examples. In machine learning the best known sampling technique is windowing (Catlett, 1991). For more recent alternatives, see, for instance, Aha, Kibler and Albert (1991), Zhang (1992), Skalak (1994), Floyd and Warmuth (1995), and Lewis and Catlett (1994). Variations of data reduction techniques, namely those that remove only negative examples, are analyzed by Kubat and Matwin (1997) ....

Catlett, J. (1991). Megainduction: A Test Flight. Proceedings of the Eighth International Workshop on Machine Learning (pp. 596--599), Morgan Kaufmann.


The Effect of Numeric Features on the Scalability of.. - Georgios Paliouras (1995)   (1 citation)  (Correct)

....final concept description is determined only by the nature of the problem, not the feature types nor the number of examples. In a worst case situation, however, it will be shown that the size of the concept description depends on the size of the training set. There are also some empirical results [1] which observe this dependency also in average case situations. 2.2 Specialisation Algorithms The two specialisation algorithms examined here, i.e. C4.5 and PLS1, behave in a very similar way to the basic ID3 algorithm, which is described in Figure 1. Thus, ID3 can be used to derive an ....

J. Catlett. Megainduction: a test flight. In Proceedings of the Eighth International Workshop in Machine Learning, pages 596--599, 1991.


Scalability of Learning Arbiter and Combiner Trees from.. - Chan, Stolfo   (Correct)

....by ID3, CART, and CN2. Crossovers among ID3, CART, and BAYES occur between 100,000 and 1 million examples. With 5 million examples, CART was faster than ID3 and BAYES was the slowest. ID3 completed processing 5 million records in about 2,800 seconds (47 minutes), which is much less than Catlett's [2] projection of several months for ID3 to process 1 million records. The huge gap merits some explanation. First, the projection was made five years ago; state of the art processor speed has much improved since then. Second, the artificial data set has only eight attributes, four of which are ....

J. Catlett. Megainduction: A test flight. In Proc. Eighth Intl. Work. Machine Learning, pages 596--599, 1991.


Rule Induction and Instance-Based Learning: A Unified Approach - Domingos (1995)   (25 citations)  (Correct)

....and splice junctions, where RISE took respectively 119 minutes and 20 minutes. RISE has not been optimized, however, and several important components of the system are amenable to such optimization. Beyond that, windowing and other sampling techniques can be used without expected loss in accuracy [Catlett, 1991]. Also, even though RISE's memory cost is much smaller than that of a simple nearest neighbor classifier, the rule sets it produces are not as compact as those output by C4.5 or CN2. RISE's greater costs will generally be a price well worth paying for the additional accuracy obtained. However, ....

J. Catlett. Megainduction: A test flight. In Proc. 8th Machine Learning Conf., pages 589--604, 1991.


Bagging, Boosting, and C4.5 - Quinlan (1996)   (Correct)

.... classifiers, examples of methods that improve accuracy are: (1) Construction of multi attribute tests using logical combinations (Ragavan and Rendell 1993), arithmetic combinations (Utgoff and Brodley 1990; Heath, Kasif, and Salzberg 1993) and counting operations (Murphy and Pazzani 1991; Zheng 1995); (2) Use of error correcting codes when there are more than two classes (Dietterich and Bakiri 1995); (3) Decision trees that incorporate classifiers of other kinds (Brodley .... [Footnote 1: For extremely large datasets, however, learning time can remain the dominant issue (Catlett 1991; Chan and Stolfo 1995).]

Catlett, J. 1991. Megainduction: a test flight. In Proceedings 8th International Workshop on Machine Learning, 596-599. San Francisco: Morgan Kaufmann.


Heterogeneous Uncertainty Sampling for Supervised Learning - Lewis, Catlett (1994)   (70 citations)  Self-citation (Catlett)   (Correct)

....chosen by this heterogeneous approach, the uncertainty samples yielded classifiers with lower error rates than random samples ten times larger. 1 Introduction Machine learning algorithms have been used to build classification rules from data sets consisting of hundreds of thousands of instances [4]. In some applications unlabeled training instances are abundant but the cost of labeling an instance with its class is high. In the information retrieval application described here the class labels are assigned by a human, but they could also be assigned by a computer simulation [2] or a ....

....version of propositional logic. Because decision rules [27, 34] can be converted into this form (unlike probabilistic models requiring arithmetic) they make a good choice for the final classifier. Another important advantage is that they can be more comprehensible to humans than decision trees [4]. Our databases contain hundreds of thousands of unlabeled instances, so uncertainty sampling is a natural approach. However, as discussed in Section 5, our current decision rule induction software cannot practicably be used for uncertainty sampling from large text databases. We therefore decided ....

J. Catlett. Megainduction: a test flight. In Machine Learning: Proceedings of the Eighth International Workshop, pages 596--599, San Mateo, CA, 1991. Morgan Kaufmann.


Efficient Progressive Sampling for Association Rules - Parthasarathy (2002)   (Correct)

No context found.

Jason Catlett. Megainduction: A test flight. In Machine Learning, pages 596--599, 1991.


Scaling Up Inductive Learning with Massive Parallelism - Provost, Aronis   (11 citations)  (Correct)

No context found.

Catlett, J. (1991b). Megainduction: A test flight. Proceedings of the Eighth International Workshop on Machine Learning (pp. 596--599). San Mateo, CA: Morgan Kaufmann.


A Survey of Methods for Scaling Up Inductive Algorithms - Provost, Kolluri (1999)   (31 citations)  (Correct)

No context found.

Catlett, J. (1991a). Megainduction: A test flight. In Proceedings of the Eighth International Workshop on Machine Learning, pp. 596--599. Morgan Kaufmann.


Toward Scalable Learning with Non-uniform Class and Cost.. - Chan, Stolfo (1998)   (26 citations)  (Correct)

No context found.

Catlett, J. 1991. Megainduction: A test flight. In Proc. Eighth Intl. Work. Machine Learning, 596--599.


CiteSeer.IST - Copyright Penn State and NEC