William W. Cohen. Efficient pruning methods for separate-and-conquer rule learning systems. AT&T Bell Labs Technical Memorandum. Available from the author on request, 1992.


This paper is cited in the following contexts:
Constructing New Attributes for Decision Tree Learning - Zheng (1996)   (3 citations)  (Correct)

....represented as a set of rules. The antecedent of each rule is a conjunction. As for decision tree learning, pruning can be performed after a rule or a ruleset is generated by examining whether some conditions of a rule or some rules of the ruleset can be deleted based on heuristic functions [Cohen, 1993]. The well-known covering algorithm Aq is the basis of the AQ family of systems such as AQ11 [Michalski and Chilausky, 1980b], AQ15 [Michalski et al., 1986], and AQR [Clark and Niblett, 1989]. The rules learned by Aq are in the form of VL1 (Variable-valued Logic 1) which is a ....

....for execution if the dataset is not small, because an algorithm needs to be run n times where n is equal to the dataset size. Here, we use 10-fold cross-validation, a very commonly used method [Towell, Craven, and Shavlik, 1991; Brodley and Utgoff, 1992; Quinlan, 1993b; Ragavan and Rendell, 1993; Cohen, 1993; Aha and Bankert, 1994; Furnkranz and Widmer, 1994; Dietterich and Bakiri, 1995; Kohavi and Li, 1995]. Recent experimental and theoretical results have shown that 10-fold cross-validation is a good method for accuracy estimation, compared with other methods such as leave-one-out and bootstrap ....

W.W Cohen, Efficient pruning methods for separate-and-conquer rule learning systems. Proceedings of the Thirteenth International Joint Conference on Artificial Intelligence, San Mateo, CA: Morgan Kaufmann, 988-994.


Identifying Discourse Markers in Spoken Dialog - Heeman, Byron, Allen (1998)   (2 citations)  (Correct)

....computed using Church's part-of-speech tagger (1988). This gives them a recall rate of 39.0 and a precision of 55.2. Litman (1996) explored using machine learning techniques to automatically learn classification rules for discourse markers. She contrasted the performance of CGRENDEL (Cohen 1992; 1993) with C4.5 (Quinlan 1993). CGRENDEL is a learning algorithm that learns an ordered set of if-then rules that map a condition to its most likely event (in this case discourse or sentential interpretation of potential discourse marker). C4.5 is a decision tree growing algorithm that learns a ....

Cohen, W. W. 1993. Efficient pruning methods for separate-and-conquer rule learning systems. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI '93).


A Survey of Methods for Scaling Up Inductive Algorithms - Provost, Kolluri (1999)   (31 citations)  (Correct)

....pruning systems generally do not scale well. For example, the rule learning variant of C4.5, C4.5rules, has been reported sometimes to require O(e^3) time (Cohen 1995; Domingos 1996b). Some algorithms effective at finding high accuracy rule sets have O(e^4) time complexity in noisy domains (Cohen 1993). Kufrin (1997) describes speeding up C4.5rules with parallel processing, which we discuss later. Furnkranz and Widmer (1994) show, with their incremental reduced error pruning (IREP) algorithm, that significant speedups can be obtained by pruning each rule as it is learned and then applying a ....

Cohen, W. W. (1993). Efficient pruning methods for separate-and-conquer rule learning systems. In Thirteenth International Joint Conference on Artificial Intelligence, pp. 988--994. Morgan Kaufmann.


Top-Down Pruning in Relational Learning - Fürnkranz (1994)   (Correct)

....on one part of the training instances and to subsequently delete several parts of this theory in order to improve performance on the remaining set. The most prominent use of this method in ILP is the adaptation of Reduced Error Pruning [Brunk and Pazzani, 1991]. However, it has been shown in [Cohen, 1993] that REP can be very inefficient, because most of the time is wasted for generating clauses that explain noisy examples and subsequently pruning these clauses. We solve this problem by adapting the relational learning algorithm Fossil to combine pre-pruning and post-pruning by first performing a ....

....The resulting concept is then generalized by deleting literals and clauses from the theory until all possible deletions would result in a decrease of predictive accuracy, measured on the pruning set. While this method proved to be very effective in avoiding noise fitting in several domains, [Cohen, 1993] has shown that REP is a very costly process. Its time complexity on random data is as bad as Ω(n^2 log n) for generating a concept description from n examples and Ω(n^4 log n) for pruning the resulting set of rules. [Cohen, 1993] has then suggested a more efficient ....

[Article contains additional citation context not shown here]

William W. Cohen. Efficient pruning methods for separate-and-conquer rule learning systems. In Proceedings of the 13th International Joint Conference on Artificial Intelligence, pages 988--994, Chambery, France, 1993.


A Comparison of Pruning Methods for Relational Concept Learning - Fürnkranz (1994)   (Correct)

....that has been learned from the growing set is then simplified by deleting conditions and rules from the theory until any further deletion would result in a decrease of predictive accuracy measured on the pruning set. However, this approach has several disadvantages, most notably efficiency. [Cohen, 1993] has shown that REP has a time complexity of Ω(n^4) on purely random data. Therefore [Cohen, 1993] proposed Grow, a new pruning algorithm based on a technique used in the Grove learning system [Pagallo and Haussler, 1990]. Like REP, Grow first finds a theory that overfits the data. ....

....until any further deletion would result in a decrease of predictive accuracy measured on the pruning set. However, this approach has several disadvantages, most notably efficiency. [Cohen, 1993] has shown that REP has a time complexity of Ω(n^4) on purely random data. Therefore [Cohen, 1993] proposed Grow, a new pruning algorithm based on a technique used in the Grove learning system [Pagallo and Haussler, 1990]. Like REP, Grow first finds a theory that overfits the data. But instead of pruning the intermediate theory until any further deletion results in a decrease in accuracy on ....

[Article contains additional citation context not shown here]

William W. Cohen. Efficient pruning methods for separate-and-conquer rule learning systems. In Proceedings of the 13th International Joint Conference on Artificial Intelligence, pages 988--994, Chambery, France, 1993.


Pruning Algorithms for Rule Learning - Fürnkranz (1997)   (Correct)

....Figure 1 shows a schematic depiction of this process. The quality of the found rules and conditions is commonly evaluated on a separate set of training examples that have not been seen during learning. Post-pruning algorithms include Reduced Error Pruning (REP) (Brunk and Pazzani 1991) and Grow (Cohen 1993). Both have been shown to be very effective in noise handling. However, they are also inefficient, because they waste time by learning an overfitting concept description and subsequently pruning a significant portion of its rules and conditions. One remedy for this problem is to combine pre and ....

....REP is quite effective in raising predictive accuracy in noisy domains (Brunk and Pazzani 1991), it has several shortcomings, which we will discuss in this section. In particular we will suggest that post-pruning is incompatible with the separate-and-conquer learning strategy. Efficiency. In (Cohen 1993) it was shown that the worst case time complexity of REP is as bad as Ω(n^4) on random data (n is the number of examples). The growing of the initial concept, on the other hand, is only Ω(n^2 log n). Therefore in the long run the costs of pruning will by far outweigh ....

[Article contains additional citation context not shown here]

Cohen, W. W. (1993). Efficient pruning methods for separate-and-conquer rule learning systems. In Proceedings of the 13th International Joint Conference on Artificial Intelligence, Chambery, France, pp. 988--994.


On Growing Better Decision Trees from Data - Murthy (1997)   (17 citations)  (Correct)

....suggested a pruning method that is based on viewing the decision tree as an encoding for the training data. Use of dynamic programming to prune trees optimally and efficiently has been explored recently in [33]. A few studies have been done to study the relative effectiveness of pruning methods [324, 91, 125]. Just as in the case of splitting criteria, no single pruning method has been adjudged to be superior to the others. The choice of a pruning method depends on the size of the training set, availability of extra data for pruning, etc. 2.5 Other issues. Tree construction involves many issues other ....

W.W. Cohen. Efficient pruning methods for separate-and-conquer rule learning systems. In IJCAI-93 [221], pages 988--994. Editor: Ruzena Bajcsy.


Pruning Methods for Rule Learning Algorithms - Fürnkranz (1994)   (Correct)

....handling in relational rule learning algorithms have been proposed. The classic approaches to pruning are based on pre-pruning (Foil [Quinlan, 1990], mFoil [Dzeroski and Bratko, 1992], or Fossil [Furnkranz, 1994b]) and post-pruning (Reduced Error Pruning (REP) [Brunk and Pazzani, 1991] and Grow [Cohen, 1993]). More recently, approaches have been proposed that combine (MDL-Grow [Cohen, 1993] and Top-Down Pruning (TDP) [Furnkranz, 1994c]) and integrate (Incremental Reduced Error Pruning (I-REP) [Furnkranz and Widmer, 1994]) these two basic methods. We will present and discuss a variety of these pruning ....

....to pruning are based on pre-pruning (Foil [Quinlan, 1990], mFoil [Dzeroski and Bratko, 1992], or Fossil [Furnkranz, 1994b]) and post-pruning (Reduced Error Pruning (REP) [Brunk and Pazzani, 1991] and Grow [Cohen, 1993]). More recently, approaches have been proposed that combine (MDL-Grow [Cohen, 1993] and Top-Down Pruning (TDP) [Furnkranz, 1994c]) and integrate (Incremental Reduced Error Pruning (I-REP) [Furnkranz and Widmer, 1994]) these two basic methods. We will present and discuss a variety of these pruning algorithms in section 2 and in particular show how they are related to each other. ....

[Article contains additional citation context not shown here]

William W. Cohen. Efficient pruning methods for separate-and-conquer rule learning systems. In Proceedings of the 13th International Joint Conference on Artificial Intelligence, pages 988--994, Chambery, France, 1993.


Classifying Cue Phrases in Text and Speech Using Machine Learning - Litman (1994)   (14 citations)  (Correct)

....the data to construct rules that best predicted the classifications from the features. This paper examines the utility of machine learning for automating the construction of rules for classifying cue phrases. A set of experiments are conducted that use two machine learning programs, cgrendel (Cohen 1992; 1993) and C4.5 (Quinlan 1986; 1987) to induce classification rules from sets of preclassified cue phrases and their features. To support a quantitative and comparative evaluation of the automated and manual approaches, both the error rates and the content of the manually derived and learned rulesets ....

....class in the corpus (sentential) has an error rate of 39 and 41 for the classifiable tokens and the classifiable nonconjuncts, respectively. Experiments using Machine Induction. This section describes experiments that use the machine learning programs C4.5 (Quinlan 1986; 1987) and cgrendel (Cohen 1992; 1993) to automatically induce cue phrase classification rules from both the data of (Hirschberg and Litman 1993) and an extension of this data. cgrendel and C4.5 are similar to each other and to other learning methods (e.g. neural networks) in that they induce rules from preclassified examples. Each ....

[Article contains additional citation context not shown here]

Cohen, W. W. 1993. Efficient pruning methods for separate-and-conquer rule learning systems. In Proceedings of the 13th International Joint Conference on Artificial Intelligence.


Discovering Robust Knowledge from Databases that Change - Hsu (1998)   (1 citation)  (Correct)

....are interpreted using the closed world assumption (CWA), which states that information not explicitly present in the database is taken to be false. For a Horn clause rule C ← A, its predictive accuracy is usually defined as the conditional probability Pr(C|A) given a randomly chosen data instance (Cohen, 1993, Cussens, 1993, Furnkranz and Widmer, 1994, Lavrac and Dzeroski, 1994). In other words, it concerns the probability that the rule is valid with regard to newly inserted data. However, databases also change by deletions or updates, and in a database under the CWA, these changes may also affect ....

....to generate overly specific rules, but taking the length and robustness of rules into account in rule construction could be too expensive. This is because the search space of rule construction is already huge and evaluating robustness is not trivial. Previous work in classification rule induction (Cohen, 1993, Furnkranz and Widmer, 1994) also shows that dividing a learning process into a two-stage rule construction and rule pruning can yield better results in terms of classification accuracy as well as the efficiency of learning. Another example of rule pruning is the speedup learning system ....

[Article contains additional citation context not shown here]

Cohen, W. W. 1993. Efficient pruning methods for separate-and-conquer rule learning systems.
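
The excerpts above define the predictive accuracy of a rule C ← A as the conditional probability Pr(C|A) for a randomly chosen data instance. As a rough illustration only, that quantity can be estimated from a sample by relative frequency. Everything named in the sketch below (the function `predictive_accuracy`, the `ships` records, and the two predicates) is a hypothetical example and does not come from any of the cited papers.

```python
from typing import Callable, Dict, List, Optional

Instance = Dict[str, object]
Predicate = Callable[[Instance], bool]


def predictive_accuracy(
    antecedent: Predicate,
    consequent: Predicate,
    instances: List[Instance],
) -> Optional[float]:
    """Estimate Pr(C|A) as the fraction of instances satisfying A that also satisfy C.

    Returns None when no instance satisfies the antecedent, because the
    conditional probability cannot be estimated from such a sample.
    """
    covered = [x for x in instances if antecedent(x)]
    if not covered:
        return None
    satisfied = sum(1 for x in covered if consequent(x))
    return satisfied / len(covered)


# Hypothetical data: does "type = tanker" imply "length >= 200"?
ships = [
    {"type": "tanker", "length": 300},
    {"type": "tanker", "length": 250},
    {"type": "tanker", "length": 90},
    {"type": "ferry", "length": 120},
]

estimate = predictive_accuracy(
    lambda x: x["type"] == "tanker",
    lambda x: x["length"] >= 200,
    ships,
)
```

Here two of the three tankers satisfy the consequent, so the estimate is 2/3. The ferry does not enter the calculation because it is not covered by the antecedent.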


Learning Effective And Robust Knowledge For Semantic Query.. - Hsu (1997)   (1 citation)  (Correct)

....instead of with an entire database state. This difference is significant in databases that are interpreted using the closed world assumption (CWA). For a Horn clause rule C ← A, predictive accuracy is usually defined as the conditional probability Pr(C|A) given a randomly chosen data instance [Cohen, 1993, Cohen, 1995b, Cussens, 1993, Furnkranz and Widmer, 1994, Lavrac and Dzeroski, 1994]. In other words, it concerns the probability that the rule is valid with regard to a newly inserted data. However, databases also change by updates and deletions, and in a closed world database they may affect ....

....Measures. Robustness and predictive accuracy are closely related. For a Horn clause rule C ← A, predictive accuracy is usually referred to as the conditional probability Pr(C|A) given a randomly chosen data instance [Cussens, 1993, Furnkranz and Widmer, 1994, Lavrac and Dzeroski, 1994, Cohen, 1993, Cohen, 1995b]. In other words, it concerns the probability that the rule is valid with regard to a newly ....

                          R1      R2      R3      R4
  Robustness (w/o log)    0.9996  0.9482  0.9967  0.9847
  Prob. of consistency    0.9802  0.0699  0.8476  0.4626
  Robustness (w/o log)    0.9924  0.9871  0.9683  0.9746
  Prob. of consistency    0.6829  0.5225  0.1998  ....

[Article contains additional citation context not shown here]

William W. Cohen. Efficient pruning methods for separate-and-conquer rule learning systems. In Proceedings of the 13th International Joint Conference on Artificial Intelligence (IJCAI-93), Chambery, France, 1993.


Discovering Robust Knowledge from Dynamic Closed-World Data - Hsu, Knoblock (1996)   (2 citations)  (Correct)

....[Table 1: Schema and rules of an example database] ....is significant in databases that are interpreted using the closed world assumption. For a Horn clause rule C ← A, predictive accuracy is usually defined as the conditional probability Pr(C|A) given a randomly chosen data instance (Cohen 1993; 1995; Cussens 1993; Furnkranz & Widmer 1994; Lavrac & Dzeroski 1994). In other words, it concerns the probability that the rule is valid with regard to a newly inserted data. However, databases also change by updates and deletions, and in a closed world database, they may affect the validity of ....

....to generate overly specific rules, but taking the length and robustness of rules into account in rule construction could be too expensive. This is because the search space of rule construction is already huge and evaluating robustness is not trivial. Previous work in classification rule induction (Cohen 1993; 1995; Furnkranz & Widmer 1994) shows that dividing a learning process into a two-stage rule construction and pruning can yield better results in terms of classification accuracy as well as the efficiency of learning. These results may not apply directly to our rule discovery problem, ....

Cohen, W. W. 1993. Efficient pruning methods for separate-and-conquer rule learning systems. In Proceedings of the 13th International Joint Conference on Artificial Intelligence (IJCAI-93).


A Tight Integration of Pruning and Learning - Fürnkranz (1995)   (1 citation)  (Correct)

....inefficient, because the overfitting theory it generates in its first pass can be much more complex than the final theory that is left after the post-pruning phase. A lot of work is wasted in learning and subsequently pruning superfluous literals and clauses. This argument has been formalized in (Cohen 1993), where it was shown that the growing phase of REP has a time ....

  procedure REP(Examples, SplitRatio)
    SplitExamples(SplitRatio, Examples, GrowingSet, PruningSet)
    Theory = SeparateAndConquer(GrowingSet)
    loop
      NewTheory = SimplifyTheory(Theory, PruningSet)
      if Accuracy(NewTheory, PruningSet) ....

....of (Furnkranz and Widmer 1994) where we estimated I-REP to have a time complexity of O(n log^2 n) on random data. Thus both algorithms are significantly faster than the initial overfitting phase of post-pruning algorithms which typically has a time complexity of Ω(n^2 log n) (Cohen 1993) (see also the first column of table 2). Asymptotically I²-REP even seems to be a little more efficient than I-REP, as can be expected from the fact that it does not have to learn overfitting clauses. However, the presented results do not allow this conclusion. Our results also confirm that ....

[Article contains additional citation context not shown here]

Cohen, W. W. (1993). Efficient pruning methods for separate-and-conquer rule learning systems. In Proceedings of the 13th International Joint Conference on Artificial Intelligence, Chambery, France, pp. 988--994.
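
The REP procedure quoted in the first excerpt of this entry is cut off partway through its loop. The sketch below shows one common reading of that loop, under several assumptions that are not taken from the source: the overfit theory is supplied directly instead of being produced by SeparateAndConquer, a theory is a list of conjunctive rules over attribute-value pairs, a simplification deletes one rule or one condition, and the loop stops as soon as the best simplification lowers accuracy on the pruning set. All names (`covers`, `accuracy`, `simplifications`, `reduced_error_pruning`) are hypothetical.

```python
from typing import Dict, Iterator, List, Tuple

Condition = Tuple[str, str]           # (attribute, required value)
Rule = Tuple[Condition, ...]          # conjunction of conditions
Theory = List[Rule]                   # disjunction of rules
Example = Tuple[Dict[str, str], bool] # (attributes, is_positive)


def covers(rule: Rule, attrs: Dict[str, str]) -> bool:
    """A rule covers an instance when every one of its conditions holds."""
    return all(attrs.get(a) == v for a, v in rule)


def accuracy(theory: Theory, examples: List[Example]) -> float:
    """Fraction of examples whose label matches 'some rule covers it'."""
    correct = sum(any(covers(r, x) for r in theory) == y for x, y in examples)
    return correct / len(examples)


def simplifications(theory: Theory) -> Iterator[Theory]:
    """Yield each theory obtained by deleting one rule or one condition."""
    for i, rule in enumerate(theory):
        yield theory[:i] + theory[i + 1:]
        if len(rule) > 1:  # assumption: never leave a rule with no conditions
            for j in range(len(rule)):
                shorter = rule[:j] + rule[j + 1:]
                yield theory[:i] + [shorter] + theory[i + 1:]


def reduced_error_pruning(theory: Theory, pruning_set: List[Example]) -> Theory:
    """Greedily simplify until the best deletion would lower pruning-set accuracy."""
    best = accuracy(theory, pruning_set)
    while theory:
        candidate = max(simplifications(theory),
                        key=lambda t: accuracy(t, pruning_set))
        candidate_acc = accuracy(candidate, pruning_set)
        if candidate_acc < best:
            break
        theory, best = candidate, candidate_acc
    return theory


# Hypothetical overfit theory: the second rule only explains noise.
overfit = [(("color", "red"),), (("color", "blue"), ("size", "big"))]
pruning_set = [
    ({"color": "red", "size": "small"}, True),
    ({"color": "red", "size": "big"}, True),
    ({"color": "blue", "size": "big"}, False),
    ({"color": "green", "size": "small"}, False),
]
pruned = reduced_error_pruning(overfit, pruning_set)
```

On this toy input the noisy second rule is deleted and the single rule on `color = red` remains. Each pass re-evaluates every candidate theory on the whole pruning set, which gives a feel for why the excerpts describe the pruning phase as the expensive part.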


Top-Down Pruning in Relational Learning - Fürnkranz (1994)   (Correct)

....description on one part of the training instances and to subsequently delete several parts of this theory in order to improve performance on the remaining set. The most prominent use of this method in ILP is the adaptation [3] of Reduced Error Pruning (REP) [17]. However, it has been shown in [4] that REP can be very inefficient, because most of the time is wasted for generating clauses that explain noisy examples and subsequently pruning these clauses. We attempt to solve this problem by adapting the relational learning algorithm Fossil to combine pre-pruning and post-pruning by first ....

....The resulting concept is then generalized by deleting literals and clauses from the theory until all possible deletions would result in a decrease of predictive accuracy, measured on the pruning set. While this method proved to be very effective in avoiding noise fitting in several domains, [4] has shown that REP is a very costly process. Its time complexity on random data is as bad as Ω(n^2 log n) for generating a concept description from n examples and Ω(n^4 log n) for pruning the resulting set of rules. [4] has then suggested a more efficient pruning ....

[Article contains additional citation context not shown here]

William W. Cohen, `Efficient pruning methods for separate-and-conquer rule learning systems', in Proceedings of the 13th International Joint Conference on Artificial Intelligence, pp. 988--994, Chambery, France, (1993).


The Computational Processing of Intonational Prominence: A.. - Nakatani (1997)   (2 citations)  (Correct)

....first implemented rule-based system whose performance, in terms of computational efficiency and accuracy, rivals that of the well studied decision tree systems. In machine learning experiments on cue phrase disambiguation, Litman (1996) utilized both C4.5 and a precursor to RIPPER named CGRENDEL (Cohen, 1993) and reported comparable performance for the two systems. Cohen (1995) tested RIPPER and C4.5 on a suite of benchmark learning problems, and reported that with 93 confidence, the probability is 0.5 that RIPPER's measured error rate will be less than or equal to that of C4.5rules, which is a ....

Cohen, William W. 1993. Efficient pruning methods for separate-and-conquer rule learning systems. In Proceedings of the Thirteenth International Joint Conference on Artificial Intelligence.


Automatic Construction of Decision Trees from Data: A.. - Murthy (1997)   (37 citations)  (Correct)

....that is based on viewing the decision tree as an encoding for the training data was suggested by Forsyth et al. [112]. Use of dynamic programming to prune trees optimally and efficiently has been explored in [25]. A few studies have been done to study the relative effectiveness of pruning methods [247, 62, 91]. Just as in the case of splitting criteria, no single ad hoc pruning method has been adjudged to be superior to the others. The choice of a pruning method depends on factors such as the size of the training set and availability of additional data for pruning. 5. Other issues. Tree construction ....

W.W. Cohen. Efficient pruning methods for separate-and-conquer rule learning systems. In IJCAI-93 [160], pages 988--994. Editor: Ruzena Bajcsy.


Inductive Logic Programming - Fürnkranz   (Correct)

....regularities of the domain are discarded. The quality of the found clauses and literals is commonly evaluated on a separate set of training examples that have not been seen during learning. We review two common post-pruning methods, Reduced Error Pruning (REP) (Brunk and Pazzani 1991) and Grow (Cohen 1993), a variant that uses a top-down search. While both methods prove to be very effective in noise handling, they are also very inefficient. One of the reasons for the inefficiency of post-pruning methods is that the intermediate theory resulting from the initial overfitting phase can be much more ....

Cohen, W. W. (1993). Efficient pruning methods for separate-and-conquer rule learning systems.


Incremental Reduced Error Pruning - Fürnkranz, Widmer (1994)   (2 citations)  (Correct)

....generalized by deleting literals and clauses from the theory until any further deletion would result in a decrease of predictive accuracy measured on the pruning set. However, this approach has several disadvantages, which we will highlight in section 2. Section 3 briefly presents the approach of [Cohen, 1993] designed to solve some of these problems. In section 4 we propose Incremental Reduced Error Pruning a method that integrates pre and postpruning as an alternative solution. Section 5 then reports some experiments with two versions of this algorithm. 2 SOME PROBLEMS WITH REDUCED ERROR ....

....algorithm. 2 SOME PROBLEMS WITH REDUCED ERROR PRUNING. Reduced Error Pruning (REP) [Brunk and Pazzani, 1991] has proven to be quite effective in raising predictive accuracy in noisy domains. However, this method has several shortcomings, which we will discuss in this section. 2.1 EFFICIENCY. In [Cohen, 1993] it was shown that the worst case time complexity of REP is as bad as Ω(n^4) on random data (n is the number of examples). The growing of the initial concept, on the other hand, is only Ω(n^2 log n). The derivation of these numbers as given in [Cohen, 1993] rests on the assumption that for ....

[Article contains additional citation context not shown here]

William W. Cohen. Efficient pruning methods for separate-and-conquer rule learning systems. In Proceedings of the 13th International Joint Conference on Artificial Intelligence, pages 988--994, Chambery, France, 1993.


Efficient Pruning Methods for Relational Learning - Fürnkranz (1994)   (Correct)

....1991] argue that this more efficient operator is suitable for separate and conquer rule learning algorithms, because the order in which the literals of a clause are considered for pruning is inverse to the order in which they have been learned. delete last sequence: This operator, used e.g. in [Cohen, 1993], selects the best of all theories that result from deleting a sequence of literals from the end of the clause. Each iteration is equally expensive as with the delete any literal operator, but one may arrive faster at the final theory, because this operator can prune several literals in the same ....

....e.g. the Unpruned column for size 750 actually contains results for learning an unpruned theory from about 500 examples. Comparing these values to the values of the other algorithms shows that pruning is necessary in this domain. 4.2 PROBLEMS WITH REDUCED ERROR PRUNING. 4.2.1 Efficiency. In [Cohen, 1993] it was shown that the worst case time complexity of REP is as bad as Ω(n^4) on random data (n is the number of examples). The growing of the initial concept, on the other hand, is only Ω(n^2 log n). The derivation of these numbers as given in [Cohen, 1993] rests on the ....

[Article contains additional citation context not shown here]

William W. Cohen. Efficient pruning methods for separate-and-conquer rule learning systems. In Proceedings of the 13th International Joint Conference on Artificial Intelligence, pages 988--994, Chambery, France, 1993.


Discovering Robust Knowledge from Databases that Change - Hsu, Knoblock (1998)   (1 citation)  (Correct)

....that are interpreted using the closed world assumption (CWA). That is, information not explicitly present in the database is taken to be false. For a Horn-clause rule C ← A, its predictive accuracy is usually defined as the conditional probability Pr(C|A) given a randomly chosen data instance [Cohen, 1993, Cohen, 1995, Cussens, 1993, Furnkranz and Widmer, 1994, Lavrac and Dzeroski, 1994]. In other words, it concerns the probability that the rule is valid with regard to a newly inserted data. However, databases also change by deletions or updates, and in a closed world database, they may affect ....

....generate overly specific rules, but taking the length and robustness of rules into account in rule construction could be too expensive. This is because the search space of rule construction is already huge and evaluating robustness is not trivial. Previous work in classification rule induction [Cohen, 1993, Cohen, 1995, Furnkranz and Widmer, 1994] also shows that dividing a learning process into a two-stage rule construction and rule pruning can yield better results in terms of classification accuracy as well as the efficiency of learning. Another example of rule pruning is the speedup learning ....

[Article contains additional citation context not shown here]

William W. Cohen. Efficient pruning methods for separate-and-conquer rule learning systems. In Proceedings of the 13th International Joint Conference on Artificial Intelligence (IJCAI-93), Chambery, France, 1993.


Incremental Reduced Error Pruning - Fürnkranz, Widmer (1994)   (2 citations)  (Correct)

....generalized by deleting literals and clauses from the theory until any further deletion would result in a decrease of predictive accuracy measured on the pruning set. However, this approach has several disadvantages, which we will highlight in section 2. Section 3 briefly presents the approach of [Cohen, 1993] designed to solve some of these problems. In section 4 we propose Incremental Reduced Error Pruning a method that integrates pre and post pruning as an alternative solution. Section 5 then reports some experiments with two versions of this algorithm. 2 Some Problems with Reduced Error ....

....algorithm. 2 Some Problems with Reduced Error Pruning. Reduced Error Pruning (REP) [Brunk and Pazzani, 1991] has proven to be quite effective in raising predictive accuracy in noisy domains. However, this method has several shortcomings, which we will discuss in this section. 2.1 Efficiency. In [Cohen, 1993] it was shown that the worst case time complexity of REP is as bad as Ω(n^4) on random data (n is the number of examples). The growing of the initial concept, on the other hand, is only Ω(n^2 log n). The derivation of these numbers as given in [Cohen, 1993] rests on the ....

[Article contains additional citation context not shown here]

William W. Cohen. Efficient pruning methods for separate-and-conquer rule learning systems. In Proceedings of the 13th International Joint Conference on Artificial Intelligence, pages 988--994, Chambery, France, 1993.


Linear-Time Rule Induction - Domingos   (10 citations)  (Correct)

....has a large negative impact on windowing, a technique often used to speed up C4.5/C4.5RULES for large datasets (Catlett 1991). In algorithms that use reduced error pruning as the simplification technique (Brunk & Pazzani 1991), the presence of noise causes running time to become O(e^4 log e) (Cohen 1993). Furnkranz and Widmer (1994) have proposed incremental reduced error pruning (IREP), an algorithm that prunes each rule immediately after it is grown, instead of waiting until the whole rule set has been induced. Assuming the final rule set is of constant size, IREP reduces running time to O(e ....

Cohen, W. W. 1993. Efficient pruning methods for separate-and-conquer rule learning systems. In Proceedings of the Thirteenth International Joint Conference on Artificial Intelligence, 988--994. Chambery, France: Morgan Kaufmann.


Irrelevant Features and the Subset Selection Problem - John, Kohavi, Pfleger (1994)   (270 citations)  (Correct)

....The above heuristic increases the overall running time of the black box induction algorithm by a multiplicative factor of O(m^2) in the worst case, where m is the number of features. While this may be impractical in some situations, it does not depend on n, the number of instances. As noted in Cohen (1993), divide and conquer systems need much more time for pruning than for growing the structure (by a factor of O(n^2) for random data). By pruning after feature subset selection, pruning may be much faster. 4 EXPERIMENTAL RESULTS. In order to evaluate the feature subset selection using the ....

Cohen, W. W. 1993. Efficient pruning methods for separate-and-conquer rule learning systems. In 13th International Joint Conference on Artificial Intelligence, 988--994. Morgan Kaufmann.


A Tight Integration of Pruning and Learning (Extended Abstract) - Fürnkranz   (Correct)

....inefficient, because the overfitting theory it generates in its first pass can be much more complex than the final theory that is left after the post-pruning phase. A lot of work is wasted in learning and subsequently pruning superfluous literals and clauses. This argument has been formalized in [Cohen, 1993], where it was shown that the growing phase of REP has a time complexity of Ω(n^2 log n) and that its pruning phase has a time complexity of Ω(n^4) (where n is the size of the training set). [Furnkranz and Widmer, 1994] point out another problem with REP that is caused by the ....

William W. Cohen. Efficient pruning methods for separate-and-conquer rule learning systems. In Proceedings of the 13th International Joint Conference on Artificial Intelligence, pages 988--994, Chambery, France, 1993.


Efficient Pruning Methods - For Separate-And-Conquer Rule   Self-citation (Cohen)   (Correct)

....[Table 1: Domains used in testing] ....criterion would improve runtime performance without adversely impacting generalization performance. Space limitations preclude a detailed description of these heuristics. The interested reader is referred to the expanded version of this paper [Cohen, 1992b]. 5 Experimental Results. To summarize, we have argued that reduced error pruning is expensive for large, noisy datasets, and proposed two alternative techniques: a simplification algorithm that has a lower asymptotic complexity on noisy data, and an MDL-based pre-pruning strategy which reduces ....

....ten times faster in the prospective test. We also undertook somewhat more detailed studies of CPU time performance on selected benchmarks. Figure 3 shows the results for the rds dataset, which is typical (similar results for three other benchmarks are given in the expanded version of this paper [Cohen, 1992b]). These results serve as additional confirmation of the main claim of the paper: that the runtime of Grow and MDLGrow is asymptotically faster than reduced error pruning. Weighting the bridge problems as one, the average error rate is 10.97 for reduced error pruning, 10.37 for Grow, and ....

William W. Cohen. Efficient pruning methods for separate-and-conquer rule learning systems. AT&T Bell Labs Technical Memorandum. Available from the author on request, 1992.


Context-Sensitive Learning Methods for Text Categorization - Cohen, Singer (1998)   (98 citations)  Self-citation (Cohen)   (Correct)

....of two main stages. The first stage is a greedy process which constructs an initial ruleset. This stage is based on an earlier rule learning algorithm called incremental reduced error pruning (IREP) [Fürnkranz and Widmer, 1994], which in turn is based on earlier work due to Quinlan [1990], Cohen [1993], Brunk and Pazzani [1991], and Pagallo and Haussler [1990]. The second stage is an optimization phase which attempts to further improve the compactness and accuracy of the ruleset. Stage 1: Building an initial ruleset. The first stage of RIPPER is a variant of IREP that we call IREP*. IREP is ....

William W. Cohen. Efficient pruning methods for separate-and-conquer rule learning systems. In Proceedings of the 13th International Joint Conference on Artificial Intelligence, Chambery, France, 1993.


Fast Effective Rule Induction - Cohen (1995)   (231 citations)  Self-citation (Cohen)   (Correct)

.... and Cameron-Jones, 1993]. Certain types of prior knowledge can also be easily communicated to rule learning systems [Cohen, 1994; Pazzani and Kibler, 1992]. One weakness with rule learning systems is that they often scale relatively poorly with the sample size, particularly on noisy data [Cohen, 1993]. Given the prevalence of large noisy datasets in real world applications, this problem is of critical importance. The goal of this paper is to develop propositional rule learning algorithms that perform efficiently on large noisy datasets, that extend naturally to first order representations, ....

....reduction of error on the pruning set. Simplification ends when applying any pruning operator would increase error on the pruning set. REP for rules usually does improve generalization performance on noisy data [Pagallo and Haussler, 1990; Brunk and Pazzani, 1991; Weiss and Indurkhya, 1991; Cohen, 1993; Fürnkranz and Widmer, 1994]; however, it is computationally expensive for large datasets. In previous work [Cohen, 1993] we showed that REP requires O(n^4) time, given sufficiently noisy data; in fact, even the initial phase of overfitting the training data requires O(n^2) time. We ....

[Article contains additional citation context not shown here]

William W. Cohen. Efficient pruning methods for separate-and-conquer rule learning systems. In Proceedings of the 13th International Joint Conference on Artificial Intelligence, Chambery, France, 1993.


Applying Machine Learning Techniques in Air Quality Prediction - Kalapanidas, Avouris   (Correct)

No context found.

Cohen, W. (1993). Efficient pruning methods for separate-and-conquer rule learning systems. In Proceedings of the 13th International Joint Conference on Artificial Intelligence, pp. 988-994. Morgan Kaufmann.

