Blockeel, H., et al.: Executing query packs in ILP. In: Cussens, J. and Frisch, A. (eds.): Inductive Logic Programming, 10th International Conference, ILP2000, London, U.K. Lecture Notes in Artificial Intelligence, Vol. 1866, pages 60--77. Springer Verlag, Heidelberg, Germany (2000)
....examples being covered. Unfortunately, the search space can grow very quickly in ILP applications. Several techniques have therefore been proposed to improve search efficiency. Such techniques include improving computation times at individual nodes [4, 26], better representations of the search [3], sampling the search space [27, 28, 32] and parallelism [8, 13, 19]. Parallelism can be obtained from very different alternative approaches, such as dividing the search tree, dividing the examples, or even through performing cross validation in parallel [31]. An intriguing alternative approach ....
H. Blockeel, L. Dehaspe, B. Demoen, G. Janssens, J. Ramon, and H. Vandecasteele. Executing query packs in ILP. In J. Cussens and A. Frisch, editors, Proceedings of the 10th International Conference on Inductive Logic Programming, volume 1866 of Lecture Notes in Artificial Intelligence, pages 60-77. Springer-Verlag, 2000.
....typically arise when data are not loaded into main memory but reside on disk; or in the context of inductive logic programming where evaluating a test on an example can be expensive. • An implementation of the proposed algorithm has been included in the inductive logic programming tool ACE [2], which contains a component for decision tree induction [1]. Using this implementation we have been able to validate our complexity analysis empirically, and to estimate speedup factors that can actually occur. In the best case the full cross validation generated less than 30 overhead, instead ....
H. Blockeel, B. Demoen, L. Dehaspe, G. Janssens, J. Ramon, and H. Vandecasteele. Executing query packs in ILP. In J. Cussens and A. Frisch, editors, Proceedings of the 10th International Conference in Inductive Logic Programming, volume 1866 of Lecture Notes in Artificial Intelligence, pages 60--77, London, UK, July 2000. Springer.
....the course of the project. For more information and details on the Aladin project, we refer to the webpage http: www. cs. kuleuven. ac. be dtai Aladin . Bibliographical notes: This chapter is based on the K.U. Leuven deliverables of the Aladin project and the joint work of our research group in [Blockeel et al. 2000b]. 7.2 Extended Input Interface: In this section, we have a closer look at the input of complex data (relational knowledge, background information, and so on) into the ILP systems. We have designed an Extended Input Interface such that an ILP engine can load data directly from a (relational) ....
....allows for significant speedups but changes the results. More recently, some new (exact) optimizations have been proposed. Without altering the outcome of the coverage test of a hypothesis w.r.t. an example, the execution is optimized through the efficient execution of packs of queries/hypotheses [Blockeel et al. 2000a; Blockeel et al. 2002] and some advanced transformations of queries which preserve the equivalence [Santos Costa et al. 2000; Blockeel et al. 2000b]. These issues will be discussed in the following subsections. A last important factor is the implementation of the complete system. Most ILP ....
[Article contains additional citation context not shown here]
H. Blockeel, B. Demoen, L. Dehaspe, G. Janssens, J. Ramon, and H. Vandecasteele. Executing query packs in ILP. In J. Cussens and A. Frisch, editors, Proceedings of the 10th International Conference in Inductive Logic Programming, volume 1866 of Lecture Notes in Artificial Intelligence, pages 60-77, London, UK, July 2000. Springer.
No context found.
H. Blockeel, B. Demoen, L. Dehaspe, G. Janssens, J. Ramon, and H. Vandecasteele. Executing query packs in ILP. In J. Cussens and A. Frisch, editors, Proceedings of the 10th International Conference in Inductive Logic Programming, volume 1866 of Lecture Notes in Artificial Intelligence, pages 60-77, London, UK, July 2000. Springer.
.... neither to compare our collectors with those of other Prolog systems, nor compare collection algorithms (e.g. sliding versus copying). The system chosen to implement our new garbage collection schemes is ilProlog, a Prolog system that is heavily used for machine learning applications; see e.g. [6]. ilProlog is WAM-based; see [19, 1]. Its main deviation from the WAM is that free variables are always put on the heap; see for instance [10]. Another deviation, like in many other Prolog systems, is to have separate stacks for the environments and the choice points. So we expect that our ....
.... and performs a typical run of an inductive learning tool written in ilProlog: without garbage collection, this application would not be able to run (this can also be seen by the number of collected garbage cells). The program uses abstract machine extensions that efficiently support query packs (see [6], also for more detail on this application) and thus does not run without changes in standard Prolog systems. The first new collector we evaluate is referred to as sop gc and implements the segment order preserving schema as described above. Table 2 shows data about m c gc and sop gc, in ....
H. Blockeel, L. Dehaspe, B. Demoen, G. Janssens, J. Ramon, and H. Vandecasteele. Executing query packs in ILP. In J. Cussens and A. Frisch, editors, Proceedings of the 10th International Conference on Inductive Logic Programming, number 1866 in LNAI, pages 60-77, Springer, July 2000.
....tree sizes are similar. For the trees induced on the full training set, it can be seen that the relational one is smaller. Table 4 also gives the times used by Tilde to induce these trees. Recently, significant improvements have been made in the technology of first order learning algorithms (see e.g. [3]), sometimes increasing their speed by a factor of 40. Hence, the absolute numbers do not say much and will soon be reduced further. Their relative size is interesting from the machine learning point of view as inducing relational trees takes more time. More important is the time needed to execute ....
H. Blockeel, B. Demoen, L. Dehaspe, G. Janssens, J. Ramon, and H. Vandecasteele. Executing query packs in ILP. In J. Cussens and A. Frisch, editors, Proceedings of the 10th International Conference in Inductive Logic Programming, Lecture Notes in Artificial Intelligence, London, UK, July 2000. Springer.
.... neither to compare our collectors with those of other Prolog systems, nor compare collection algorithms (e.g. sliding versus copying). The system chosen to implement our new garbage collection schemes is ilProlog, a Prolog system that is heavily used for machine learning applications; see e.g. [5]. ilProlog is WAM-based; see [16, 1]. Its main deviation from the WAM is that free variables are always put on the heap; see for instance [8]. Another deviation, like in many other Prolog systems, is to have separate stacks for the environments and the choice points. So we expect that our ....
.... and performs a typical run of an inductive learning tool written in ilProlog: without garbage collection, this application would not be able to run (this can also be seen by the number of collected garbage cells). The program uses abstract machine extensions that efficiently support query packs (see [5], also for more detail on this application) and thus does not run without changes in standard Prolog systems. The first new collector we evaluate is referred to as sop gc and implements the segment order preserving schema as described above. Table 2 shows data about m c gc and sop gc, in ....
H. Blockeel, L. Dehaspe, B. Demoen, G. Janssens, J. Ramon, and H. Vandecasteele. Executing query packs in ILP. In Proceedings of the 10th Conference on Inductive Logic Programming, number 1866 in LNAI, pages 60-77, Springer, July 2000.
....could be enriched using a query language for trend shapes, such as the one proposed by [12]. 4 Selected Results: In this section, we report on some of the results obtained in our case study. For these experiments we used the Warmr algorithm as implemented in the data mining tool ACE ilProlog [6], version 1.1.6. We used the default settings for Warmr, except for a minimal support of 0.01 (the default is 0.1). No minimal confidence was specified for the generated rules. The used background knowledge allowed the system to consider the events of child birth, start of a marriage and start of ....
H. Blockeel, L. Dehaspe, B. Demoen, G. Janssens, J. Ramon, and H. Vandecasteele. Executing query packs in ILP. In J. Cussens and A. Frisch (eds.) Proceedings of the 10th International Conference on Inductive Logic Programming (ILP-
....in this paper is in the long tradition of source to source program transformations: changes at the source level that can improve efficiency without altering correctness [21]. The suggestions here by no means exhaust the transformations of this type. Within ILP, a related approach is described in [2], where a set of queries is restructured so that they can be executed more efficiently, without changing the individual queries however. The two approaches are obviously complementary, and it would be interesting to see how they can be combined. Finally, we remark that obtaining efficiency gains by ....
H. Blockeel, B. Demoen, L. Dehaspe, G. Janssens, J. Ramon, and H. Vandecasteele. Executing query packs in ILP. In J. Cussens and A. Frisch, editors, Proceedings of the 10th International Conference in Inductive Logic Programming, volume 1866 of Lecture Notes in Artificial Intelligence, pages 60-77, London, UK, July 2000. Springer.
.... and Data Sets. The following data mining tools and techniques were used in these experiments: • Weka [13], a general purpose data mining tool from which we used: J48, a decision tree induction [5, 11] system based on Quinlan's C4.5 [12], and several feature selection algorithms [13]. • ACE [3], an inductive logic programming [9] tool, in which we used: Tilde, an algorithm that induces first order decision trees [2] and is based on C4.5 [12], and Warmr, an algorithm for first order pattern discovery [6] based on Apriori [1]. The tools were used in combination with various data sets ....
H. Blockeel, B. Demoen, L. Dehaspe, G. Janssens, J. Ramon, and H. Vandecasteele. Executing query packs in ILP. In J. Cussens and A. Frisch, editors, Proceedings of the 10th International Conference in Inductive Logic Programming, volume 1866 of Lecture Notes in Artificial Intelligence, pages 60--77, London, UK, July 2000. Springer.
....more variables it contains, the larger is the number of possible ways to bind the variables and the larger is the set of candidate tests. Since a large number of such candidate tests exist, we need to store them as efficiently as possible. To this aim we use the query packs mechanism introduced in [3]. A query pack is a set of similar queries structured into a tree; common parts of the queries are represented only once in such a structure. For instance, a set of conjunctions {(p(X), q(X)), (p(X), r(X))} is represented as a term p(X), (q(X) ; r(X)). This can yield a significant gain in practice. ....
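The tree structure described in this context can be illustrated with a small sketch. It is not the ACE or ilProlog implementation: it only shows, under the assumption that a query is an ordered tuple of literal strings, how shared prefixes are stored once. The names build_pack and count_literals are hypothetical and introduced here for illustration only.

```python
def build_pack(queries):
    """Merge conjunctions into a nested-dict prefix tree (hypothetical helper).

    Each query is a tuple of literal strings. Literals shared at the start of
    several queries end up in a single node, so they are stored only once.
    Simplification: a query that is a strict prefix of another query is not
    marked separately in this sketch.
    """
    root = {}
    for query in queries:
        node = root
        for literal in query:
            # Reuse the existing child for this literal, or create a new one.
            node = node.setdefault(literal, {})
    return root


def count_literals(pack):
    """Count the literal nodes stored in the tree (hypothetical helper)."""
    return sum(1 + count_literals(child) for child in pack.values())


# The two conjunctions from the example: (p(X), q(X)) and (p(X), r(X)).
queries = [("p(X)", "q(X)"), ("p(X)", "r(X)")]
pack = build_pack(queries)

separate = sum(len(q) for q in queries)  # literals stored without sharing
shared = count_literals(pack)            # literals stored in the pack
print(pack)
print(separate, shared)
```

In this sketch the shared literal p(X) appears in one node with two children, q(X) and r(X), which mirrors the term given in the example. How an actual system evaluates such a tree against the examples is outside the scope of this illustration.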
H. Blockeel, B. Demoen, L. Dehaspe, G. Janssens, J. Ramon, and H. Vandecasteele. Executing query packs in ILP. In J. Cussens and A. Frisch, editors, Proceedings of the 10th International Conference in Inductive Logic Programming, volume 1866 of Lecture Notes in Artificial Intelligence, pages 60-77, London, UK, July 2000. Springer.
....sequences in which the order of the items is important. We illustrate this with an example. Suppose that our dataset contains the frequent sequence a b c d. (Footnote 2: all timings in this paper are on a Pentium III 800 MHz computer with 256 Mb memory, running linux and the Ace version of Warmr running ilProlog [1].) When Warmr has reached its second level, it has constructed these frequent patterns: a b, b c, c d. However, when Warmr extends the first pattern a b, it will extend it to a b a, a b b, a b c, a b d and test each of these patterns on the dataset, to ....
H. Blockeel, B. Demoen, L. Dehaspe, G. Janssens, J. Ramon, and H. Vandecasteele. Executing query packs in ILP. In J. Cussens and A. Frisch, editors, Proceedings of the 10th International Conference in Inductive Logic Programming, volume 1866 of Lecture Notes in Artificial Intelligence, pages 60-77, London, UK, July 2000. Springer.
....analysis, which makes it reasonable to expect a speedup factor close to n for refinement of the top node of the tree; and close to n/f(i - 1) for nodes on level i. 4.2 Experimental Setup: For these experiments we used the version of Tilde as implemented within the ACE data mining tool 4 (Blockeel et al. 2000); this version is a depth first ID3-like algorithm that keeps all data in main memory. With these experiments we aim at a better understanding of the behaviour of the parallel cross-validation process. We measure how much speedup the parallel procedure yields, compared to the serial one; how the ....
....by quickly abandoning models that after seeing some examples have low probability of ever becoming the best model; i.e. they save on the number of cases a model is evaluated on during cross validation, whereas our work focuses on removing redundancy in the model building process itself. Blockeel et al. (2000) discuss a technique similar to the one described here. The main difference is in the kind of redundancies that are removed; here the redundancies arise from running the same test in different folds of a cross validation, whereas in Blockeel et al. (2000) they are caused by similarities in ....
[Article contains additional citation context not shown here]
Blockeel, H., Demoen, B., Dehaspe, L., Janssens, G., Ramon, J., & Vandecasteele, H. (2000). Executing query packs in ILP. Proceedings of the 10th International Conference in Inductive Logic Programming (pp. 60--77). London, UK: Springer.
....refinement of a query is obtained by extending it with new literals. This means that different refinements of the same query are highly similar (share literals). One can imagine that there will be redundant computations when testing these similar queries separately on the training set. It is shown in [2] that this kind of redundancies can be removed by integrating the similar queries in one so-called query pack. A first goal of this text is to discuss efficient cross validation from an ILP point of view. We use decision tree induction to explain the concepts, but the method for efficient ....
....of higher level nodes. This query dependency problem also occurs to some extent for rule induction. We show how the parallel cross validation algorithm from [4] can be adapted to reduce the overhead caused by this problem. A second goal of this text is to investigate how the query packs from [2] can be integrated in the parallel cross validation algorithm. This paper is organised as follows. Section 2 summarises logical decision tree induction, efficient decision tree cross validation and query packs. Section 3 discusses the query dependency problem, shows how query packs can be integrated ....
[Article contains additional citation context not shown here]
H. Blockeel, B. Demoen, L. Dehaspe, G. Janssens, J. Ramon, and H. Vandecasteele. Executing query packs in ILP. In J. Cussens and A. Frisch, editors, Proceedings of the 10th International Conference in Inductive Logic Programming, volume 1866 of Lecture Notes in Artificial Intelligence, pages 60-77, London, UK, July 2000. Springer.
....yield results the other missed. Gartner [7] provides an overview of the approaches that were investigated by several (not all) data mining groups involved in the SolEuNet project. The main tools used were Weka [15], Kepler [17] and ACE [4]. (Footnote 1: Lightning is not necessary, but can be added for effect.) All these tools allow the user to run a number of data mining algorithms on a given data set. The data mining approaches that were considered were decision trees: J48, the Weka implementation of C4.5 [11], and Tilde [2], an ILP system that is a first order upgrade of C4.5; 1R, a simple ....
H. Blockeel, B. Demoen, L. Dehaspe, G. Janssens, J. Ramon, and H. Vandecasteele. Executing query packs in ILP. In J. Cussens and A. Frisch, editors, Proceedings of the 10th International Conference in Inductive Logic Programming, volume 1866 of Lecture Notes in Artificial Intelligence, pages 60--77, London, UK, July 2000. Springer.
....in [13] but also to a transformation described in [1]; they clarify the relationship between both and also include the promise of obtaining even higher efficiency gains by combining them. We furthermore discuss how these query transformation techniques can be combined with query pack execution [3], another technique for gaining efficiency. We start with sketching the context of the transformations and a brief review of the transformations proposed in [13, 1], which are used as a starting point for our work. In Section 4 we discuss the optimisation of individual queries; this extends one of ....
.... individual queries; this extends one of the optimisations in [13]. In Section 5 we discuss the optimisation of queries relative to other queries; this extends an optimisation described in [1]. In Section 6 we illustrate how these transformations can be combined with pack execution as described in [3]. In Section 7 we conclude. 2 The context: In the ILP setting we are interested in, a database of examples is queried repeatedly. Since the general framework is that of Horn clause logic and implementations are typically in Prolog, such a query is best seen as a conjunction of Prolog goals. Such ....
[Article contains additional citation context not shown here]
H. Blockeel, B. Demoen, L. Dehaspe, G. Janssens, J. Ramon, and H. Vandecasteele. Executing query packs in ILP. In J. Cussens and A. Frisch, editors, Proceedings of the 10th International Conference in Inductive Logic Programming, Lecture Notes in Artificial Intelligence, London, UK, July 2000. Springer.
No context found.
Blockeel, H., et al.: Executing query packs in ILP. In: Cussens, J. and Frisch, A. (eds.): Inductive Logic Programming, 10th International Conference, ILP2000, London, U.K. Lecture Notes in Artificial Intelligence, Vol. 1866, pages 60--77. Springer Verlag, Heidelberg, Germany (2000)
No context found.
Blockeel, H., et al.: Executing query packs in ILP. In: Cussens, J. and Frisch, A. (eds.): Inductive Logic Programming, 10th International Conference, ILP2000, London, U.K. Lecture Notes in Artificial Intelligence, Vol. 1866, pages 60--77. Springer Verlag, Heidelberg, Germany (2000)
No context found.
H. Blockeel, L. Dehaspe, B. Demoen, G. Janssens, J. Ramon, and H. Vandecasteele. Executing query packs in ILP. In J. Cussens and A. Frisch, editors, Proceedings of the 10th International Conference on Inductive Logic Programming, volume 1866 of Lecture Notes in Artificial Intelligence, pages 60-77. Springer-Verlag, 2000.