Y. Wang and D.B. Skillicorn. Parallel inductive logic for data mining. In Workshop on Distributed and Parallel Knowledge Discovery, KDD2000, Boston, to appear. ACM Press.
....terms. Rather, the need is to beat back the large constants brought in by large real-world applications. Yu Wang and David Skillicorn recently developed a parallel implementation of PROGOL under the Bulk Synchronous Parallel (BSP) model and claim superlinear speedup from this implementation [38]. Alan Wild worked with me at the University of Louisville to reimplement on a Beowulf cluster a top-down ILP search for pharmacophore discovery, and the result was a linear speedup [13]. The remainder of this section describes how large-scale parallelism can be achieved very simply in a top-down ....
Y. Wang and D. Skillicorn. Parallel inductive logic for data mining. http://www.cs.queensu.ca/home/skill/papers.html#datamining, 2000.
....overheads in exchanging OC1 trees and voting after each round, which suggests that speedups may be poor. On the other hand, each processor is acquiring information learned by all of the other processors in a compact way, and this has often led to superlinear speedups in data mining applications [12, 15]. The effect of threshold. The threshold determines how many decision trees must agree for an example to be classified as easy. Figures 12 and 13 show that choosing a threshold value of 2 is best for the letters dataset, while Figures 14 and 15 show that choosing a threshold value of 3 is best ....
Y. Wang and D.B. Skillicorn. Parallel inductive logic for data mining. In Workshop on Distributed and Parallel Knowledge Discovery, KDD2000, Boston, to appear. ACM Press.
....this requires the possibility of being able to combine information learned by other processors into the model under construction, so this approach will not work for all data mining algorithms. However, it is effective for several, including neural networks [30, 29] and inductive logic programming [42]. The information that is exchanged is quite small compared to the size of typical datasets, so the communication overhead of this approach is low. Interestingly, superlinear speedup occurs because there are two improvements over executing the plain sequential algorithm. The ....
Y. Wang and D.B. Skillicorn. Parallel inductive logic for data mining. In Workshop on Distributed and Parallel Knowledge Discovery, KDD2000, Boston, to appear. ACM Press.
.... result of H_j; find the successful H with globally good score; total exchange the valid H_i's; add all valid H_i's into B; retract redundant examples that are covered by the H_i's; end if; end repeat; end forall. [Figure 2: Parallel ILP Algorithm] .... shown to be effective, and we use that approach here [9, 12]. Figure 2 shows how the sequential algorithm is parallelized. Each processor gets a partition of the original dataset and executes essentially the sequential algorithm on its partition to find a concept. Such a concept is locally correct, but may not necessarily be globally correct. All ....
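The loop in the excerpt above (learn locally, check globally, add valid hypotheses to B, retract covered examples) can be illustrated with a minimal Python sketch. It is not the authors' PROGOL/BSP implementation: the processors are simulated one after another in a single process, a "hypothesis" is assumed to be a single attribute name rather than a first-order clause, and all function names (`learn_local`, `consistent_on`, `parallel_ilp`) are hypothetical.

```python
from typing import FrozenSet, List, Optional, Set, Tuple

# Assumption: an example is (set of attributes, is_positive).
Example = Tuple[FrozenSet[str], bool]


def covers(h: str, ex: Example) -> bool:
    """A toy hypothesis (one attribute) covers an example that has it."""
    return h in ex[0]


def learn_local(part: List[Example], rejected: Set[str]) -> Optional[str]:
    """Stand-in for the sequential learner run on one partition: pick the
    attribute covering the most local positives and no local negatives."""
    positives = [e for e in part if e[1]]
    negatives = [e for e in part if not e[1]]
    best, best_score = None, 0
    for h in sorted({a for e in positives for a in e[0]} - rejected):
        if any(covers(h, e) for e in negatives):
            continue  # not even locally correct
        score = sum(covers(h, e) for e in positives)
        if score > best_score:
            best, best_score = h, score
    return best


def consistent_on(h: str, part: List[Example]) -> bool:
    """Check a proposed hypothesis against another partition's negatives."""
    return not any(covers(h, e) for e in part if not e[1])


def parallel_ilp(partitions: List[List[Example]]) -> List[str]:
    parts = [list(p) for p in partitions]
    theory: List[str] = []      # plays the role of B in the excerpt
    rejected: Set[str] = set()  # locally correct but globally wrong
    while True:
        # Each (simulated) processor learns on its own partition.
        proposals = {h for h in (learn_local(p, rejected) for p in parts)
                     if h is not None}
        # "Total exchange": every proposal is checked on every partition.
        valid = sorted(h for h in proposals
                       if all(consistent_on(h, p) for p in parts))
        newly_rejected = proposals - set(valid)
        if not valid and not newly_rejected:
            break  # nothing left to propose
        rejected |= newly_rejected
        theory.extend(valid)
        # Retract positive examples already covered by a valid hypothesis.
        parts = [[e for e in p
                  if not (e[1] and any(covers(h, e) for h in valid))]
                 for p in parts]
    return theory


p0 = [(frozenset({"a", "x"}), True), (frozenset({"a"}), True),
      (frozenset({"x"}), False)]
p1 = [(frozenset({"b", "x"}), True), (frozenset({"x"}), True),
      (frozenset({"c"}), False)]
theory = parallel_ilp([p0, p1])
```

In this toy run, the second partition first proposes `x`, which is correct on its own data but covers a negative example held by the first partition, so the global check rejects it. This is the "locally correct, but not necessarily globally correct" case the excerpt mentions.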
Yu Wang. Parallel inductive logic in data mining. Master's thesis, Department of Computing and Information Science, Queen's University, 2000.