Dehaspe, L. and Toivonen, H. 1999. Frequent query discovery: a unifying ILP approach to association rule mining. Data Mining and Knowledge Discovery. To appear.
....propositional representation also includes logP and molecular weight. Many of the functional groups have been selected from the PTE (predictive toxicology evaluation) domain theory (Srinivasan et al. 1997) where the task is to predict carcinogenicity of chemicals. In this domain, the approach of (Dehaspe et al. 1999) to discover (count) most frequent substructures that occur in the dataset and use these in conjunction with propositional learners has been among the most successful. Our small substructure representation has been derived along these lines. 3.2 Systems A variety of classification and regression ....
Dehaspe, L. and Toivonen, H. 1999. Frequent query discovery: a unifying ILP approach to association rule mining. Data Mining and Knowledge Discovery. To appear.
....representation also includes logP and molecular weight. Many of the functional groups have been selected from the PTE (predictive toxicology evaluation) domain theory [20] where the task is to predict carcinogenicity of chemicals. In this domain, the approach of Dehaspe and Toivonen [7] to discover (count) most frequent substructures that occur in the dataset and use these in conjunction with propositional learners has been among the most successful. Our small substructure representation has been derived along these lines. 3.2 Systems A variety of classification and regression ....
Dehaspe, L., and Toivonen, H. 1999. Frequent query discovery: a unifying ILP approach to association rule mining. Data Mining and Knowledge Discovery.
....in different approach like [7] or [22]. [7] propose an algorithm that generalize the A priori trick to the context of frequent atomsets. This typical inductive logic programming tool enable to mine association rules from multiple relations but can also be used for mining frequent Datalog queries [8]. [22] consider query flocks that are parametrized Datalog queries for which a selection criteria on the result of the queries must hold. When the filter condition is related to the frequency of answers and queries are conjunctive queries augmented with arithmetic and union, they can propose an ....
L. Dehaspe and H. Toivonen. Frequent query discovery: A unifying ILP approach to association rule mining. Technical Report CW-258, Department of Computer Science, Katholieke Universiteit Leuven, Belgium, March 1998. Available at http://www.cs.kuleuven.ac.be/publicaties/rapporten/CW1998.html.
....of the relevant patterns. Generic mining algorithms should also be defined. Interesting ideas come from recent generalizations of apriori. [4] generalizes it in the context of frequent atomsets. It provides an inductive logic programming tool that mines the so called frequent Datalog queries [5]. [18] consider query flocks that are parametrized Datalog (or SQL) queries for which filter condition is related to the number of parameters values. They propose an optimizing scheme that provides subqueries for eliminating parameter values at a cheaper price. 5 Conclusion The concept of ....
L. Dehaspe and H. Toivonen. Frequent query discovery: a unifying ILP approach to association rule mining. Technical Report CW-258, Department of Computer Science, Katholieke Universiteit Leuven, Belgium, March 1998.
.... patterns (Agrawal & Srikant 1995), a family of problems discussed in more general in (Mannila & Toivonen 1997). Within ILP, a closely related problem is the discovery of queries in first order logic that succeed with respect to a sufficient number of examples (Dehaspe & De Raedt 1997). In (Dehaspe & Toivonen 1998) we discuss the relationship of ILP to frequent pattern discovery, and relate data mining problems to ILP. The logical setting for substructure discovery is based on the learning from interpretations paradigm introduced in (De Raedt & Dzeroski 1994). Frequent substructure discovery Discovery ....
....a database $r$, a class $L$ of sentences (patterns) and a selection predicate $q$ which is used for evaluating whether a sentence $Q \in L$ defines a potentially interesting pattern in $r$. The task is to find the theory of $r$ with respect to $L$ and $q$, i.e. the set $Th(L, r, q) = \{Q \in L \mid q(r, Q) \text{ is true}\}$. In (Dehaspe & Toivonen 1998), this framework has been used to formulate the task of frequent query discovery in Datalog. We now define frequent substructure discovery as a special case of frequent query discovery. Definition 1 (Frequent substructure discovery) Assume • $r$ is a Datalog database of chemical compounds, ....
[Article contains additional citation context not shown here]
Dehaspe, L., and Toivonen, H. 1998. Frequent query discovery: a unifying ILP approach to association rule mining. Technical Report CW-258, K.U.Leuven. http://www.cs.kuleuven.ac.be/publicaties/rapporten/CW1998.html.