Abstract:
Mining frequent trees is useful in domains such as bioinformatics, web mining, and the mining of semi-structured data. In this paper we introduce SLEUTH, an efficient algorithm for mining frequent, unordered, embedded subtrees in a database of labeled trees. The key contributions of our work are as follows: we give the first algorithm that enumerates all embedded, unordered trees; we propose a new equivalence class extension scheme to generate all candidate trees; and we extend the notion of scope-list joins to compute the frequency of unordered trees. We conduct a performance evaluation on several synthetic and real datasets to show that SLEUTH is an efficient algorithm, with performance comparable to TreeMiner, which mines only ordered trees.
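To make the terms "embedded" and "unordered" concrete, the following is a minimal brute-force sketch of the containment test that frequency counting rests on. It is not SLEUTH itself: it uses neither equivalence classes nor scope-list joins, and it takes exponential time in the worst case. The tree encoding as `(label, children)` tuples and the names `index_tree`, `occurs_at`, `contains`, and `support` are assumptions made for illustration. The sketch also assumes that a pattern edge may map to any ancestor-descendant pair in the data tree, that sibling order is ignored, and that sibling pattern nodes must map to nodes with disjoint subtrees.

```python
def index_tree(tree):
    """Flatten a (label, children) tree into preorder labels and scope ends.

    Node i covers preorder positions i..ends[i], so two nodes are
    ancestor-related (or identical) exactly when their intervals overlap.
    """
    labels, ends = [], []

    def visit(node):
        label, children = node
        pos = len(labels)
        labels.append(label)
        ends.append(pos)
        for child in children:
            visit(child)
        ends[pos] = len(labels) - 1  # last preorder position in this subtree

    visit(tree)
    return labels, ends


def occurs_at(pattern, labels, ends, root):
    """Check whether the pattern root can map to data node `root`."""
    label, children = pattern
    if labels[root] != label:
        return False

    def place(i, used):
        # Try to map pattern child i to a proper descendant of `root`
        # whose scope is disjoint from the scopes already chosen.
        if i == len(children):
            return True
        for d in range(root + 1, ends[root] + 1):
            if any(s <= ends[d] and d <= e for s, e in used):
                continue  # overlapping scope: ancestor-related or same node
            if occurs_at(children[i], labels, ends, d) and place(
                i + 1, used + [(d, ends[d])]
            ):
                return True
        return False

    return place(0, [])


def contains(pattern, tree):
    """True if pattern is an embedded, unordered subtree of tree."""
    labels, ends = index_tree(tree)
    return any(occurs_at(pattern, labels, ends, r) for r in range(len(labels)))


def support(pattern, database):
    """Number of database trees that contain the pattern at least once."""
    return sum(contains(pattern, t) for t in database)


# Data tree: A has children B and E; B has children C and D.
T = ("A", [("B", [("C", []), ("D", [])]), ("E", [])])

# E is listed before C (unordered) and C is a grandchild of A (embedded).
print(contains(("A", [("E", []), ("C", [])]), T))  # True
# C lies inside the only B, so the two cannot be matched as siblings.
print(contains(("A", [("C", []), ("B", [])]), T))  # False
```

An ordered miner such as TreeMiner would additionally require the matched siblings to appear in left-to-right order, so the first pattern above would not be counted under that setting.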