## GREW—A Scalable Frequent Subgraph Discovery Algorithm (2003)

Venue: Fourth IEEE International Conference on Data Mining (ICDM 2004), 2004

Citations: 21 (0 self)

### Citations

14074 | Computers and Intractability: A Guide to the Theory of NP-Completeness
- Garey, Johnson
- 1979
Citation Context ...he fact that due to the completeness requirements these algorithms must perform numerous subgraph isomorphism operations (or their equivalent) explicitly or implicitly that are known to be NP-complete [8]. On the other hand, existing heuristic algorithms, which are not guaranteed to find the complete set of subgraphs, such as SUBDUE [2] and GBI [34], tend to find an extremely small number of patterns and a...

406 | Frequent subgraph discovery.
- Kuramochi, Karypis
- 2001
Citation Context .... [15], originally developed in 1999–2000, the scalability of these frequent pattern/subgraph mining algorithms has continuously been improved and a number of different algorithms have been developed [20, 17, 1, 32, 11, 13, 33, 19] that employ different mining strategies, are designed for different input graph representations, and find patterns that have different characteristics and satisfy different constraints. As a result, ...

310 | An Apriori-Based Algorithm for Mining Frequent Substructures from Graph Data
- Inokuchi, Washio, et al.
- 2000
Citation Context ...esearch and computing facilities was provided by the Digital Technology Center and the Minnesota Supercomputing Institute. quent patterns in large graph datasets. Starting with AGM by Inokuchi et al. [15], originally developed in 1999–2000, the scalability of these frequent pattern/subgraph mining algorithms has continuously been improved and a number of different algorithms have been developed [20, 1...

199 | Substructure discovery using minimum description length and background knowledge.
- Cook, Holder
- 1994

194 | Efficient mining of frequent subgraphs in the presence of isomorphism
- Huan, Wang, et al.
- 2003
Citation Context .... [15], originally developed in 1999–2000, the scalability of these frequent pattern/subgraph mining algorithms has continuously been improved and a number of different algorithms have been developed [20, 17, 1, 32, 11, 13, 33, 19] that employ different mining strategies, are designed for different input graph representations, and find patterns that have different characteristics and satisfy different constraints. As a result, ...

186 | Graph-based data mining.
- Cook, Holder
- 2000
Citation Context ...scale to large datasets. The most well-known algorithm for finding recurring subgraphs in a single large graph is the SUBDUE system, originally developed in 1994, but has been improved over the years [12, 2, 4, 3]. SUBDUE is an approximate algorithm and finds patterns that can compress the original input graph by substituting those patterns with a single vertex. In evaluating the extent to which a particular p...

165 | Mining molecular fragments: Finding relevant substructures of molecules
- Borgelt, Berthold
- 2002
Citation Context .... [15], originally developed in 1999–2000, the scalability of these frequent pattern/subgraph mining algorithms has continuously been improved and a number of different algorithms have been developed [20, 17, 1, 32, 11, 13, 33, 19] that employ different mining strategies, are designed for different input graph representations, and find patterns that have different characteristics and satisfy different constraints. As a result, ...

133 | Finding Frequent Substructures in Chemical Compounds
- Toivonen
- 1999
Citation Context ...31, 22] and are directly related to the algorithms presented in this paper, whereas the second category contains algorithms that find subgraphs that occur frequently across a database of small graphs [34, 5, 15, 18, 20, 17, 32, 1]. Between these two classes of algorithms, those developed for the latter problem are in general more mature as they have moderate computational requirements and scale to large datasets. The most well...

129 | Finding frequent patterns in a large sparse graph
- Kuramochi, Karypis
- 2005
Citation Context ... find 3 subgraphs in 5,043 seconds, while VSIGRAM (a recently developed complete algorithm) was able to find 3,183 patterns in just 63 seconds from a graph containing 33,443 vertices and 11,244 edges [22]. There are many application domains that lead to datasets that have inherently structural or relational characteristics, are suitable for graph-based representations, and can greatly benefit from gra...

120 | An efficient algorithm for discovering frequent subgraphs.
- Kuramochi, Karypis
- 2004

100 | Complete mining of frequent patterns from graphs: Mining graph data.
- Inokuchi, Washio, et al.
- 2003
Citation Context ...process continues until there are no such candidate subgraphs whose combination will lead to a larger frequent subgraph. Note that unlike existing subgraph growing methods used by complete algorithms [16, 19, 32, 22], which increase the size of each successive subgraph by one edge or vertex at a time, GREW, in each successive iteration, can potentially double the size of the subgraphs that it identifies. The key ...

92 | Molecular feature mining in HIV data
- Kramer, Raedt, et al.
- 2001
Citation Context ...31, 22] and are directly related to the algorithms presented in this paper, whereas the second category contains algorithms that find subgraphs that occur frequently across a database of small graphs [34, 5, 15, 18, 20, 17, 32, 1]. Between these two classes of algorithms, those developed for the latter problem are in general more mature as they have moderate computational requirements and scale to large datasets. The most well...

86 | The graph isomorphism problem
- Fortin
- 1996
Citation Context ...sequence of bits, a string, or a sequence of numbers) that is invariant on the ordering of the vertices and edges in the graph. Such a code is referred to as the canonical label of a graph G = (V, E) [29, 7], and we will denote it by cl(G). By using canonical labels, we can check whether or not two graphs are identical by checking to see whether they have identical canonical labels. Moreover, the canonic...
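
The canonical label mentioned in the excerpt above can be illustrated with a small brute-force sketch. This is not the labeling scheme used in the paper, which the excerpt does not spell out; it simply takes the lexicographically smallest code over all vertex orderings, which is only practical for very small graphs. The function name `canonical_label` and the input layout (a list of vertex labels plus a dict of undirected labeled edges) are assumptions made for illustration.

```python
from itertools import permutations


def canonical_label(vertex_labels, edges):
    """Brute-force canonical label of a small undirected labeled graph.

    vertex_labels: list of str, one label per vertex (hypothetical layout).
    edges: dict mapping frozenset({u, v}) -> str edge label.
    Returns the smallest (vertex code, edge code) pair over all orderings,
    so isomorphic graphs get identical labels.
    """
    n = len(vertex_labels)
    best = None
    for perm in permutations(range(n)):
        # perm[i] is the original vertex placed at position i.
        vertex_code = tuple(vertex_labels[v] for v in perm)
        # Upper triangle of the adjacency matrix; "" marks a missing edge.
        edge_code = tuple(
            edges.get(frozenset((perm[i], perm[j])), "")
            for i in range(n)
            for j in range(i + 1, n)
        )
        code = (vertex_code, edge_code)
        if best is None or code < best:
            best = code
    return best


# The same path a-b-a written with two different vertex numberings.
g1 = (["a", "b", "a"], {frozenset((0, 1)): "x", frozenset((1, 2)): "x"})
g2 = (["b", "a", "a"], {frozenset((0, 1)): "x", frozenset((0, 2)): "x"})
# A different graph: the path a-a-b.
g3 = (["a", "a", "b"], {frozenset((0, 1)): "x", frozenset((1, 2)): "x"})

same = canonical_label(*g1) == canonical_label(*g2)
different = canonical_label(*g1) != canonical_label(*g3)
```

Comparing two labels for equality then answers the identity check the excerpt describes, at a cost that grows factorially with the number of vertices in this naive version.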

77 | Substructure Discovery in the SUBDUE Systems
- Holder, Cook, et al.
- 1994
Citation Context ...overing scheme do not overlap. Because of this reason, the set of patterns found by the sequential covering scheme tend to have less diversity and be smaller. 5.4 Comparison with SUBDUE We ran SUBDUE [12] version 5.1.0 (with the default set of parameters) on our four benchmark datasets and measured the runtime, the number of patterns discovered, their size, and their frequency. Although we gave SUBDUE...

75 | Greed is good: approximating independent sets in sparse and bounded degree graphs.
- Halldorsson, Radhakrishnan
- 1994
Citation Context ...her edge-types. This step (loop starting at line 8) is achieved by constructing the overlap graph Go for the set of embeddings of each edge-type e and using a greedy maximal independent set algorithm [10] to quickly identify a large number of vertex-disjoint embeddings. If the size of this maximal set is greater than the minimum frequency threshold, this edge-type survives the current iteration and th...
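
The step described in the excerpt above (build an overlap graph over the embeddings, then greedily pick a maximal independent set) might look roughly like the following sketch. It assumes each embedding is given as a set of vertex ids and uses a minimum-degree-first greedy rule; the excerpt does not say which greedy variant the paper uses, and the names `build_overlap_graph`, `greedy_independent_set`, and `edge_type_survives` are hypothetical.

```python
from itertools import combinations


def build_overlap_graph(embeddings):
    """Connect two embeddings whenever they share at least one vertex."""
    adj = {i: set() for i in range(len(embeddings))}
    for i, j in combinations(range(len(embeddings)), 2):
        if embeddings[i] & embeddings[j]:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def greedy_independent_set(adj):
    """Repeatedly take a minimum-degree node and discard its neighbours.

    The result is a maximal (not necessarily maximum) independent set,
    i.e. a set of pairwise vertex-disjoint embeddings.
    """
    remaining = {v: set(nbrs) for v, nbrs in adj.items()}
    chosen = []
    while remaining:
        # Ties are broken by index so the result is deterministic.
        v = min(remaining, key=lambda u: (len(remaining[u]), u))
        chosen.append(v)
        removed = remaining[v] | {v}
        for u in removed:
            remaining.pop(u, None)
        for nbrs in remaining.values():
            nbrs -= removed
    return chosen


def edge_type_survives(embeddings, min_freq):
    """Follows the excerpt's wording: strictly greater than the threshold."""
    disjoint = greedy_independent_set(build_overlap_graph(embeddings))
    return len(disjoint) > min_freq


# Four embeddings of one edge-type; the first three form an overlapping chain.
embeddings = [frozenset({1, 2}), frozenset({2, 3}),
              frozenset({3, 4}), frozenset({5, 6})]
picked = greedy_independent_set(build_overlap_graph(embeddings))
```

Because the greedy set is only maximal, the count it yields is a lower bound on the true number of vertex-disjoint embeddings, which is consistent with the heuristic nature of the algorithm described in the excerpt.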

30 | Accurate classification of protein structural families using coherent subgraph analysis
- Huan, Wang, et al.
- 2004
Citation Context ... domains to discover frequently occurring subgraphs in a reasonable amount of time. Moreover, since these patterns can be used as input to other data mining tasks (e.g., clustering and classification [6, 14]), the frequent pattern discovery algorithms play an important role in further expanding the use of data mining techniques to graph-based datasets. A key characteristic of all of these algorithms is t...

22 | Large Scale Mining of Molecular Fragments with Wildcards. Intelligent Data Analysis 8:495–504
- Hofer, Borgelt, et al.
- 2004

20 | Frequent sub-structure based approaches for classifying chemical compounds.
- Deshpande, Kuramochi, et al.
- 2003
Citation Context ... domains to discover frequently occurring subgraphs in a reasonable amount of time. Moreover, since these patterns can be used as input to other data mining tasks (e.g., clustering and classification [6, 14]), the frequent pattern discovery algorithms play an important role in further expanding the use of data mining techniques to graph-based datasets. A key characteristic of all of these algorithms is t...

20 | A Fast Algorithm for Mining Frequent Connected Subgraphs
- Inokuchi, Washio, et al.
- 2002

15 | SEuS: Structure extraction using summaries
- Ghazizadeh, Chawathe
- 2002
Citation Context ...vious research on finding frequent subgraphs in graph datasets falls under two categories. The first category contains algorithms that find subgraphs that occur multiple times in a single large graph [12, 9, 31, 22] and are directly related to the algorithms presented in this paper, whereas the second category contains algorithms that find subgraphs that occur frequently across a database of small graphs [34, 5,...

13 | Frequent subgraph discovery.
- Karypis, Kuramochi
- 2001

10 | Knowledge discovery from structural data
- Cook, Holder, et al.
- 1995
Citation Context ...scale to large datasets. The most well-known algorithm for finding recurring subgraphs in a single large graph is the SUBDUE system, originally developed in 1994, but has been improved over the years [12, 2, 4, 3]. SUBDUE is an approximate algorithm and finds patterns that can compress the original input graph by substituting those patterns with a single vertex. In evaluating the extent to which a particular p...

8 | Mining Patterns from Structured Data by Beam-wise Graph-Based Induction
- Matsuda, Motoda, et al.
- 2002
Citation Context ...me it prevents it from finding subgraphs that are indeed frequent. Motoda et al. developed an algorithm called GBI [34] which is similar to SUBDUE and later proposed the improved version called B-GBI [23] adopting the beam search. B-GBI is the closest algorithm to our study, in the sense that both perform the same basic operation to identify frequent patterns based on edge contraction. However, while ...