GAIA: Graph Classification Using Evolutionary Computation
Cited by 2 (1 self)
Discriminative subgraphs are widely used to define the feature space for graph classification in large graph databases. Several scalable approaches have been proposed to mine discriminative subgraphs. However, their intensive computation needs prevent them from mining large databases. We propose an efficient method GAIA for mining discriminative subgraphs for graph classification in large databases. Our method employs a novel subgraph encoding approach to support an arbitrary subgraph pattern exploration order and explores the subgraph pattern space in a process resembling biological evolution. In this manner, GAIA is able to find discriminative subgraph patterns much faster than other algorithms. Additionally, we take advantage of parallel computing to further improve the quality of resulting patterns. In the end, we employ sequential coverage to generate association rules as graph classifiers using patterns mined by GAIA. Extensive experiments have been performed to analyze the performance of GAIA and to compare it with two other state-of-the-art approaches. GAIA outperforms the other approaches both in terms of classification accuracy and runtime efficiency.
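The sequential coverage step mentioned in this abstract can be illustrated with a generic greedy sketch. This is not GAIA's implementation: the function name `sequential_cover`, the `min_precision` threshold, and the representation of each mined pattern as the set of graph ids that contain it are all assumptions made for illustration. The idea is to pick the pattern that best predicts the target class on the graphs not yet covered, emit it as a rule, drop the graphs it covers, and repeat.

```python
from typing import Dict, List, Set, Tuple


def sequential_cover(
    pattern_hits: Dict[str, Set[int]],
    labels: Dict[int, str],
    target: str,
    min_precision: float = 0.5,
) -> List[Tuple[str, float]]:
    """Greedy sequential covering over pre-mined patterns (illustrative only).

    pattern_hits maps a hypothetical pattern name to the ids of the graphs
    that contain it; labels maps each graph id to its class label.
    Returns (pattern, precision) rules of the form "pattern => target".
    """
    uncovered = set(labels)  # graph ids not yet explained by any rule
    rules: List[Tuple[str, float]] = []
    while True:
        best = None
        for pattern, hits in pattern_hits.items():
            live = hits & uncovered  # only still-uncovered graphs count
            pos = sum(1 for g in live if labels[g] == target)
            if pos == 0:
                continue  # pattern covers no remaining target graph
            precision = pos / len(live)
            key = (precision, pos)  # prefer precise rules, then broad ones
            if precision >= min_precision and (best is None or key > best[0]):
                best = (key, pattern, live)
        if best is None:
            break  # no remaining pattern yields an acceptable rule
        (precision, _), pattern, live = best
        rules.append((pattern, precision))
        uncovered -= live  # covered graphs no longer influence later rules
    return rules
```

At prediction time, a rule list like this would typically be applied in order, with the first matching pattern deciding the class; that convention is also an assumption here, not something the abstract states.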
Infrastructure Pattern Discovery in Configuration Management Databases via Large Sparse Graph Mining
A configuration management database (CMDB) can be considered to be a large graph representing the IT infrastructure entities and their inter-relationships. Mining such graphs is challenging because they are large, complex, and multi-attributed, and have many repeated labels. These characteristics pose challenges for graph mining algorithms, due to the increased cost of subgraph isomorphism (for support counting), and graph isomorphism (for eliminating duplicate patterns). The notion of pattern frequency or support is also more challenging in a single graph, since it has to be defined in terms of the number of its (potentially, exponentially many) embeddings. We present CMDB-Miner, a novel two-step method for mining infrastructure patterns from CMDB graphs. It first samples the set of maximal frequent patterns, and then clusters them to extract the representative infrastructure patterns. We demonstrate the effectiveness of CMDB-Miner on real-world CMDB graphs.
Sampling Minimal Frequent Boolean (DNF) Patterns
We tackle the challenging problem of mining the simplest Boolean patterns from categorical datasets. Instead of complete enumeration, which is typically infeasible for this class of patterns, we develop effective sampling methods to extract a representative subset of the minimal Boolean patterns (in disjunctive normal form – DNF). We make both theoretical and practical contributions, which allow us to prune the search space based on provable properties. Our approach can provide a near-uniform sample of the minimal DNF patterns. We also show that the mined minimal DNF patterns are very effective when used as features for classification.
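To make the notion of a DNF pattern over a categorical dataset concrete, here is a minimal sketch of how the support of such a pattern could be computed. It assumes each row is a set of items and a DNF pattern is a list of conjunctions (item sets); the names `Clause` and `dnf_support` are hypothetical, and the paper's sampling procedure and its definition of minimality are not reproduced here.

```python
from typing import FrozenSet, List, Set

Clause = FrozenSet[str]  # one conjunction: every item must be present


def dnf_support(dnf: List[Clause], rows: List[Set[str]]) -> float:
    """Fraction of rows that satisfy at least one conjunction of the DNF."""
    if not rows:
        return 0.0
    # A row satisfies the DNF when some clause is a subset of the row.
    matched = sum(1 for row in rows if any(clause <= row for clause in dnf))
    return matched / len(rows)


rows = [{"a", "b"}, {"a", "c"}, {"b", "c"}, {"d"}]
pattern = [frozenset({"a", "b"}), frozenset({"c"})]  # (a AND b) OR c
print(dnf_support(pattern, rows))  # 0.75: three of the four rows match
```

A pattern would then count as frequent when this value meets a user-chosen threshold; the threshold itself is left out because the abstract does not specify one.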

