## Outlier Detection in Graph Streams

Citations: 20 (6 self)

### Citations

494 | LOF: Identifying density-based local outliers
- Breunig, Kriegel, et al.
- 2000
Citation Context ... results. Section V presents the conclusions and summary. A. Related Work and Contributions The problem of outlier detection has been studied extensively in the context of multi-dimensional data [7], [9], [17], [18]. These techniques are mostly either distance-based [17], [18] or density-based [7], [9] methods. However, these methods cannot be easily generalized to non-spatial networks. The problem o...

349 | A Framework for Clustering Evolving Data Streams
- Aggarwal, Han, et al.
Citation Context ...m. The mean and standard deviation can be directly computed from the above values, because both of these statistical quantities can be expressed as a closed function of moments of order at most 2 [6]. III. STRUCTURAL RESERVOIR SAMPLING: METHODOLOGY AND APPLICATIONS In this section, we will study the methodology of structural reservoir sampling, and its applicability to the problem of model estima...

331 | Random sampling with a reservoir
- Vitter
- 1985
Citation Context ...es of the graph with specific structural criteria satisfying a general condition referred to as set monotonicity. While reservoir sampling is typically used in the context of the stream scenario [5], [20] with particular temporal statistical properties, there is no known research on the topic of maintaining reservoir samples which leverage the underlying structural properties of graph streams. This ki...

311 | Efficient algorithms for mining outliers from large data sets
- Ramaswamy, Rastogi, et al.
- 2000
Citation Context ...ection V presents the conclusions and summary. A. Related Work and Contributions The problem of outlier detection has been studied extensively in the context of multi-dimensional data [7], [9], [17], [18]. These techniques are mostly either distance-based [17], [18] or density-based [7], [9] methods. However, these methods cannot be easily generalized to non-spatial networks. The problem of graph outl...

224 | Outlier detection for high dimensional data
- Aggarwal, Yu
Citation Context ...ental results. Section V presents the conclusions and summary. A. Related Work and Contributions The problem of outlier detection has been studied extensively in the context of multi-dimensional data [7], [9], [17], [18]. These techniques are mostly either distance-based [17], [18] or density-based [7], [9] methods. However, these methods cannot be easily generalized to non-spatial networks. The prob...

179 | Distance-based outliers: Algorithms and applications
- Knorr, Ng, et al.
- 2000
Citation Context ...lts. Section V presents the conclusions and summary. A. Related Work and Contributions The problem of outlier detection has been studied extensively in the context of multi-dimensional data [7], [9], [17], [18]. These techniques are mostly either distance-based [17], [18] or density-based [7], [9] methods. However, these methods cannot be easily generalized to non-spatial networks. The problem of grap...

163 | Data structures for on-line updating of minimum spanning trees, with applications
- Frederickson
- 1985
Citation Context ... the r different partitioning structures induced by the different reservoirs. Efficient algorithms for dynamically maintaining spanning forests of incrementally updated sets of edges are discussed in [10], [12]. IV. EXPERIMENTAL RESULTS In this section, we tested our outlier detection approach for effectiveness and efficiency on a number of real and synthetic data sets. We refer to our approach as the...

100 | Random sampling in cut, flow, and network design problems
- Karger
- 1994
Citation Context ...tiple choices of edge samples, in order to create different kinds of node partitions. This is used in order to improve the robustness of abnormality estimation. The sampling methods discussed in [3], [14] are not applicable to the case of the stream scenario and are also not designed for maintaining specific structural properties in the underlying partitions. In our particular case, the structural con...

93 | Graph clustering based on structural/attribute similarities
- Zhou, Cheng, et al.
- 2009
Citation Context ...orks, IP networks and internet applications lead to the creation of massive streams of graph data. This has led to an increasing interest in the problem of mining dynamic graphs [1], [2], [4], [16], [21]. In a graph stream, we assume that individual graph objects are received continuously over time. Some examples of such application scenarios are as follows: • Many information network objects can be ...

84 | Ranking-based clustering of heterogeneous information networks with star network schema
- Sun, Yu, et al.
- 2009
Citation Context ...ume that individual graph objects are received continuously over time. Some examples of such application scenarios are as follows: • Many information network objects can be expressed as graph objects [19]. For example, a bibliographic object from the DBLP network may be expressed as a graph with nodes corresponding to authors, conference, or topic area. The graph for the object may be represented in a...

73 | OddBall: Spotting anomalies in weighted graphs
- Akoglu, McGlohon, et al.
Citation Context ...work of clustering has also been studied in the context of graphs and graph streams [1], [2], [4], [16], [19], [21]. A number of different methods for network outlier detection have been discussed in [8], [11], [13]. However, these methods are only applicable to static networks, and not generally applicable to the case of dynamic graph streams. The dynamic nature of graph streams presents a special c...

70 | Managing and Mining Graph Data
- Aggarwal, Wang
- 2010
Citation Context ...s such as social networks, IP networks and internet applications lead to the creation of massive streams of graph data. This has led to an increasing interest in the problem of mining dynamic graphs [1], [2], [4], [16], [21]. In a graph stream, we assume that individual graph objects are received continuously over time. Some examples of such application scenarios are as follows: • Many information n...

69 | Randomized fully dynamic graph algorithms with polylogarithmic time per operation
- Henzinger, King
- 1999
Citation Context ... different partitioning structures induced by the different reservoirs. Efficient algorithms for dynamically maintaining spanning forests of incrementally updated sets of edges are discussed in [10], [12]. IV. EXPERIMENTAL RESULTS In this section, we tested our outlier detection approach for effectiveness and efficiency on a number of real and synthetic data sets. We refer to our approach as the G...

59 | An introduction to social network data analytics
- Aggarwal
- 2011
Citation Context ...h as social networks, IP networks and internet applications lead to the creation of massive streams of graph data. This has led to an increasing interest in the problem of mining dynamic graphs [1], [2], [4], [16], [21]. In a graph stream, we assume that individual graph objects are received continuously over time. Some examples of such application scenarios are as follows: • Many information networ...

59 | An efficient heuristic for partitioning graphs
- Kernighan, Lin
- 1970
Citation Context ...ta. In fact, it has been shown in [14] how edge sampling approaches have a much higher likelihood of containing cuts with lower value. A natural solution would be to use a graph-partitioning technique [15], but this cannot be implemented efficiently for the case of graph streams. We will provide an intuitive qualitative argument as to why such an edge sampling technique should work well, in addition to ...

35 | A particle-and-density based evolutionary clustering method for dynamic networks
- Kim, Han
- 2009
Citation Context ...l networks, IP networks and internet applications lead to the creation of massive streams of graph data. This has led to an increasing interest in the problem of mining dynamic graphs [1], [2], [4], [16], [21]. In a graph stream, we assume that individual graph objects are received continuously over time. Some examples of such application scenarios are as follows: • Many information network objects c...

33 | On community outliers and their efficient detection in information networks
- Gao, Liang, et al.
- 2010
Citation Context ...of clustering has also been studied in the context of graphs and graph streams [1], [2], [4], [16], [19], [21]. A number of different methods for network outlier detection have been discussed in [8], [11], [13]. However, these methods are only applicable to static networks, and not generally applicable to the case of dynamic graph streams. The dynamic nature of graph streams presents a special challen...

30 | On biased reservoir sampling in the presence of stream evolution
- Aggarwal
- 2006
Citation Context ...samples of the graph with specific structural criteria satisfying a general condition referred to as set monotonicity. While reservoir sampling is typically used in the context of the stream scenario [5], [20] with particular temporal statistical properties, there is no known research on the topic of maintaining reservoir samples which leverage the underlying structural properties of graph streams. T...

18 | On clustering graph streams
- Aggarwal, Zhao, et al.
Citation Context ...social networks, IP networks and internet applications lead to the creation of massive streams of graph data. This has led to an increasing interest in the problem of mining dynamic graphs [1], [2], [4], [16], [21]. In a graph stream, we assume that individual graph objects are received continuously over time. Some examples of such application scenarios are as follows: • Many information network obj...

10 | Gconnect: A connectivity index for massive disk-resident graphs
- Aggarwal, Yu
Citation Context ... expose the outliers well. Since clustering is a very challenging problem for the stream scenario, we will create node partitions from samples of edges; it is well known that the use of edge sampling [3] to create such partitions is biased towards creating partitions which are dense. In order to further increase the likelihood of outlier exposure, we will use multiple choices of edge samples, in ord...