Results 1 - 10 of 540
Outside the Closed World: On Using Machine Learning For Network Intrusion Detection
In Proceedings of the IEEE Symposium on Security and Privacy, 2010
"... Abstract—In network intrusion detection research, one popular strategy for finding attacks is monitoring a network’s activity for anomalies: deviations from profiles of normality previously learned from benign traffic, typically identified using tools borrowed from the machine learning community. Ho ..."
Abstract
-
Cited by 82 (1 self)
- Add to MetaCart
(Show Context)
In network intrusion detection research, one popular strategy for finding attacks is monitoring a network’s activity for anomalies: deviations from profiles of normality previously learned from benign traffic, typically identified using tools borrowed from the machine learning community. However, despite extensive academic research, one finds a striking gap in terms of actual deployments of such systems: compared with other intrusion detection approaches, machine learning is rarely employed in operational “real world” settings. We examine the differences between the network intrusion detection problem and other areas where machine learning regularly finds much more success. Our main claim is that the task of finding attacks is fundamentally different from these other applications, making it significantly harder for the intrusion detection community to employ machine learning effectively. We support this claim by identifying challenges particular to network intrusion detection, and provide a set of guidelines meant to strengthen future research on anomaly detection. Keywords: anomaly detection; machine learning; intrusion detection; network security.
Robust Shrinkage Estimation of High-dimensional Covariance Matrices
2010
"... We address high dimensional covariance estimation for elliptical distributed samples. Specifically we consider shrinkage methods that are suitable for high dimensional problems with a small number of samples (large p small n). We start from a classical robust covariance estimator [Tyler(1987)], whi ..."
Abstract
-
Cited by 37 (7 self)
- Add to MetaCart
We address high-dimensional covariance estimation for elliptically distributed samples. Specifically, we consider shrinkage methods that are suitable for high-dimensional problems with a small number of samples (large p, small n). We start from a classical robust covariance estimator [Tyler(1987)], which is distribution-free within the family of elliptical distributions but inapplicable when n < p. Using a shrinkage coefficient, we regularize Tyler’s fixed-point iteration. We derive the minimum mean-squared-error shrinkage coefficient in closed form. The closed-form expression is a function of the unknown true covariance and cannot be implemented in practice. Instead, we propose a plug-in estimate to approximate it. Simulations demonstrate that the proposed method achieves low estimation error and is robust to heavy-tailed samples.
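To make the construction concrete, here is a minimal sketch of a shrinkage-regularized Tyler fixed-point iteration, assuming an identity shrinkage target and a user-supplied coefficient rho; the paper's closed-form and plug-in choices of the coefficient are not reproduced here.

```python
# Hypothetical sketch: Tyler's fixed-point iteration regularized toward the
# identity with a shrinkage coefficient rho in (0, 1].
import numpy as np

def regularized_tyler(X, rho, n_iter=50, tol=1e-6):
    """X: (n, p) centered samples; returns a (p, p) scatter estimate."""
    n, p = X.shape
    sigma = np.eye(p)
    for _ in range(n_iter):
        inv_sigma = np.linalg.inv(sigma)
        # Per-sample weights x_i^T Sigma^{-1} x_i from the current estimate.
        w = np.einsum('ij,jk,ik->i', X, inv_sigma, X)
        scatter = (p / n) * (X / w[:, None]).T @ X      # Tyler's data term
        new_sigma = (1 - rho) * scatter + rho * np.eye(p)
        new_sigma *= p / np.trace(new_sigma)            # trace normalization
        if np.linalg.norm(new_sigma - sigma, 'fro') < tol:
            return new_sigma
        sigma = new_sigma
    return sigma
```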
On Community Outliers and their Efficient Detection in Information Networks
In KDD'10, 2010
"... Linked or networked data are ubiquitous in many applications. Examples include web data or hypertext documents connected via hyperlinks, social networks or user profiles connected via friend links, co-authorship and citation information, blog data, movie reviews and so on. In these datasets (called ..."
Abstract
-
Cited by 35 (9 self)
- Add to MetaCart
(Show Context)
Linked or networked data are ubiquitous in many applications. Examples include web data or hypertext documents connected via hyperlinks, social networks or user profiles connected via friend links, co-authorship and citation information, blog data, movie reviews and so on. In these datasets (called “information networks”), closely related objects that share the same properties or interests form a community. For example, a community in the blogosphere could be users mostly interested in cell phone reviews and news. Outlier detection in information networks can reveal important anomalous and interesting behaviors that are not obvious if community information is ignored. An example could be a low-income person being friends with many rich people even though his income is not anomalously low when considered over the entire population. This paper first introduces the concept of community outliers (interesting points or rising stars, in a more positive sense), and then shows that well-known baseline approaches that ignore links or community information cannot find these community outliers. We propose an efficient solution by modeling networked data as a mixture model composed of multiple normal communities and a set of randomly generated outliers. The probabilistic model characterizes both data and links simultaneously by defining their joint distribution based on hidden Markov random fields (HMRF). Maximizing the data likelihood and the posterior of the model gives the solution to the outlier inference problem. We apply the model to both synthetic and real data sets.
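A much-simplified, hypothetical sketch of the idea: treat normal communities as Gaussian components over a scalar node attribute, add one extra "outlier" label, and alternate between refitting community parameters and an ICM-style relabeling that rewards agreement with neighbors. The parameters lam and outlier_logp are illustrative assumptions, not the paper's model.

```python
# Simplified community-outlier detection in the spirit of an HMRF mixture:
# label k (one past the communities) marks outliers.
import numpy as np

def community_outliers(x, adj, k=2, lam=1.0, outlier_logp=-6.0, n_iter=20):
    """x: (n,) np.array of node attributes; adj: list of neighbor lists."""
    n = len(x)
    labels = np.random.randint(0, k, size=n)
    for _ in range(n_iter):
        # Refit each community's mean/std from its current members.
        means = np.array([x[labels == c].mean() if np.any(labels == c) else 0.0
                          for c in range(k)])
        stds = np.array([x[labels == c].std() + 1e-6 if np.any(labels == c) else 1.0
                         for c in range(k)])
        # ICM step: data log-likelihood plus a reward for agreeing with neighbors.
        for i in range(n):
            link = np.array([sum(labels[j] == c for j in adj[i]) for c in range(k)])
            scores = -0.5 * ((x[i] - means) / stds) ** 2 - np.log(stds) + lam * link
            best = int(np.argmax(scores))
            # Outlier if even the best community explains x[i] poorly.
            labels[i] = k if scores[best] < outlier_logp else best
    return labels
```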
Nonparametric Divergence Estimation with Applications to Machine Learning on Distributions
"... Low-dimensional embedding, manifold learning, clustering, classification, and anomaly detection are among the most important problems in machine learning. The existing methods usually consider the case when each instance has a fixed, finite-dimensional feature representation. Here we consider a diff ..."
Abstract
-
Cited by 24 (12 self)
- Add to MetaCart
(Show Context)
Low-dimensional embedding, manifold learning, clustering, classification, and anomaly detection are among the most important problems in machine learning. Existing methods usually consider the case when each instance has a fixed, finite-dimensional feature representation. Here we consider a different setting. We assume that each instance corresponds to a continuous probability distribution. These distributions are unknown, but we are given some i.i.d. samples from each distribution. Our goal is to estimate the distances between these distributions and use these distances to perform low-dimensional embedding, clustering/classification, or anomaly detection for the distributions. We present estimation algorithms, describe how to apply them to machine learning tasks on distributions, and show empirical results on synthetic data, real-world images, and astronomical data sets.
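As an illustration of the setting (not necessarily the estimator used in the paper), a standard k-nearest-neighbor estimate of the Kullback-Leibler divergence between two sample sets can supply the pairwise distances fed to embedding, clustering, or anomaly detection.

```python
# k-NN KL divergence estimate between samples X ~ p and Y ~ q.
import numpy as np
from scipy.spatial import cKDTree

def knn_kl_divergence(X, Y, k=3):
    """X: (n, d) samples from p; Y: (m, d) samples from q; returns D_KL(p||q)."""
    n, d = X.shape
    m = Y.shape[0]
    # Distance to the k-th nearest neighbor within X (the nearest hit is the
    # point itself, so query k+1 neighbors)...
    rho = cKDTree(X).query(X, k=k + 1)[0][:, -1]
    # ...and to the k-th nearest neighbor in Y.
    nu = cKDTree(Y).query(X, k=k)[0][:, -1]
    return (d / n) * np.sum(np.log(nu / rho)) + np.log(m / (n - 1))
```

Pairwise divergences between groups of samples can then be symmetrized and passed to, for example, multidimensional scaling or a nearest-neighbor anomaly detector.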
Integrating community matching and outlier detection for mining evolutionary community outliers
In KDD, 2012
"... Temporal datasets, in which data evolves continuously, exist in a wide variety of applications, and identifying anomalous or outlying objects from temporal datasets is an important and challenging task. Different from traditional outlier detection, which detects objects that have quite different beh ..."
Abstract
-
Cited by 21 (11 self)
- Add to MetaCart
(Show Context)
Temporal datasets, in which data evolves continuously, exist in a wide variety of applications, and identifying anomalous or outlying objects from temporal datasets is an important and challenging task. Different from traditional outlier detection, which detects objects that have quite different behavior compared with the other objects, temporal outlier detection tries to identify objects that have different evolutionary behavior compared with other objects. Usually objects form multiple communities, and most of the objects belonging to the same community follow similar patterns of evolution. However, there are some objects which evolve in a very different way relative to other community members, and we define such objects as evolutionary community outliers. This definition represents a novel type of outliers considering both temporal dimension and community patterns. We investigate the problem of identifying evolutionary community outliers given the discovered communities from two snapshots of an evolving dataset. To tackle the challenges of community evolution and outlier detection, we propose an integrated optimization framework which conducts outlier-aware community matching across snapshots and identification of evolutionary outliers in a tightly coupled way. A coordinate descent algorithm is proposed to improve community matching and outlier detection performance iteratively. Experimental results on both synthetic and real datasets show that the proposed approach is highly effective in discovering interesting evolutionary community outliers.
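A loose sketch of the coupled matching-and-outlier idea, under simplifying assumptions that are not the paper's algorithm: alternate between fitting a correspondence matrix between the two snapshots' community memberships using per-node weights, and recomputing outlier scores from each node's matching residual so that badly matched nodes stop dominating the fit. The names U1, U2, and S are illustrative.

```python
# Alternating outlier-aware matching of community memberships across snapshots.
import numpy as np

def evolutionary_outliers(U1, U2, n_iter=10, eps=1e-8):
    """U1: (n, k1), U2: (n, k2) soft memberships; returns per-node outlier scores."""
    n = U1.shape[0]
    w = np.ones(n)                                   # start by trusting every node
    for _ in range(n_iter):
        # (a) Weighted least-squares correspondence between community spaces.
        W = np.diag(w)
        S = np.linalg.solve(U1.T @ W @ U1 + eps * np.eye(U1.shape[1]),
                            U1.T @ W @ U2)
        # (b) Residual = how poorly a node's evolution follows its community.
        resid = np.sum((U1 @ S - U2) ** 2, axis=1)
        w = 1.0 / (1.0 + resid)                      # down-weight likely outliers
    return resid                                     # larger = more anomalous
```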
DeepWalk: Online learning of social representations. arXiv preprint arXiv:1403.6652
2014
"... We present DeepWalk, a novel approach for learning la-tent representations of vertices in a network. These latent representations encode social relations in a continuous vector space, which is easily exploited by statistical models. Deep-Walk generalizes recent advancements in language mod-eling and ..."
Abstract
-
Cited by 19 (2 self)
- Add to MetaCart
(Show Context)
We present DeepWalk, a novel approach for learning latent representations of vertices in a network. These latent representations encode social relations in a continuous vector space, which is easily exploited by statistical models. DeepWalk generalizes recent advancements in language modeling and unsupervised feature learning (or deep learning) from sequences of words to graphs. DeepWalk uses local information obtained from truncated random walks to learn latent representations by treating walks as the equivalent of sentences. We demonstrate DeepWalk’s latent representations on several multi-label network classification tasks for social networks such as BlogCatalog, Flickr, and YouTube. Our results show that DeepWalk outperforms challenging baselines which are allowed a global view of the network, especially in the presence of missing information. DeepWalk’s representations can provide F1 scores up to 10% higher than competing methods when labeled data is sparse. In some experiments, DeepWalk’s representations are able to outperform all baseline methods while using 60% less training data. DeepWalk is also scalable. It is an online learning algorithm which builds useful incremental results, and is trivially parallelizable. These qualities make it suitable for a broad class of real-world applications such as network classification and anomaly detection.
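The core recipe is easy to sketch (this uses networkx and gensim as stand-ins, not the authors' implementation): generate truncated random walks, treat them as sentences, and train a skip-gram word2vec model with hierarchical softmax.

```python
# Minimal DeepWalk-style embedding: random walks fed to skip-gram word2vec.
import random
import networkx as nx
from gensim.models import Word2Vec

def random_walks(G, num_walks=10, walk_len=40):
    walks = []
    for _ in range(num_walks):
        nodes = list(G.nodes())
        random.shuffle(nodes)
        for start in nodes:
            walk = [start]
            while len(walk) < walk_len:
                nbrs = list(G.neighbors(walk[-1]))
                if not nbrs:
                    break
                walk.append(random.choice(nbrs))
            walks.append([str(v) for v in walk])   # word2vec expects string tokens
    return walks

G = nx.karate_club_graph()
model = Word2Vec(random_walks(G), vector_size=64, window=5,
                 sg=1, hs=1, min_count=0, workers=1)
embedding = model.wv[str(0)]                        # latent vector for node 0
```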
MetricForensics: A Multi-Level Approach for Mining Volatile Graphs
"... Advances in data collection and storage capacity have made it increasingly possible to collect highly volatile graph data for analysis. Existing graph analysis techniques are not appropriate for such data, especially in cases where streaming or near-real-time results are required. An example that ha ..."
Abstract
-
Cited by 16 (6 self)
- Add to MetaCart
(Show Context)
Advances in data collection and storage capacity have made it increasingly possible to collect highly volatile graph data for analysis. Existing graph analysis techniques are not appropriate for such data, especially in cases where streaming or near-real-time results are required. An example that has drawn significant research interest is the cyber-security domain, where internet communication traces are collected and real-time discovery of events, behaviors, patterns, and anomalies is desired. We propose MetricForensics, a scalable framework for analysis of volatile graphs. MetricForensics combines a multi-level “drill down” approach, a collection of user-selected graph metrics, and a collection of analysis techniques. At each successive level, more sophisticated metrics are computed and the graph is viewed at finer temporal resolutions. In this way, MetricForensics scales to highly volatile graphs by only allocating resources for computationally expensive analysis when an interesting event is first discovered at a coarser resolution. We test MetricForensics on three real-world graphs: an enterprise IP trace, a trace of legitimate and malicious network traffic from a research institution, and the MIT Reality Mining proximity sensor data. Our largest graph has ∼3M vertices and ∼32M edges, spanning 4.5 days. The results demonstrate the scalability and capability of MetricForensics in analyzing volatile graphs, and highlight four novel phenomena in such graphs: elbows, broken correlations, prolonged spikes, and lightweight stars.
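An illustrative sketch of the multi-level drill-down pattern, with metric choices and thresholds that are assumptions rather than the paper's: score coarse time windows with a cheap metric, then re-examine only the anomalous windows at a finer resolution with a more expensive metric.

```python
# Two-level drill-down over a timestamped edge stream.
import numpy as np
import networkx as nx

def cheap_metric(G):
    return G.number_of_edges()                       # fast, coarse signal

def expensive_metric(G):
    return nx.average_clustering(G) if G.number_of_nodes() else 0.0

def drill_down(edge_stream, coarse=600, fine=60, z_thresh=3.0):
    """edge_stream: list of (timestamp, u, v); returns (window_start, metric) pairs."""
    t_end = max(t for t, _, _ in edge_stream)
    # Level 1: coarse windows scored by the cheap metric.
    coarse_vals = []
    for start in range(0, int(t_end) + 1, coarse):
        G = nx.Graph((u, v) for t, u, v in edge_stream if start <= t < start + coarse)
        coarse_vals.append((start, cheap_metric(G)))
    vals = np.array([v for _, v in coarse_vals], dtype=float)
    mu, sd = vals.mean(), vals.std() + 1e-9
    # Level 2: drill into anomalous coarse windows at finer resolution.
    flagged = []
    for start, v in coarse_vals:
        if abs(v - mu) / sd > z_thresh:
            for fstart in range(start, start + coarse, fine):
                G = nx.Graph((a, b) for t, a, b in edge_stream
                             if fstart <= t < fstart + fine)
                flagged.append((fstart, expensive_metric(G)))
    return flagged
```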
Gaussian process regression flow for analysis of motion trajectories
In Proceedings of IEEE ICCV, IEEE Computer Society, 2011
"... Recognition of motions and activities of objects in videos requires effective representations for analysis and matching of motion trajectories. In this paper, we introduce a new representation specifically aimed at matching motion trajectories. We model a trajectory as a continuous dense flow field ..."
Abstract
-
Cited by 15 (1 self)
- Add to MetaCart
(Show Context)
Recognition of motions and activities of objects in videos requires effective representations for analysis and matching of motion trajectories. In this paper, we introduce a new representation specifically aimed at matching motion trajectories. We model a trajectory as a continuous dense flow field from a sparse set of vector sequences using Gaussian Process Regression. Furthermore, we introduce a random sampling strategy for learning stable classes of motions from limited data. Our representation allows for incrementally predicting possible paths and detecting anomalous events from online trajectories. This representation also supports matching of complex motions with acceleration changes and pauses or stops within a trajectory. We use the proposed approach for classifying and predicting motion trajectories in traffic monitoring domains and test on several data sets. We show that our approach works well on various types of complete and incomplete trajectories from a variety of video data sets with different frame rates.
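A hypothetical sketch of the central idea using scikit-learn: fit a Gaussian process that maps trajectory positions to local velocities, giving a continuous flow field against which a new trajectory can be scored. The kernel and the anomaly score here are illustrative choices, not the paper's exact model.

```python
# Gaussian-process flow field: positions -> velocity vectors.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

def fit_flow(trajectory):
    """trajectory: (T, 2) array of positions; returns a GP over the flow field."""
    X = trajectory[:-1]                        # positions
    V = np.diff(trajectory, axis=0)            # finite-difference velocities
    gp = GaussianProcessRegressor(kernel=RBF(1.0) + WhiteKernel(1e-3),
                                  normalize_y=True)
    return gp.fit(X, V)

def anomaly_score(gp, trajectory):
    """Mean squared deviation of observed motion from the predicted flow."""
    pred = gp.predict(trajectory[:-1])
    obs = np.diff(trajectory, axis=0)
    return float(np.mean(np.sum((obs - pred) ** 2, axis=1)))
```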
Histogram-Based Traffic Anomaly Detection
"... Identifying network anomalies is essential in enterprise and provider networks for diagnosing events, like attacks or failures, that severely impact performance, security, and Service Level Agreements (SLAs). Feature-based anomaly detection models (ab)normal network traffic behavior by analyzing dif ..."
Abstract
-
Cited by 15 (2 self)
- Add to MetaCart
(Show Context)
Identifying network anomalies is essential in enterprise and provider networks for diagnosing events, like attacks or failures, that severely impact performance, security, and Service Level Agreements (SLAs). Feature-based anomaly detection models (ab)normal network traffic behavior by analyzing different packet header features, like IP addresses and port numbers. In this work, we describe a new approach to feature-based anomaly detection that constructs histograms of different traffic features, models histogram patterns, and identifies deviations from the created models. We assess the strengths and weaknesses of many design options, like the utility of different features, the construction of feature histograms, the modeling and clustering algorithms, and the detection of deviations. Compared to previous feature-based anomaly detection approaches, our work differs by constructing detailed histogram models, rather than using coarse entropy-based distribution approximations. We evaluate histogram-based anomaly detection and compare it to previous approaches using collected network traffic traces. Our results demonstrate the effectiveness of our technique in identifying a wide range of anomalies. The assessed technical details are generic and, therefore, we expect that the derived insights will be useful for similar future research efforts.
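A minimal sketch of such a pipeline under assumed design choices (the feature, binning, clustering algorithm, and threshold are illustrative, not the paper's exact design): build per-interval histograms of one traffic feature, cluster baseline histograms to model normal patterns, and flag an interval whose histogram is far from every learned centroid.

```python
# Histogram-based anomaly detection over per-interval destination-port counts.
import numpy as np
from sklearn.cluster import KMeans

def port_histogram(ports, n_bins=64):
    h, _ = np.histogram(ports, bins=n_bins, range=(0, 65536))
    return h / max(h.sum(), 1)                        # normalize to a distribution

def train_model(baseline_intervals, n_clusters=5):
    """baseline_intervals: list of port lists, one per benign time interval."""
    H = np.vstack([port_histogram(p) for p in baseline_intervals])
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(H)
    # Threshold: a high quantile of training distances to assigned centroids.
    d = np.linalg.norm(H - km.cluster_centers_[km.labels_], axis=1)
    return km, np.quantile(d, 0.99)

def is_anomalous(km, threshold, ports):
    h = port_histogram(ports)
    d = np.min(np.linalg.norm(km.cluster_centers_ - h, axis=1))
    return d > threshold
```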