Results 1 - 10
of
96
Anomaly Detection: A Survey
, 2007
"... Anomaly detection is an important problem that has been researched within diverse research areas and application domains. Many anomaly detection techniques have been specifically developed for certain application domains, while others are more generic. This survey tries to provide a structured and c ..."
Abstract
-
Cited by 540 (5 self)
- Add to MetaCart
(Show Context)
Anomaly detection is an important problem that has been researched within diverse research areas and application domains. Many anomaly detection techniques have been specifically developed for certain application domains, while others are more generic. This survey tries to provide a structured and comprehensive overview of the research on anomaly detection. We have grouped existing techniques into different categories based on the underlying approach adopted by each technique. For each category we have identified key assumptions, which are used by the techniques to differentiate between normal and anomalous behavior. When applying a given technique to a particular domain, these assumptions can be used as guidelines to assess the effectiveness of the technique in that domain. For each category, we provide a basic anomaly detection technique, and then show how the different existing techniques in that category are variants of the basic technique. This template provides an easier and succinct understanding of the techniques belonging to each category. Further, for each category, we identify the advantages and disadvantages of the techniques in that category. We also provide a discussion on the computational complexity of the techniques since it is an important issue in real application domains. We hope that this survey will provide a better understanding of the di®erent directions in which research has been done on this topic, and how techniques developed in one area can be applied in domains for which they were not intended to begin with.
Toward integrating feature selection algorithms for classification and clustering
- IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
, 2005
"... This paper introduces concepts and algorithms of feature selection, surveys existing feature selection algorithms for classification and clustering, groups and compares different algorithms with a categorizing framework based on search strategies, evaluation criteria, and data mining tasks, reveals ..."
Abstract
-
Cited by 267 (21 self)
- Add to MetaCart
(Show Context)
This paper introduces concepts and algorithms of feature selection, surveys existing feature selection algorithms for classification and clustering, groups and compares different algorithms with a categorizing framework based on search strategies, evaluation criteria, and data mining tasks, reveals unattempted combinations, and provides guidelines in selecting feature selection algorithms. With the categorizing framework, we continue our efforts toward building an integrated system for intelligent feature selection. A unifying platform is proposed as an intermediate step. An illustrative example is presented to show how existing feature selection algorithms can be integrated into a meta algorithm that can take advantage of individual algorithms. An added advantage of doing so is to help a user employ a suitable algorithm without knowing details of each algorithm. Some real-world applications are included to demonstrate the use of feature selection in data mining. We conclude this work by identifying trends and challenges of feature selection research and development.
Efficient feature selection via analysis of relevance and redundancy
- Journal of Machine Learning Research
, 2004
"... Feature selection is applied to reduce the number of features in many applications where data has hundreds or thousands of features. Existing feature selection methods mainly focus on finding relevant features. In this paper, we show that feature relevance alone is insufficient for efficient feature ..."
Abstract
-
Cited by 209 (3 self)
- Add to MetaCart
(Show Context)
Feature selection is applied to reduce the number of features in many applications where data has hundreds or thousands of features. Existing feature selection methods mainly focus on finding relevant features. In this paper, we show that feature relevance alone is insufficient for efficient feature selection of high-dimensional data. We define feature redundancy and propose to perform explicit redundancy analysis in feature selection. A new framework is introduced that decouples relevance analysis and redundancy analysis. We develop a correlation-based method for relevance and redundancy analysis, and conduct an empirical study of its efficiency and effectiveness comparing with representative methods.
Data Mining
- TO APPEAR IN THE HANDBOOK OF TECHNOLOGY MANAGEMENT, H. BIDGOLI (ED.)
, 2010
"... The amount of data being generated and stored is growing exponentially, due in large part to the continuing advances in computer technology. This presents tremendous opportunities for those who can ..."
Abstract
-
Cited by 48 (1 self)
- Add to MetaCart
The amount of data being generated and stored is growing exponentially, due in large part to the continuing advances in computer technology. This presents tremendous opportunities for those who can
Hybrid Intrusion Detection with Weighted Signature Generation over Anomalous Internet Episodes
- IEEE Trans. on Dependable and Secure Computing, Vol.4, No.1, January-March 2007
"... Abstract—This paper reports the design principles and evaluation results of a new experimental hybrid intrusion detection system (HIDS). This hybrid system combines the advantages of low false-positive rate of signature-based intrusion detection system (IDS) and the ability of anomaly detection syst ..."
Abstract
-
Cited by 26 (3 self)
- Add to MetaCart
(Show Context)
Abstract—This paper reports the design principles and evaluation results of a new experimental hybrid intrusion detection system (HIDS). This hybrid system combines the advantages of low false-positive rate of signature-based intrusion detection system (IDS) and the ability of anomaly detection system (ADS) to detect novel unknown attacks. By mining anomalous traffic episodes from Internet connections, we build an ADS that detects anomalies beyond the capabilities of signature-based SNORT or Bro systems. A weighted signature generation scheme is developed to integrate ADS with SNORT by extracting signatures from anomalies detected. HIDS extracts signatures from the output of ADS and adds them into the SNORT signature database for fast and accurate intrusion detection. By testing our HIDS scheme over real-life Internet trace data mixed with 10 days of Massachusetts Institute of Technology/ Lincoln Laboratory (MIT/LL) attack data set, our experimental results show a 60 percent detection rate of the HIDS, compared with 30 percent and 22 percent in using the SNORT and Bro systems, respectively. This sharp increase in detection rate is obtained with less than 3 percent false alarms. The signatures generated by ADS upgrade the SNORT performance by 33 percent. The HIDS approach proves the vitality of detecting intrusions and anomalies, simultaneously, by automated data mining and signature generation over Internet connection episodes.
Detection and classification of intrusions and faults using sequences of system calls
- ACM SIGMOD Record
, 2001
"... This paper investigates the use of sequences of system calls for classifying intrusions and faults induced by privileged processes in Unix. Classification is an essential capability for responding to an anomaly (attack or fault), since it gives the ability to associate appropriate responses to each ..."
Abstract
-
Cited by 22 (0 self)
- Add to MetaCart
(Show Context)
This paper investigates the use of sequences of system calls for classifying intrusions and faults induced by privileged processes in Unix. Classification is an essential capability for responding to an anomaly (attack or fault), since it gives the ability to associate appropriate responses to each anomaly type. Previous work using the well known dataset from the University of New Mexico (UNM) has demonstrated the usefulness of monitoring sequences of system calls for detecting anomalies induced by processes corresponding to several Unix Programs, such as sendmail, lpr, ftp, etc. Specifically, previous work has shown that the Anomaly Count of a running process, i.e., the number of sequences spawned by the process which are not found in the corresponding dictionary of normal activity for the Program, is a valuable feature for anomaly detection. To achieve Classification, in this paper we introduce the concept of Anomaly Dictionaries, which are the sets of anomalous sequences for each type of anomaly. It is verified that Anomaly Dictionaries for the UNM’s sendmail Program have very little overlap, and can be effectively
Data Mining Methods for Network Intrusion Detection
, 2004
"... Network intrusion detection systems have become a standard component in security infrastructures. Unfortunately, current systems are poor at detecting novel attacks without an unacceptable level of false alarms. We propose that the solution to this problem is the application of an ensemble of data m ..."
Abstract
-
Cited by 22 (1 self)
- Add to MetaCart
(Show Context)
Network intrusion detection systems have become a standard component in security infrastructures. Unfortunately, current systems are poor at detecting novel attacks without an unacceptable level of false alarms. We propose that the solution to this problem is the application of an ensemble of data mining techniques which can be applied to network connection data in an offline environment, augmenting existing real-time sensors. In this paper, we expand on our motivation, particularly with regard to running in an offline environment, and our interest in multisensor and multimethod correlation. We then review existing systems, from commercial systems, to research based intrusion detection systems. Next we survey the state of the art in the area. Standard datasets and feature extraction turned out to be more important than we had initially anticipated, so each can be found under its own heading. Next, we review the actual data mining methods that have been proposed or implemented. We conclude by summarizing the open problems in this area, along with some questions of a broader scope. We hope that by providing the motivation and summarizing the work in this area that we can stimulate further research.
I know my network: collaboration and expertise in intrusion detection,” in ACM conference on Computer supported cooperative work
, 2004
"... The work of intrusion detection (ID) in accomplishing network security is complex, requiring highly sought-after expertise. While limited automation exists, the role of human ID analysts remains crucial. This paper presents the results of an exploratory field study examining the role of expertise an ..."
Abstract
-
Cited by 18 (2 self)
- Add to MetaCart
(Show Context)
The work of intrusion detection (ID) in accomplishing network security is complex, requiring highly sought-after expertise. While limited automation exists, the role of human ID analysts remains crucial. This paper presents the results of an exploratory field study examining the role of expertise and collaboration in ID work. Through an analysis of the common and situated expertise required in ID work, our results counter basic assumptions about its individualistic character, revealing significant distributed collaboration. Current ID support tools provide no support for this collaborative problem solving. The results of this research highlight ID as an engaging CSCW work domain, one rich with organizational insights, design challenges, and practical import.
The Work of Intrusion Detection: Rethinking The Role of Security Analysts
, 2004
"... Intrusion detection (ID) systems have become increasingly accepted as an essential layer in the information security infrastructure. However, there has been little research into understanding the human component of ID work. Currently, security analysts face an increasing workload as their environmen ..."
Abstract
-
Cited by 16 (3 self)
- Add to MetaCart
Intrusion detection (ID) systems have become increasingly accepted as an essential layer in the information security infrastructure. However, there has been little research into understanding the human component of ID work. Currently, security analysts face an increasing workload as their environments expand and attacks become more frequent. We conducted contextual interviews with security analysts to gain an understanding of the people and work of ID. Our findings reveal that organizational changes must be combined with improved technical tools for effective, long-term solutions to the difficulties of scaling ID work. We propose a three-phase task model in which tasks could be decoupled according to requisite expertise. In particular, monitoring tasks can be separated and staffed by less experienced ID analysts with corresponding tool support. Thus, security analysts will be better able to cope with increasing security threats in their expanding networks. Additionally, organizations will be afforded more flexibility in hiring and training new analysts.