SARVAM: Search And RetrieVAl of Malware
Cited by 1 (1 self)
We present SARVAM, a system for content-based Search And RetrieVAl of Malware. In contrast with traditional static or dynamic analysis, SARVAM uses malware binary content to find similar malware. Given a malware query, a fingerprint is first computed based on transformed image features [19], and similar malware items from the database are then returned using image matching metrics. The current SARVAM database holds approximately 4.3 million samples of malware and benign executables. The system is demonstrated using a desktop computer with Ubuntu OS, and takes approximately 3 seconds per query to find the top matching malware. SARVAM has been operational for the past 15 months, during which we have received approximately 212,000 queries from users. In this paper, we describe the design and implementation of SARVAM and also discuss the nature and statistics of queries received.
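The query flow described in this abstract (compute a fingerprint from the binary's content, then rank database items by similarity) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the block-mean fingerprint is a crude stand-in for the transformed image features of [19], Euclidean distance stands in for the unspecified image matching metric, and the names `fingerprint` and `top_matches` are hypothetical.

```python
import math


def fingerprint(data, width=16, grid=4):
    """Hypothetical stand-in fingerprint: view the bytes as a grayscale image
    `width` pixels wide and average it over a grid x grid layout of cells."""
    rows = [data[i:i + width].ljust(width, b"\x00")
            for i in range(0, len(data), width)]
    if not rows:
        rows = [bytes(width)]  # empty input becomes a single black row
    height = len(rows)
    feats = []
    for gy in range(grid):
        y0 = gy * height // grid
        y1 = max(y0 + 1, (gy + 1) * height // grid)
        for gx in range(grid):
            x0 = gx * width // grid
            x1 = max(x0 + 1, (gx + 1) * width // grid)
            cell = [rows[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            feats.append(sum(cell) / (255.0 * len(cell)))  # scale to [0, 1]
    return feats


def top_matches(query, database, k=3):
    """Return the k (distance, name) pairs closest to the query.

    `database` maps a sample name to its raw bytes. A real system would
    precompute and index the fingerprints instead of rescanning every sample.
    """
    q = fingerprint(query)
    scored = [(math.dist(q, fingerprint(blob)), name)
              for name, blob in database.items()]
    return sorted(scored)[:k]
```

A linear scan like this would not scale to millions of samples; it only shows the shape of the fingerprint-then-match pipeline.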
Research and Application of Data Mining Feature Selection Based on Relief Algorithm
To choose the best features in data mining problems, this paper proposes using the Relief feature selection algorithm. First, the Ionosphere data from the UCI (University of California, Irvine) database is used in a simulation experiment; second, the proposed method is applied to feature selection for voice signals. The case study starts from the 24-dimensional MFCC (Mel Frequency Cepstrum Coefficient) parameters and identifies the most important MFCC parameters in the voice signal; the 24-dimensional parameters can then be combined and optimized while the recognition rate changes little. The experimental results show that the method extracts the best features. The research therefore provides a new direction for feature extraction in speech recognition.
Index Terms: relief algorithm, feature selection, data mining, speech recognition
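For readers unfamiliar with Relief, the following is a minimal sketch of the basic two-class form of the algorithm, not the paper's implementation. It assumes numeric features, range-normalized absolute differences, and a Manhattan distance for finding neighbors; the function name `relief` and its parameters are hypothetical. For each sampled instance, a feature's weight rises when it differs on the nearest instance of the other class (the miss) and falls when it differs on the nearest instance of the same class (the hit).

```python
import random


def relief(X, y, n_iter=None, seed=0):
    """Return one relevance weight per feature (basic Relief sketch)."""
    n, d = len(X), len(X[0])
    rng = random.Random(seed)

    # Feature ranges, used to scale per-feature differences into [0, 1].
    spans = []
    for j in range(d):
        col = [row[j] for row in X]
        spans.append((max(col) - min(col)) or 1.0)

    def diff(a, b, j):
        return abs(a[j] - b[j]) / spans[j]

    def dist(a, b):
        return sum(diff(a, b, j) for j in range(d))

    m = n_iter or n  # number of sampled instances
    w = [0.0] * d
    for _ in range(m):
        i = rng.randrange(n)
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        if not hits or not misses:
            continue  # no same-class or other-class neighbor available
        hit = min(hits, key=lambda k: dist(X[i], X[k]))
        miss = min(misses, key=lambda k: dist(X[i], X[k]))
        for j in range(d):
            w[j] += (diff(X[i], X[miss], j) - diff(X[i], X[hit], j)) / m
    return w
```

Features can then be ranked by weight and the top-ranked ones kept, for example when reducing a 24-dimensional MFCC vector to its most relevant coefficients.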