Results 1 -
2 of
2
Apolo: Making Sense of Large Network Data by Combining Rich User Interaction and Machine Learning
"... Extracting useful knowledge from large network datasets has become a fundamental challenge in many domains, from scientific literature to social networks and the web. We introduce Apolo, a system that uses a mixed-initiative approach— combining visualization, rich user interaction and machine learni ..."
Abstract
-
Cited by 6 (3 self)
- Add to MetaCart
Extracting useful knowledge from large network datasets has become a fundamental challenge in many domains, from scientific literature to social networks and the web. We introduce Apolo, a system that uses a mixed-initiative approach— combining visualization, rich user interaction and machine learning—to guide the user to incrementally and interactively explore large network data and make sense of it. Apolo engages the user in bottom-up sensemaking to gradually build up an understanding over time by starting small, rather than starting big and drilling down. Apolo also helps users find relevant information by specifying exemplars, and then using a machine learning method called Belief Propagation to infer which other nodes may be of interest. We evaluated Apolo with twelve participants in a between-subjects study, with the task being to find relevant new papers to update an existing survey paper. Using expert judges, participants using Apolo found significantly more relevant papers. Subjective feedback of Apolo was also very positive.
Polonium: Tera-Scale Graph Mining and Inference for Malware Detection
- SIAM INTERNATIONAL CONFERENCE ON DATA MINING (SDM)
, 2011
"... We present Polonium, a novel Symantec technology that detects malware through large-scale graph inference. Based on the scalable Belief Propagation algorithm, Polonium infers every file’s reputation, flagging files with low reputation as malware. We evaluated Polonium with a billion-node graph const ..."
Abstract
-
Cited by 2 (2 self)
- Add to MetaCart
We present Polonium, a novel Symantec technology that detects malware through large-scale graph inference. Based on the scalable Belief Propagation algorithm, Polonium infers every file’s reputation, flagging files with low reputation as malware. We evaluated Polonium with a billion-node graph constructed from the largest file submissions dataset ever published (60 terabytes). Polonium attained a high true positive rate of 87 % in detecting malware; in the field, Polonium lifted the detection rate of existing methods by 10 absolute percentage points. We detail Polonium’s design and implementation features instrumental to its success. Polonium has served 120 million people and helped answer more than one trillion queries for file reputation.

