Results 1 - 4 of 4
Spotting suspicious link behavior with fbox: An adversarial perspective, 2014. arXiv preprint 1410.3915
Cited by 3 (3 self)
How can we detect suspicious users in large online networks? Online popularity of a user or product (via follows, page-likes, etc.) can be monetized on the premise of higher ad click-through rates or increased sales. Web services and social networks which incentivize popularity thus suffer from a major problem of fake connections from link fraudsters looking to make a quick buck. Typical methods of catching this suspicious behavior use spectral techniques to spot large groups of often blatantly fraudulent (but sometimes honest) users. However, small-scale, stealthy attacks may go unnoticed due to the nature of low-rank eigenanalysis used in practice. In this work, we take an adversarial approach to find and prove claims about the weaknesses of modern, state-of-the-art spectral methods and propose FBOX, an algorithm designed to catch small-scale, stealth attacks that slip below the radar. Our algorithm has the following desirable properties: (a) it has theoretical underpinnings, (b) it is shown to be highly effective on real data and (c) it is scalable (linear on the input size). We evaluate FBOX on a large, public 41.7 million node, 1.5 billion edge who-follows-whom social graph from Twitter in 2010 and with high precision identify many suspicious accounts which have persisted without suspension even to this day.
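The abstract's claim that small attacks can hide from low-rank spectral analysis can be illustrated with a toy example. The sketch below is not the FBOX algorithm; it only assumes a hypothetical graph with one large dense block and one small "stealth" block, computes the dominant singular direction by power iteration, and compares each node's row norm in the rank-1 reconstruction with its true row norm. All function names and block sizes are illustrative assumptions.

```python
import math

def block_graph(sizes):
    """Adjacency matrix made of disjoint, fully connected blocks."""
    n = sum(sizes)
    adj = [[0.0] * n for _ in range(n)]
    start = 0
    for size in sizes:
        for i in range(start, start + size):
            for j in range(start, start + size):
                adj[i][j] = 1.0
        start += size
    return adj

def top_right_singular_vector(adj, iters=100):
    """Power iteration on A^T A; returns the dominant right singular vector."""
    n = len(adj[0])
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        av = [sum(a * x for a, x in zip(row, v)) for row in adj]
        w = [sum(adj[i][j] * av[i] for i in range(len(adj))) for j in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def rank1_row_norms(adj, v):
    """Norm of each row of the rank-1 reconstruction (A v) v^T, with ||v|| = 1."""
    return [abs(sum(a * x for a, x in zip(row, v))) for row in adj]

# Hypothetical toy graph: a 20-node dense block and a 3-node "stealth" block.
adj = block_graph([20, 3])
v = top_right_singular_vector(adj)
recon = rank1_row_norms(adj, v)
true = [math.sqrt(sum(a * a for a in row)) for row in adj]
print(round(recon[0], 3), round(true[0], 3))    # large block: reconstructed vs. true row norm
print(round(recon[-1], 3), round(true[-1], 3))  # small block: reconstructed vs. true row norm
```

In this toy setting the rank-1 view reproduces the large block almost perfectly while the small block nearly vanishes from it, which is the kind of blind spot the abstract describes.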
ACCAMS: Additive Co-Clustering to Approximate Matrices Succinctly
Cited by 1 (1 self)
Matrix completion and approximation are popular tools to capture a user’s preferences for recommendation and to approximate missing data. Instead of using low-rank factorization we take a drastically different approach, based on the simple insight that an additive model of co-clusterings allows one to approximate matrices efficiently. This allows us to build a concise model that, per bit of model learned, significantly beats all factorization approaches in matrix completion. Even more surprisingly, we find that summing over small co-clusterings is more effective in modeling matrices than classic co-clustering, which uses just one large partitioning of the matrix. Following Occam’s razor principle, the fact that our model is more concise and yet just as accurate as more complex models suggests that it better captures the latent preferences and decision making processes present in the real world. We provide an iterative minimization algorithm, a collapsed Gibbs sampler, theoretical guarantees for matrix approximation, and excellent empirical evidence for the efficacy of our approach. We achieve state-of-the-art results for matrix completion on Netflix at a fraction of the model complexity.
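The additive idea in this abstract can be sketched on a tiny example. The code below is not the ACCAMS algorithm: the cluster assignments are fixed by hand (the paper learns them), and every name and number is a hypothetical chosen for illustration. It fits one small co-clustering by block means, fits a second one to the residual, and compares the error of one versus two summed co-clusterings.

```python
def block_means(matrix, row_of, col_of, k):
    """Co-clustering "stencil": the mean of each (row cluster, column cluster) block."""
    sums = [[0.0] * k for _ in range(k)]
    counts = [[0] * k for _ in range(k)]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            sums[row_of[i]][col_of[j]] += value
            counts[row_of[i]][col_of[j]] += 1
    return [[s / c if c else 0.0 for s, c in zip(srow, crow)]
            for srow, crow in zip(sums, counts)]

def fit_additive(matrix, clusterings, k=2):
    """Fit each co-clustering to the residual left by the previous ones."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    approx = [[0.0] * n_cols for _ in range(n_rows)]
    for row_of, col_of in clusterings:
        residual = [[matrix[i][j] - approx[i][j] for j in range(n_cols)]
                    for i in range(n_rows)]
        means = block_means(residual, row_of, col_of, k)
        for i in range(n_rows):
            for j in range(n_cols):
                approx[i][j] += means[row_of[i]][col_of[j]]
    return approx

def squared_error(a, b):
    """Sum of squared entrywise differences between two matrices."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Hypothetical 4x4 matrix built from two overlapping 2x2 block patterns.
s1, s2 = [[1, 2], [3, 4]], [[10, 0], [0, 20]]
c1, c2 = [0, 0, 1, 1], [0, 1, 0, 1]  # hand-picked cluster assignments (assumption)
matrix = [[s1[c1[i]][c1[j]] + s2[c2[i]][c2[j]] for j in range(4)] for i in range(4)]

one = fit_additive(matrix, [(c1, c1)])
two = fit_additive(matrix, [(c1, c1), (c2, c2)])
print(squared_error(matrix, one), squared_error(matrix, two))
```

Two 2x2 stencils store eight numbers here, yet together they recover a matrix that a single 2x2 co-clustering cannot, which mirrors the abstract's point about summing small co-clusterings.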
Graph-Based User Behavior Modeling: From Prediction to Fraud Detection
How can we model users' preferences? How do anomalies, fraud, and spam affect our models of normal users? How can we modify our models to catch fraudsters? In this tutorial we will answer these questions, connecting graph analysis tools for user behavior modeling to anomaly and fraud detection. In particular, we will focus on the application of subgraph analysis, label propagation, and latent factor models to static, evolving, and attributed graphs. For each of these techniques we will give a brief explanation of the algorithms and the intuition behind them. We will then give examples of recent research using the techniques to model, understand and predict normal behavior. With this intuition for how these methods are applied to graphs and user behavior, we will focus on state-of-the-art research showing how the outcomes of these methods are affected by fraud, and how they have been used to catch fraudsters.
Perspective and Target Audience
Perspective: In this tutorial we focus on understanding anomaly and fraud detection through the lens of normal user behavior modeling. The data mining and machine learning communities have developed a plethora of models and methods for understanding user behavior. However, these methods generally assume that the behavior is that of real, honest people. On the other hand, fraud detection systems frequently use techniques similar to those used in modeling "normal" behavior, but fraud detection is often framed as an independent problem. However, by focusing on the relations and intersections of the two perspectives we can gain a more complete understanding of the methods and hopefully inspire new research joining these two communities.
Target Audience: This tutorial is aimed at anyone interested in modeling and understanding user behavior, from data mining and machine learning researchers to practitioners from industry and government. For those new to the field, the tutorial will cover the necessary background material to understand these systems and will offer a concise, intuitive overview of the state of the art. Additionally, the tutorial aims to offer a new perspective that will be valuable and interesting even for researchers with more experience in these domains. For those having worked in classic user behavior modeling, we will demonstrate how fraud can affect commonly used models that expect normal behavior, with the hope that future models will directly account for fraud. For those having worked in fraud detection systems, we hope to inspire new research directions through connecting with recent developments in modeling "normal" behavior.
Elastic Distributed Bayesian Collaborative Filtering
In this paper, we consider learning a Bayesian collaborative filtering model on a shared cluster of commodity machines. Two main challenges arise: (1) How can we parallelize and distribute Bayesian collaborative filtering? (2) How can our distributed inference system handle elasticity events common in a shared, resource-managed cluster, including resource ramp-up, preemption, and stragglers? To parallelize Bayesian inference, we adapt ideas from both matrix factorization partitioning schemes used with stochastic gradient descent and stale synchronous programming used with parameter servers. To handle elasticity events we offer a generalization of previous partitioning schemes that gives increased flexibility during system disruptions. We additionally describe two new scheduling algorithms to dynamically route work at runtime. In our experiments, we compare the effectiveness of both scheduling algorithms and demonstrate their robustness to system failure.
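For readers unfamiliar with the partitioning schemes for distributed stochastic gradient descent that this abstract builds on, here is a minimal sketch of the classic rotating block schedule. It is an assumption about the general family of techniques, not the generalized scheme or the schedulers proposed in the paper; the function name and the worker count are hypothetical. The ratings matrix is cut into a grid of blocks, and in each sub-epoch every worker gets a block that shares no rows or columns with any other worker's block.

```python
def strata(num_workers):
    """For each sub-epoch, list the (row_block, col_block) handled by each worker."""
    return [[(w, (w + s) % num_workers) for w in range(num_workers)]
            for s in range(num_workers)]

# Hypothetical cluster of 3 workers: 3 sub-epochs cover the full 3x3 block grid.
schedule = strata(3)
for sub_epoch, blocks in enumerate(schedule):
    print(sub_epoch, blocks)
```

Because blocks within a sub-epoch touch disjoint user and item factors, workers can update them in parallel without conflicts. A fixed rotation like this one assumes a stable set of workers, which is the assumption that elasticity events break.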