Mining Knowledge-Sharing Sites for Viral Marketing
, 2002
Cited by 138 (7 self)
Viral marketing takes advantage of networks of influence among customers to inexpensively achieve large changes in behavior. Our research seeks to put it on a firmer footing by mining these networks from data, building probabilistic models of them, and using these models to choose the best viral marketing plan. Knowledge-sharing sites, where customers review products and advise each other, are a fertile source for this type of data mining. In this paper we extend our previous techniques, achieving a large reduction in computational cost, and apply them to data from a knowledge-sharing site. We optimize the amount of marketing funds spent on each customer, rather than just making a binary decision on whether to market to him. We take into account the fact that knowledge of the network is partial, and that gathering that knowledge can itself have a cost. Our results show the robustness and utility of our approach.
Obtaining calibrated probability estimates from decision trees and naive Bayesian classifiers
- In Proceedings of the Eighteenth International Conference on Machine Learning
, 2001
Cited by 77 (3 self)
Accurate, well-calibrated estimates of class membership probabilities are needed in many supervised learning applications, in particular when a cost-sensitive decision must be made about examples with example-dependent costs. This paper presents simple but successful methods for obtaining calibrated probability estimates from decision tree and naive Bayesian classifiers. Using the large and challenging KDD'98 contest dataset as a testbed, we report the results of a detailed experimental comparison of ten methods, according to four evaluation measures. We conclude that binning succeeds in significantly improving naive Bayesian probability estimates, while for improving decision tree probability estimates, we recommend smoothing by m-estimation and a new variant of pruning that we call curtailment.
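The binning idea mentioned in this abstract can be illustrated with a minimal sketch. It assumes equal-size bins over sorted training scores and uses the fraction of positives in each bin as the calibrated estimate; the function names `fit_bins` and `calibrate` and the default of ten bins are illustrative assumptions, not details confirmed by the abstract.

```python
from bisect import bisect_left

def fit_bins(scores, labels, n_bins=10):
    """Sort training examples by raw score, split them into equal-size bins,
    and record each bin's upper score boundary and fraction of positives.
    (Hypothetical helper; the bin count is an assumption.)"""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    uppers, rates = [], []
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue  # fewer examples than bins
        uppers.append(chunk[-1][0])
        rates.append(sum(y for _, y in chunk) / len(chunk))
    return uppers, rates

def calibrate(score, uppers, rates):
    """Map a raw score to the positive rate of the first bin whose upper
    boundary is at least the score; scores above every boundary fall
    into the last bin."""
    idx = min(bisect_left(uppers, score), len(rates) - 1)
    return rates[idx]

# Toy data: raw classifier scores and true 0/1 labels.
train_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
train_labels = [0, 0, 0, 1, 0, 1, 1, 1]
uppers, rates = fit_bins(train_scores, train_labels, n_bins=2)
```

With two bins on this toy data, every score up to 0.4 is mapped to the first bin's positive rate and every higher score to the second bin's rate.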
Configuration of Detection Software: A comparison of Decision and Game Theory Approaches. Decision Analysis
- Inside Internet Security
, 2004
Cited by 5 (1 self)
Firms are increasingly relying on software to detect fraud in domains such as security, financial services, tax, and auditing. A fundamental problem in using detection software for fraud detection is achieving the optimal balance between the detection and false-positive rates. Many firms use decision theory to address the configuration problem. Decision theory is based on the presumption that the firm’s actions do not influence the behavior of fraudsters. Game theory recognizes the fact that fraudsters do modify their strategies in response to firms’ actions. In this paper, we compare decision and game theory approaches to the detection software configuration problem when firms are faced with strategic users. We find that under most circumstances firms incur lower costs when they use the game theory approach as opposed to the decision theory approach, because the decision theory approach frequently either over- or underconfigures the detection software. However, firms incur the same or lower cost under the decision theory approach compared with the game theory approach in a simultaneous-move game if configurations under decision theory and game theory are sufficiently close. A limitation of the game theory approach is that it requires user-specific utility parameters, which are difficult to estimate. Decision theory, in contrast to game theory, requires the fraud probability estimate, which is more easily obtained.
Key words: detection software; fraud detection; intrusion detection; false alarm rate; detection rate; ROC curve; decision theory; game theory
Recommendation methods for extending subscription periods
- In Proceedings of the Conference on Knowledge Discovery and Data Mining
, 2006
Cited by 4 (4 self)
Online stores providing subscription services need to extend user subscription periods as long as possible to increase their profits. Conventional recommendation methods recommend items that best match a user’s interests to maximize the purchase probability, which does not necessarily contribute to extending subscription periods. We present a novel recommendation method for subscription services that maximizes the probability of the subscription period being extended. Our method finds frequent purchase patterns among users with long subscription periods, and recommends items to a new user so as to simulate the patterns found. Using survival analysis techniques, we efficiently extract information from the log data for finding the patterns. Furthermore, we infer users’ interests from purchase histories based on maximum entropy models, and use the interests to improve the recommendations. Since a longer subscription period is the result of greater user satisfaction, our method benefits users as well as online stores. We evaluate our method using the real log data of an online cartoon distribution service for cell phones in Japan.
Scoring the Data Using Association Rules
- Applied Intelligence
, 2003
Cited by 3 (0 self)
In many data mining applications, the objective is to select data cases of a target class.
Identification of Influencers - Measuring Influence in Customer Networks
Cited by 2 (0 self)
Viral marketing refers to marketing techniques that use social networks to produce increases in brand awareness through self-replicating viral diffusion of messages, analogous to the spread of pathological and computer viruses. The idea has successfully been used by marketers to reach a large number of customers rapidly. When data about the customer network is available, centrality measures can be used in decision support systems to select influencers and spread viral marketing campaigns in a customer network. The literature on network theory describes a large number of such centrality measures. A critical question is which of these measures is best for selecting customers for a marketing campaign, an issue that little prior research has addressed. In this paper, we present the results of computational experiments based on real network data to compare different centrality measures for the diffusion of marketing messages. We found a significant lift when using central customers in message diffusion, but also found differences in the various centrality measures depending on the underlying network topology and diffusion process. More importantly, we found that in most cases the simple out-degree centrality outperforms almost all other measures. Only the SenderRank, a computationally much more complex measure that we introduce in this paper, achieved a comparable performance.
Key words: customer relationship management, viral marketing, centrality, network theory
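As a rough illustration of the out-degree centrality highlighted in this abstract, the sketch below ranks customers by the number of distinct contacts they send messages to. The edge-list representation, the function name `top_out_degree`, and the alphabetical tie-break are assumptions made for the example; SenderRank is not shown because the abstract does not define it.

```python
from collections import Counter

def top_out_degree(edges, k):
    """Return the k nodes with the highest out-degree, counting each
    distinct (sender, receiver) pair once. Ties are broken by node name
    so the result is deterministic. (Hypothetical helper.)"""
    out_degree = Counter()
    for sender, _receiver in set(edges):
        out_degree[sender] += 1
    ranked = sorted(out_degree.items(), key=lambda item: (-item[1], item[0]))
    return [node for node, _ in ranked[:k]]

# Toy customer network: who forwards messages to whom.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("a", "b"), ("c", "a"), ("a", "d")]
seeds = top_out_degree(edges, k=2)
```

Here customer "a" reaches three distinct contacts and is selected first; "b" and "c" each reach one, and the tie is resolved alphabetically.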
Brier Curves: A New Cost-Based Visualisation of Classifier Performance
Cited by 1 (0 self)
It is often necessary to evaluate classifier performance over a range of operating conditions, rather than as a point estimate. This is typically assessed through the construction of ‘curves’ over a ‘space’, visualising how one or two performance metrics vary with the operating condition. For binary classifiers in particular, cost space is a natural way of showing this range of performance, visualising loss against operating condition. However, the curves which have been traditionally drawn in cost space, known as cost curves, show the optimal loss, and hence assume knowledge of the optimal decision threshold for a given operating condition. Clearly, this leads to an optimistic assessment of classifier performance. In this paper we propose a more natural way of visualising classifier performance in cost space, which is to plot probabilistic loss on the y-axis, i.e., the loss arising from the probability estimates. This new curve provides new ways of understanding classifier performance and new tools to compare classifiers. In addition, we show that the area under this curve is exactly the Brier score, one of the most popular performance metrics for probabilistic classifiers.
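For reference, the Brier score named in this abstract is the mean squared difference between predicted probabilities and the observed 0/1 outcomes. The sketch below computes only that score for a binary classifier; it does not construct the Brier curve itself, and the function name `brier_score` is an illustrative choice.

```python
def brier_score(probs, labels):
    """Mean squared error between predicted positive-class probabilities
    and the true 0/1 labels. Lower is better; 0.0 is a perfect score."""
    if not probs or len(probs) != len(labels):
        raise ValueError("probs and labels must be non-empty and equal in length")
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)

perfect = brier_score([1.0, 0.0], [1, 0])
uninformative = brier_score([0.5, 0.5], [1, 0])
```

A classifier that always predicts 0.5 scores 0.25 on any binary data set, which is a useful baseline when reading Brier scores.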
$\sum_{i=1} \left( -y_i \log(p_i) - (1 - y_i) \log(1 - p_i) \right)$ • Exponential loss / Boosting loss: [1]
There are the following metrics for evaluating the estimated probabilities:
Influential Marketing: A New Direct Marketing Strategy Addressing the Existence of Voluntary Buyers
Probabilistic user behavior models in online stores for recommender systems
Recommender systems are widely used in online stores because they are expected to improve both user convenience and online store profit. As such, a number of recommendation methods have been proposed in recent years. Functions required for recommender systems vary significantly depending on business models and/or situations. Although an online store can acquire various kinds of information about user behavior, such as purchase histories and visiting times, this information has not yet been fully used to fulfill the diverse requirements. In this thesis, we propose probabilistic user behavior models for use in online stores to tailor recommender systems to diverse requirements efficiently using various kinds of user behavior information. The probabilistic model-based approach allows us to systematically integrate heterogeneous user behavior information using the rules of probability theory. In particular, we consider three requirements for recommender

