Prodromidis L., and Stolfo S. J. (1999) "Agent-Based Distributed Learning Applied to Fraud Detection," Technical Report, CUCS-014-99.
....effective and efficient meta-classifiers. We evaluated the effectiveness of the proposed methods through experiments performed on real credit card data provided by two different financial institutions, where the target application is to compute predictive models that detect fraudulent transactions [Prodromidis and Stolfo, 1999a]. Our empirical study presents and compares the results of the different pruning techniques under three realistic evaluation metrics (accuracy, TP-FP spread, and a cost model fitted to the credit card fraud detection problem) and conducts an in-depth analysis of the strengths and weaknesses of these ....
Prodromidis, A. L., and Stolfo, S. J. 1999a. Agent-based distributed learning applied to fraud detection. CUCS-014-99.
....by its performance (e.g., accuracy, cost model) on a validation set. The final prediction is decided by summing over all weighted votes and by choosing the class with the highest aggregate. For a binary classification problem, for example, where each classifier $C_i$ with weight $w_i$ casts a 0 vote for class $y_1$ and a 1 vote for class $y_2$, the aggregate is given by: $S(x) = \frac{\sum_{i=1}^{K} w_i C_i(x)}{\sum_{i=1}^{K} w_i}$ (1). If we choose 0.5 to be the threshold distinguishing classes $y_1$ and $y_2$, the weighted voting method classifies unlabeled instances $x$ as $y_1$ if $S(x) < 0.5$, as $y_2$ if $S(x)$ ....
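The weighted-voting rule in Equation (1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `weighted_vote`, the toy classifiers, and the weights are hypothetical, and because the excerpt is cut off before it states what happens at or above the threshold, the sketch assumes that any $S(x) \geq 0.5$ is labeled $y_2$.

```python
from typing import Callable, Sequence, Tuple


def weighted_vote(
    classifiers: Sequence[Callable[[object], int]],
    weights: Sequence[float],
    x: object,
    threshold: float = 0.5,
) -> Tuple[float, str]:
    """Return (S(x), label) following Equation (1).

    Each classifier returns 0 (a vote for y1) or 1 (a vote for y2).
    Assumption: S(x) >= threshold maps to y2; the excerpt is truncated there.
    """
    if len(classifiers) != len(weights):
        raise ValueError("need exactly one weight per classifier")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    # Weighted sum of votes, normalised by the total weight.
    s = sum(w * c(x) for c, w in zip(classifiers, weights)) / total_weight
    return s, ("y1" if s < threshold else "y2")


# Hypothetical base classifiers: one votes y1 (0), two vote y2 (1).
toy_classifiers = [lambda x: 0, lambda x: 1, lambda x: 1]
score, label = weighted_vote(toy_classifiers, [0.9, 0.8, 0.7], x=None)
print(round(score, 3), label)
```

With these made-up weights the two $y_2$ voters carry 1.5 of the 2.4 total weight, so the aggregate lands above the threshold.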
....rate, even when pruning is as heavy as 80%. The last phase of the post-training pruning algorithm aims to re-combine the remaining base classifiers (those included in the decision tree model) using the original meta-learning algorithm. To evaluate the effectiveness of the prun.... [Figure: total accuracy versus degree of pruning for the Chase stacking meta-classifiers; legend: Bayes stacking, Ripper stacking, CART stacking, decision tree for Ripper, decision tree for Bayes] ....
[Article contains additional citation context not shown here]
Prodromidis, A. L. and Stolfo, S. J. (1999a), Agent-based distributed learning applied to fraud detection. CUCS-014-99.
....on the evaluation of the post-training pruning algorithm as a general method for reducing the size of an ensemble meta-classifier. Detailed information on effective fraud detectors with extensive results (TP-FP spread and cost model) from the mining of these credit card data sets can be found in [31]. 4.1 Computing Base Classifiers. The first step involves the training of the base classifiers. We split each data set into 12 subsets and distribute them across six different data sites (each site storing two subsets). Then we apply the 5 learning algorithms on each subset of data, therefore ....
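The partitioning step described above (12 subsets, six sites with two subsets each, five learning algorithms per subset) can be laid out as a small planning sketch. The function `plan_base_classifiers` and the contiguous, near-equal split are assumptions made for illustration; the excerpt does not say how records were assigned to subsets. The algorithm names are taken from a figure legend quoted elsewhere on this page and are assumed to be the five algorithms meant here.

```python
from typing import List, Tuple

# Assumed from the figure legend quoted elsewhere on this page.
ALGORITHMS = ["Bayes", "C45", "CART", "ID3", "Ripper"]


def plan_base_classifiers(
    n_records: int,
    n_subsets: int = 12,
    n_sites: int = 6,
    algorithms: List[str] = ALGORITHMS,
) -> List[Tuple[int, int, str, int]]:
    """List one (site, subset, algorithm, subset_size) entry per base classifier.

    Hypothetical layout: records are cut into contiguous, near-equal subsets
    and consecutive subsets are stored at the same site.
    """
    if n_subsets % n_sites != 0:
        raise ValueError("subsets must divide evenly across sites")
    per_site = n_subsets // n_sites
    # Integer boundaries of each contiguous subset.
    bounds = [(i * n_records) // n_subsets for i in range(n_subsets + 1)]
    plan = []
    for subset in range(n_subsets):
        size = bounds[subset + 1] - bounds[subset]
        site = subset // per_site
        for algorithm in algorithms:
            plan.append((site, subset, algorithm, size))
    return plan


plan = plan_base_classifiers(n_records=1200)  # record count is made up
print(len(plan))  # 12 subsets x 5 algorithms
```

Under these assumptions the plan contains 12 × 5 = 60 entries per data set, one per trained base classifier.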
....they are within 80% of the original performance. The throughput improvement in this case is 5.08 and 9.92 times, respectively. In absolute numbers, the best meta-classifier achieves on average 89.66% accuracy, a 0.632 TP-FP spread, and saves 903K per data subset under a realistic cost model [31] (maximum savings: 1,470), and the best First Union meta-classifier achieves 96.53% accuracy, a 0.848 TP-FP spread, and 950K per data subset (maximum savings: 1,085). In contrast, the best base classifier in Chase is 88.5% accurate, with a 0.551 TP-FP spread and 812K in savings, while the best First ....
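For readers unfamiliar with the two rate-based metrics quoted above, the following sketch computes them from confusion-matrix counts. It assumes that the TP-FP spread is the true-positive rate minus the false-positive rate, which the excerpt itself does not define; the function name and the counts are hypothetical, and the cost model is omitted because its parameters are not given here.

```python
from typing import Tuple


def accuracy_and_spread(tp: int, fp: int, tn: int, fn: int) -> Tuple[float, float]:
    """Return (accuracy, TP-FP spread) from confusion-matrix counts.

    Assumption: spread = TP rate - FP rate, with fraud as the positive class.
    """
    total = tp + fp + tn + fn
    if total == 0 or (tp + fn) == 0 or (fp + tn) == 0:
        raise ValueError("need at least one positive and one negative example")
    accuracy = (tp + tn) / total
    tp_rate = tp / (tp + fn)   # fraction of frauds that are caught
    fp_rate = fp / (fp + tn)   # fraction of legitimate transactions flagged
    return accuracy, tp_rate - fp_rate


# Made-up counts for a 1,000-transaction subset with 20% fraud.
acc, spread = accuracy_and_spread(tp=150, fp=40, tn=760, fn=50)
print(round(acc, 3), round(spread, 3))
```

On skewed data such as this, accuracy alone can look high even when many frauds are missed, which is one reason to report the spread alongside it.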
A. L. Prodromidis and S. J. Stolfo. Agent-based distributed learning applied to fraud detection. CUCS-014-99, 1999.
....and 15% versus 85% for First Union bank. The schemata (or feature sets) of the databases were developed over years of experience and continuous analysis by bank personnel to capture important information for fraud detection. We cannot reveal the details of the schema beyond what is described in [19]. The records have a fixed length of 137 bytes each and about 30 attributes, including the binary class label (f n). Some of the fields are numeric and the rest categorical, i.e., numbers were used to represent a few discrete categories. The features in this data defined by the banks essentially ....
A. L. Prodromidis and S. J. Stolfo. Agent-based distributed learning applied to fraud detection. In Sixteenth National Conference on Artificial Intelligence, 1999. Submitted for publication.
....on the evaluation of the post-training pruning algorithm as a general method for reducing the size of an ensemble meta-classifier. Detailed information on effective fraud detectors with extensive results (TP-FP spread and cost model) from the mining of these credit card data sets can be found in [20]. [Figure: average accuracy of Chase base classifiers, plotted per base classifier; legend: Bayes, C45, CART, ID3, Ripper] [Figure: average accuracy of First ....
....Figure 5: Bar charts of the accuracy (black), TP-FP (dark gray), savings (light gray), and throughput (very light gray) of the Chase (right) and First Union (left) meta-classifiers as a function of the degree of pruning. .... TP-FP spread and saves 903K per data subset under a realistic cost model [20] (maximum savings: 1,470), and the best First Union meta-classifier achieves 96.53% accuracy, a 0.848 TP-FP spread, and 950K per data subset (maximum savings: 1,085). In contrast, the best base classifier in Chase is 88.5% accurate, with a 0.551 TP-FP spread and 812K in savings, while the best First ....
A. L. Prodromidis and S. J. Stolfo. Agent-based distributed learning applied to fraud detection. In Sixteenth National Conference on Artificial Intelligence. Submitted for publication.
....personnel to capture important information for fraud detection. The records have a fixed length of 137 bytes each and about 30 numeric and categorical attributes, including the binary class label (fraud/legitimate transaction). We cannot reveal the details of the schema beyond what is described in [16]. Chase bank data consisted of 20% fraud and 80% legitimate transactions, whereas First Union data had a 15% versus 85% fraud/legitimate distribution. The learning task is to compute classifiers that correctly discern fraudulent from legitimate transactions. To evaluate and compare the ....
A. L. Prodromidis and S. J. Stolfo. Agent-based distributed learning applied to fraud detection. In Sixteenth National Conference on Artificial Intelligence. Submitted for publication.
No context found.
Prodromidis L., and Stolfo S. J. (1999) "Agent-Based Distributed Learning Applied to Fraud Detection," Technical Report, CUCS-014-99.
No context found.
Prodromidis, A. L. and Stolfo, S. J. (1999). Agent-Based Distributed Learning Applied to Fraud Detection. Technical Report CUCS-014-99, Department of Computer Science, Columbia University.