
On Comparing Classifiers: A Critique of Current Research and Methods (1995)  (10 citations)
Steven L. Salzberg



View or download:
tigr.org/~salzberg/critique.ps
jhu.edu/~sheppard/cs.60...paper2a.ps.gz
cs.technion.ac.il/~cs2367...critique.ps

From:  tigr.org/~salzberg/ (more)
From:  jhu.edu/~sheppard/cs.605....sched

Paper not published in DMKD 1999.

Abstract: An important component of many data mining projects is finding a good classification algorithm, a process that requires very careful thought about experimental design. If not done very carefully, comparative studies of classification and other types of algorithms can easily result in statistically invalid conclusions. This is especially true when one is using data mining techniques to analyze very large databases, which inevitably contain some statistically unlikely data. This paper describes...

Context of citations to this paper:

...how well those algorithms perform on standard data sets (e.g. UC Irvine repository [20]) or in a particular domain. As pointed out in [28], such comparisons of algorithms may be misleading since the performance of the classifiers they produce depends strongly on the specific...

...a simple, easily understood, and well-studied distribution. Its use for evaluation has been discussed for classifiers [1] [10] [15] [41] [42] and for evaluation and tuning in a variety of other domains, including inductive learning systems [18] and speech recognition...

Cited by:
Diversity in Ensemble Feature Selection - Tsymbal, Pechenizkiy, Cunningham (2003)
The PNC 2 Cluster Algorithm - An integrated learning algorithm.. - Haendel (2003)
A Stratified Methodology for Classifier and Recognizer.. - Micheals, Boult

Similar documents (at the sentence level):
79.8%:   On Comparing Classifiers: Pitfalls to Avoid and a Recommended.. - Salzberg (1997)

Active bibliography (related documents):
0.3:   Multiple Comparisons in Induction Algorithms - Jensen, Cohen (1998)
0.3:   Theory Refinement of Bayesian Networks with Hidden Variables - Ramachandran (1998)
0.3:   A Radial Basis Function Approach to Financial Time Series Analysis - Hutchinson (1994)

Similar documents based on text:
0.1:   Prediction of Transcription Terminators in Bacterial.. - Ermolaeva, Khalak.. (2000)
0.1:   Comment on "Setting Confidence Intervals for - Bounded Parameters By
0.1:   A Weighted Nearest Neighbor Algorithm for Learning with.. - Cost, Salzberg (1993)

Related documents from co-citation:
5:   A study of cross-validation and bootstrap for accuracy estimation and model sele.. - Kohavi - 1995
3:   A dilemma for fitness sharing with a scaling function - Darwen, Yao - 1995
3:   Multi-Interval Discretization of Continuous-Valued Attributes for Classification.. - Fayyad, Irani - 1993

BibTeX entry:

S. L. Salzberg. On comparing classifiers: A critique of current research and methods. Technical Report CS-1995-06, Johns Hopkins University, 1995. http://citeseer.ist.psu.edu/salzberg95comparing.html

@techreport{ salzbergsalzbergcomparing,
    author = "Steven L. Salzberg",
    title = "On Comparing Classifiers: {A} Critique of Current Research and Methods",
    institution = "Johns Hopkins University",
    number = "CS-1995-06",
    year = "1995",
    url = "citeseer.ist.psu.edu/salzberg95comparing.html" }
Citations (may not include all citations):
256   Parallel networks that learn to pronounce English text - Sejnowski, Rosenberg - 1987
216   Very simple classification rules perform well on most common.. - Holte - 1993
203   Multi-interval discretization of continuous valued attribute.. - Fayyad, Irani - 1993
84   A conservation law for generalization performance - Schaffer - 1994
82   UCI repository of machine learning databases -- a machine-re.. - Murphy - 1995
70   Predicting the secondary structure of globular proteins usin.. - Qian, Sejnowski - 1988
56   Generalizing from case studies: A case study - Aha - 1992
55   Symbolic and neural learning algorithms: An experimental com.. - Shavlik, Mooney et al. - 1991
45   An experimental comparison of the nearest-neighbor and neare.. - Wettschereck, Dietterich - 1995
45   On the connection between in-sample testing and generalization .. - Wolpert - 1992
43   Bayesian model selection in social research - Raftery - 1995
41   Machine learning as an experimental science - Kibler, Langley - 1988
20   Experimental Designs - Cochran, Cox - 1957
19   A study of experimental evaluations of neural network algori.. - Prechelt - 1995
14   Which method learns most from the data - Feelders, Verkooijen - 1995
7   Knowledge discovery through induction with randomization tes.. - Jensen - 1991
6   Statistical significance in inductive learning - Gascuel, Caraux - 1992
3   Statistical tests for comparing supervised learning algorith.. - Dietterich - 1996
3   Review of Economics and Statistics - Denton - 1985
2   Statistical Thinking for Behavioral Scientists - Hildebrand - 1986
2   Labeling space: A tool for thinking about significance testi.. - Jensen - 1995
1   Overfitting in inductive learning algorithms: Why it occurs .. - Jensen, Cohen - 1996





Documents on the same site (http://www.tigr.org/~salzberg/):
Best-Case Results for Nearest Neighbor Learning - Salzberg, Delcher, Heath, Kasif (1995)
Towards a Better Understanding of Memory-Based Reasoning Systems - Rachlin (1994)
Book Review: "C4.5: Programs for Machine Learning" by J. Ross.. - Salzberg (1994)


CiteSeer.IST - Copyright Penn State and NEC