Results 1 - 10 of 202,515

Empirical evaluation of feature subset selection based on a real world data set

by Petra Perner, Chid Apte - In Proc. PKDD-2000 (LNAI 1910), 2000
"... Abstract. Selecting the right set of features for classification is one of the most important problems in designing a good classifier. Decision tree induction algorithms such as C4.5 have incorporated in their learning phase an automatic feature selection strategy while some other statistical classi ..."
Cited by 4 (0 self)
classification algorithms require the feature subset to be selected in a preprocessing phase. It is well known that correlated and irrelevant features may degrade the performance of the C4.5 algorithm. In our study, we evaluated the influence of feature pre-selection on the prediction accuracy of C4.5 using a real-world

Comparing ICP Variants on Real-World Data Sets (Autonomous Robots manuscript)

by François Pomerleau, Francis Colas, Roland Siegwart, Stéphane Magnenat
"... protocol. ..."
Abstract not found

Parallel Evolutionary Algorithms with SOM-Like Migration and their Application to Real World Data Sets

by Th. Villmann, R. Haupt, K. Hering, H. Schulze
"... We introduce a multiple subpopulation approach for parallel evolutionary algorithms the migration scheme of which follows a SOM-like dynamics. We successfully apply this approach to clustering in both VLSI-design and psychotherapy research. The advantages of the approach are shown which consist in a ..."
We introduce a multiple-subpopulation approach for parallel evolutionary algorithms whose migration scheme follows SOM-like dynamics. We successfully apply this approach to clustering in both VLSI design and psychotherapy research. The advantage of the approach is a reduced communication overhead between the subpopulations while preserving a non-vanishing information flow.

Based Clustering Over Data Stream

by Charlie Isaksson, Margaret Dunham, Michael Hahsler
"... • Real-world data set • Sensitivity to parameters • Scalability and complexity ..."

Multi-Interval Discretization of Continuous-Valued Attributes for Classification Learning

by Keki B. Irani, Usama M. Fayyad - IJCAI, 1993
"... Abstract Since most real-world applications of classification learning involve continuous-valued attributes, properly addressing the discretization process is an important problem. This paper addresses the use of the entropy minimization heuristic for discretizing the range of a continuous-valued a ..."
Cited by 832 (7 self)
formally derive a criterion based on the minimum description length principle for deciding the partitioning of intervals. We demonstrate via empirical evaluation on several real-world data sets that better decision trees are obtained using the new multi-interval algorithm.

Power-law distributions in empirical data

by Aaron Clauset, Cosma Rohilla Shalizi, M. E. J. Newman - ISSN 00361445. doi: 10.1137/070710111. URL http://dx.doi.org/10.1137/070710111, 2009
"... Power-law distributions occur in many situations of scientific interest and have significant consequences for our understanding of natural and man-made phenomena. Unfortunately, the empirical detection and characterization of power laws is made difficult by the large fluctuations that occur in the t ..."
Cited by 607 (7 self)
demonstrate these methods by applying them to twenty-four real-world data sets from a range of different disciplines. Each of the data sets has been conjectured previously to follow a power-law distribution. In some cases we find these conjectures to be consistent with the data while in others the power law

SMOTE: Synthetic Minority Over-sampling Technique

by Nitesh V. Chawla, Kevin W. Bowyer, Lawrence O. Hall, W. Philip Kegelmeyer - Journal of Artificial Intelligence Research, 2002
"... An approach to the construction of classifiers from imbalanced datasets is described. A dataset is imbalanced if the classification categories are not approximately equally represented. Often real-world data sets are predominately composed of 'normal' examples with only a small percentag ..."
Cited by 634 (27 self)

Toward Optimal Active Learning through Sampling Estimation of Error Reduction

by Nicholas Roy, Andrew McCallum - In Proc. 18th International Conf. on Machine Learning, 2001
"... This paper presents an active learning method that directly optimizes expected future error. This is in contrast to many other popular techniques that instead aim to reduce version space size. These other methods are popular because for many learning models, closed form calculation of the expec ..."
Cited by 353 (2 self)
of the expected future error is intractable. Our approach is made feasible by taking a sampling approach to estimating the expected reduction in error due to the labeling of a query. In experimental results on two real-world data sets we reach high accuracy very quickly, sometimes with four times fewer

Rough Sets.

by Zdzisław Pawlak - Int. J. of Information and Computer Sciences, 1982
"... Abstract. This article presents some general remarks on rough sets and their place in the general picture of research on vagueness and uncertainty - concepts of utmost interest, for many years, for philosophers, mathematicians, logicians and recently also for computer scientists and engineers particular ..."
Cited by 793 (13 self)
particularly those working in such areas as AI, computational intelligence, intelligent systems, cognitive science, data mining and machine learning. Thus this article is intended to present some philosophical observations rather than to consider technical details or applications of rough set theory. Therefore

Statistical Comparisons of Classifiers over Multiple Data Sets

by Janez Demšar, 2006
"... While methods for comparing two learning algorithms on a single data set have been scrutinized for quite some time already, the issue of statistical tests for comparisons of more algorithms on multiple data sets, which is even more essential to typical machine learning studies, has been all but igno ..."
Cited by 744 (0 self)