J. Furnkranz. More efficient windowing. J. Artificial Intelligence Research, 8:129-164, 1998.
....consider it as the process of increasing the speed of data mining. Although large data sets are necessary for reliable results, large databases are not necessarily advantageous, for the following reasons: not all data is informative [14, 19, 15]; databases contain a high degree of redundancy [9, 11]; and experimental studies on the entire database are expensive [3]. This is the basic problem in genebank collections and in the drug industry. For example, to conduct genetic studies on a single gene we need many resources, and it is impossible to conduct studies on all of the genes. So ....
....mining algorithm by removing irrelevant attributes, which is achieved at the expense of reduced scalability to very large databases.
4.6 Sampling
Sampling refers to the process of selecting a subset of the large database. Sampling is an efficient and effective approach to speed up data mining [12, 11]. The fundamental principle of sampling in data mining is summarized in the following hypothesis: Sampling hypothesis: A data mining algorithm equipped with a sampling method constructs, in less time and with fewer resources, a theory that is no worse in predictive accuracy than that obtained from ....
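The sampling hypothesis quoted above can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not taken from the cited or citing papers: the synthetic one-dimensional data set, the 10% label noise, the threshold ("stump") learner, and the names make_data, fit_stump and accuracy. It uses plain uniform random sampling, not the windowing procedure of the cited article.

```python
import random

def make_data(n, rng):
    """Synthetic 1-D examples: label is 1 when x > 0.5, with 10% label noise."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < 0.1:  # flip the label for roughly 10% of examples
            y = 1 - y
        data.append((x, y))
    return data

def fit_stump(data):
    """Return the threshold t minimising training errors for 'predict 1 if x > t'."""
    pts = sorted(data)
    # Threshold below every point: everything is predicted 1,
    # so the errors are exactly the examples labelled 0.
    errors = sum(1 for _, y in pts if y == 0)
    best_err, best_t = errors, -1.0
    for x, y in pts:
        # Raising the threshold to x makes this point predicted 0.
        errors += 1 if y == 1 else -1
        if errors < best_err:
            best_err, best_t = errors, x
    return best_t

def accuracy(t, data):
    """Fraction of examples classified correctly by threshold t."""
    return sum(int(x > t) == y for x, y in data) / len(data)

rng = random.Random(0)            # fixed seed so the run is repeatable
full = make_data(20000, rng)      # stands in for the "large database"
test = make_data(5000, rng)       # held-out data for measuring accuracy
sample = rng.sample(full, 500)    # uniform random sample, 2.5% of the data

t_full = fit_stump(full)
t_sample = fit_stump(sample)
acc_full = accuracy(t_full, test)
acc_sample = accuracy(t_sample, test)
print(f"full data: threshold={t_full:.3f} accuracy={acc_full:.3f}")
print(f"sample:    threshold={t_sample:.3f} accuracy={acc_sample:.3f}")
```

On data this simple, the threshold learned from the sample should score close to the one learned from the full set while sorting and scanning far fewer examples. Whether the same holds for a real database depends on the redundancy and informativeness of its records.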