An Information-Theoretic Approach to Data Mining (1999)

by M Last, O Maimon

Results 1 - 3 of 3

Anytime Algorithm for Feature Selection

by Mark Last, Abraham Kandel, Oded Maimon, Eugene Eberbach
"... Feature selection is used to improve performance of learning algorithms by finding a minimal subset of relevant features. Since the process of feature selection is computationally intensive, a trade-off between the quality of the selected subset and the computation time is required. In this paper, ..."
Abstract - Cited by 6 (2 self) - Add to MetaCart
Feature selection is used to improve the performance of learning algorithms by finding a minimal subset of relevant features. Since the process of feature selection is computationally intensive, a trade-off between the quality of the selected subset and the computation time is required. In this paper, we present a novel anytime algorithm for feature selection, which gradually improves the quality of its results as computation time increases. The algorithm is interruptible, i.e., it can be stopped at any time and provide a partial subset of selected features. The quality of results is monitored by a new measure: fuzzy information gain. The algorithm's performance is evaluated on several benchmark datasets.
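The sketch below illustrates the anytime, interruptible flavor of such a feature selection loop: a greedy forward search that yields its current best subset after every step, so it can be stopped at any point with a partial result. Plain information gain over discrete features stands in for the paper's fuzzy information gain, and the function names and toy data are illustrative assumptions, not the authors' implementation.

import math
from collections import Counter

def entropy(labels):
    # Shannon entropy of a list of class labels, in bits
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, feature):
    # Reduction in label entropy after splitting on one discrete feature
    total = entropy(labels)
    groups = {}
    for row, y in zip(rows, labels):
        groups.setdefault(row[feature], []).append(y)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return total - remainder

def anytime_feature_selection(rows, labels, features):
    # Greedy forward selection; yields a progressively larger subset so the
    # caller can break out of the loop whenever the time budget is exhausted
    selected, remaining = [], list(features)
    while remaining:
        best = max(remaining, key=lambda f: information_gain(rows, labels, f))
        selected.append(best)
        remaining.remove(best)
        yield list(selected)

# Usage: iterate and stop whenever time runs out; the last yielded subset
# is the partial answer
rows = [{"a": 0, "b": 1}, {"a": 1, "b": 1}, {"a": 0, "b": 0}, {"a": 1, "b": 0}]
labels = [0, 1, 0, 1]
for subset in anytime_feature_selection(rows, labels, ["a", "b"]):
    print(subset)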

Automated Perceptions in Data Mining

by Mark Last, Abraham Kandel, 1999
"... Visualization is known to be one of the most efficient data mining approaches. The human eye can capture complex patterns and relationships, along with detecting the outlying (exceptional) cases in a data set. The main limitation of the visual data analysis is its poor scalability: it is hardly appl ..."
Abstract - Cited by 5 (4 self) - Add to MetaCart
Visualization is known to be one of the most efficient data mining approaches. The human eye can capture complex patterns and relationships, along with detecting the outlying (exceptional) cases in a data set. The main limitation of visual data analysis is its poor scalability: it is hardly applicable to data sets of high dimensionality. We use the concepts of Fuzzy Set Theory to automate the process of human perception. The automated tasks include comparison of frequency distributions, evaluating the reliability of dependent variables, and detecting outliers in noisy data. Multiple perceptions (related to different users) can be represented by adjusting the parameters of the fuzzy membership functions. The applicability of automated perceptions is demonstrated on several real-world data sets.
Keywords: Data mining, Fuzzy set theory, data visualization, data perception, rule extraction.
1. Introduction: Fayyad et al. [3] have defined the process of knowledge discovery in databases (K...
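As a rough illustration of one such automated perception, the sketch below grades how strongly a value is perceived as an outlier using a simple fuzzy membership function; the ramp shape, the parameter names, and the idea of tuning them per user are assumptions made here for illustration, not the membership functions used in the paper.

def outlier_membership(value, mean, std, lo=2.0, hi=4.0):
    # Degree in [0, 1] to which `value` is perceived as an outlier.
    # lo/hi give the number of standard deviations where "somewhat unusual"
    # begins and where "definitely an outlier" is reached; adjusting them
    # models different users' perceptions.
    z = abs(value - mean) / std
    if z <= lo:
        return 0.0
    if z >= hi:
        return 1.0
    return (z - lo) / (hi - lo)  # linear ramp between the two thresholds

# A strict analyst vs. a tolerant one: same data point, different perception
print(outlier_membership(17.0, mean=10.0, std=2.0, lo=1.5, hi=3.0))  # -> 1.0
print(outlier_membership(17.0, mean=10.0, std=2.0, lo=3.0, hi=6.0))  # -> ~0.17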

Knowledge Discovery in Mortality Records: An Info-Fuzzy Approach

by Mark Last, Oded Maimon, Abraham Kandel
"... Introduction Causes of death are a constant subject of medical and demographic research. The issues of interest include the change of the leading causes over time, geographical patterns of deadly diseases, life expectancy, tracking diseases to genetic factors, and many others (Blij and Murphy, 1998 ..."
Abstract - Cited by 1 (0 self) - Add to MetaCart
Introduction: Causes of death are a constant subject of medical and demographic research. The issues of interest include the change of the leading causes over time, geographical patterns of deadly diseases, life expectancy, tracking diseases to genetic factors, and many others (Blij and Murphy, 1998). The importance of these studies goes beyond the area of medical research and the health care industry. For example, life insurance companies have to analyze mortality data when determining the terms of their policies. Unfortunately, the tools used in research on death causality have so far been limited to standard statistical techniques, such as summarization, regression, and analysis of variance. In the last decade, more advanced methods of finding patterns in data have become known under the general name of "data mining", and a formal framework for the entire process...

Citation Context

...a. The methods implemented by us in the different stages of this project are based on two approaches to knowledge discovery, developed in our previous work: information-theoretic connectionist model (Last and Maimon, 1999; Maimon et al., 1999) and automated perceptions (Last and Kandel, 1999a). The information-theoretic model is used to find a minimum set of attributes associated with the cause of death and to represen...
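A minimal sketch of the mutual-information criterion underlying such a model: rank candidate attributes by their mutual information with the target (the cause of death) and keep only those that carry information about it. The toy records and the selection cutoff are assumptions for illustration; the actual info-fuzzy network construction procedure is more elaborate.

import math
from collections import Counter

def mutual_information(xs, ys):
    # Mutual information (in bits) between two discrete variables
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(c / n * math.log2((c / n) / (px[x] / n * py[y] / n))
               for (x, y), c in pxy.items())

# Toy records: (age group, region, cause of death) -- illustrative only
records = [
    ("old", "north", "heart"), ("old", "south", "heart"),
    ("young", "north", "accident"), ("young", "south", "accident"),
]
causes = [r[2] for r in records]
for idx, name in [(0, "age group"), (1, "region")]:
    mi = mutual_information([r[idx] for r in records], causes)
    print(f"{name}: MI = {mi:.2f} bits")  # keep attributes with non-negligible MI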
