R. Hoschka and W. Klösgen, A support system for interpreting statistical data, in: Knowledge Discovery in Databases, 1991, pp. 325--345.
....and they find regularities related to this concept but leave any other potentially interesting phenomena unfound. The advantage of those systems is that the patterns they find are more expressive than the relatively simple rules that we use. Something can also be said of various KDD systems. Explora [14, 15] finds interesting instances of statistical patterns. The patterns discovered by 49er [16] are contingency tables, equations, and logical equivalences. The Key Finding Reporter (Kefir) [17, 18], tailored with a lot of domain knowledge, discovers and explains deviations, and gives recommendations for ....
....related rules. While creating a focus, simple threshold-like restrictions, such as rule frequency and confidence, may satisfy a large number of rules. In our approach, this problem can be alleviated by selecting rules to or removing rules from the view by templates [32]. Hoschka and Klösgen [14] have also used templates for defining interesting knowledge, and their ideas have strongly influenced our work. Their approach is based on few fixed statement types and partial ordering of attributes, whereas our approach is closer to regular expressions. We define templates as simple regular ....
P. Hoschka and W. Klösgen, A support system for interpreting statistical data, In G. Piatetsky-Shapiro and W.J. Frawley (eds.), Knowledge Discovery in Databases, AAAI Press, Menlo Park, California, pp. 325-345, 1991.
....Guillet Philipp 1998] discrimination [Gray Orlowska 1998]. Kamber and Shinghal [1996] propose specific measures of rule interestingness for characteristic rules based on necessity and sufficiency. Specifically with respect to the issue of redundancy not so much work has been carried out. Hoschka and Klösgen [1991] deal with the problem of redundancy in their Explora system. It uses partial orderings of attributes and attribute sets to avoid presenting several kinds of redundant knowledge. Bayardo [1997] proposes a pruning strategy called redundancy exploitation. The idea is to prevent continued effort at ....
Hoschka R., and Klösgen W. (1991). A Support System for Interpreting Statistical Data, in Knowledge Discovery in Databases, 325-345.
....is considered 1. Their algorithm would find and report this. Finally, there is the problem of displaying the large volume of results. For example, on the Mushroom data set they found 299811 borders, each representing about $2^{18}$ sets. This is far too many results to show to an end user. Explora (Hoschka & Klosgen, 1991; Klosgen, 1996) searches for subgroups of cases with unusual distributions of a target variable with respect to a parent population. For example, the target variable could be the mean salary which is larger for the subgroups gender = male, education 15 years, and race = white (Klosgen, 1993) ....
Hoschka, P., & Klosgen, W. (1991). A support system for interpreting statistical data. In G. Piatetsky-Shapiro and W. J. Frawley (Eds.), Knowledge discovery in databases, 325--346. AAAI Press.
....learns a Bayesian network whose shape is a tree. Kutat[7] demonstrated further that more complex structure learning was possible from quite reasonable sample sizes. Other early work on structure learning was often based on identification methods and the reader is directed towards [4,5,8,9] and to [34,35,36,37] for systems that extract knowledge from databases. In this paper we present two models that represent methodologies of semi automated expert system construction. Using a domain specific database describing AAP both models extract the structure and corresponding branch parameters, directly from ....
Hoschka, P Klosgen, W: A support System for Interpreting Statistical Data, in Knowledge Discovery in Databases, (1991), Cambridge, MA: AAAI/MIT, pp. 325 - 345
....and Philipp, 1998) discrimination (Gray and Orlowska, 1998). Kamber and Shinghal (1996) propose specific measures of rule interestingness for characteristic rules based on necessity and sufficiency. Specifically with respect to the issue of redundancy not so much work has been carried out. Hoschka and Klösgen (1991) deal with the problem of redundancy in their Explora system. It uses partial orderings of attributes and attribute sets to avoid presenting several kinds of redundant knowledge. Bayardo (1997) proposes a pruning strategy called redundancy exploitation. The idea is to prevent continued effort at ....
Hoschka R., and Klösgen W. (1991). A Support System for Interpreting Statistical Data, in Knowledge Discovery in Databases, 325-345.
....similar to powerful statistical data exploration systems, while others essentially seek to put the capacity for extracting knowledge from data in the hands of the domain expert rather than a professional data analyst. These must thus use domain terminology. An example of general tool is Explora (Hoschka and Klösgen 1991; Klösgen 1996) and examples of more domain specific tools are the Interactive Data Exploration and Analysis system of AT&T (Selfridge, Srivastava, and Wilson 1996) which permits one to segment market data and analyze the effect of new promotions and advertisements, and Advanced Scout (Bhandari ....
Hoschka, P., and Klösgen, W. (1991), "A Support System for Interpreting Statistical Data," in Knowledge Discovery in Databases, eds. G.
....They use a Minimum Description Length approach where surprising patterns are those with long encoding costs. Our work is fundamentally different. We find differences between two or more probability distributions, whereas they find changes in a single distribution as it varies through time. Explora [8] searches for subgroups of cases with unusual distributions with respect to a target variable ($T$) and the parent population: i.e. it finds a subpopulation $G_s \subset G_p$ such that $P(T \mid G_s) \neq P(T \mid G_p)$. In contrast, our goal is, given the groups $G_1$ and $G_2$, to find conjunctions of variables ....
P. Hoschka and W. Klosgen. A support system for interpreting statistical data. In G. Piatetsky-Shapiro and W. J. Frawley, editors, Knowledge Discovery in Databases, pages 325--346. AAAI Press, 1991.
....We begin with a review of databases, identifying some of their characteristics that make automated discovery challenging. We then propose a model of an idealized KDD system and outline its essential components. Finally, we compare three KDD systems: CoverStory tm [Schmitz et al. 1990], EXPLORA [Hoschka and Klosgen, 1991], and the Knowledge Discovery Workbench [Piatetsky-Shapiro and Matheus, 1991]. 2. Database Issues. A database is an integrated collection of data maintained in one or more files, organized to facilitate the efficient storage, modification, and retrieval of related information [Date, 1977] ....
....analysis on samples of data by selecting specific fields and/or subsets of records. Because databases usually contain some fields that are redundant, irrelevant, or unimportant to a given discovery task, focusing on a subset of fields is now common practice [Piatetsky-Shapiro and Matheus, 1991; Hoschka and Klosgen, 1991]. Focusing further on a subset of records, which becomes necessary with larger databases, is achievable by random sampling methods, or by using selection constraints to limit attention to subclasses of records, e.g. selecting the top ten percent of customer records based on spending. Section 3.4 ....
P. Hoschka and W. Klosgen. A support system for interpreting statistical data. In G. Piatetsky-Shapiro and W. Frawley, editors, Knowledge Discovery in Databases, chapter 19, pages 325--345. AAAI/MIT Press, Cambridge, MA, 1991.
....attributes, esp. for hierarchically ordered values, • faster heuristic methods, e.g. for clustering, • more comprehensible results. Following the two orthogonal categorizations above the algorithms in Figure 1 can be classified as follows:

Algorithm  Output  Input  Reference
Explora    1       a      [HK91]
RDT        1       b      [KW92]
Claudien   1       b      [RB93]
C4.5       2       a      [Qui93]
Ripper     2       a      [Coh95]
Cilgg      2       b      [Kie96]
Foil       2       b      [QCJ93]
Cobweb     3       a      [Fis87]

All the mentioned mining algorithms are based on pure main memory processing. The main argument for disk based data mining algorithms (e.g. [MAR96]) is the assumption that ....
P. Hoschka and W. Klosgen. A support system for interpreting statistical data. In Piatetsky-Shapiro and Frawley, editors, Knowledge Discovery in Databases. 1991.
....discovery algorithm to apply and how to evaluate and interpret the result. KDW is ideal for exploratory data analysis by a user knowledgeable in both data and operation of discovery tools. However such heavy reliance on the user has given the system a low ranking on the autonomy scale. Explora [4, 69] is another KDD system that incorporates a variety of search strategies to adapt discovery processes to the requirements of applications. It operates by performing a graph search through a network of patterns, searching for instances of interesting patterns. Interestingness is evaluated locally by ....
P. Hoschka and W. Klosgen, "A support system for interpreting statistical data," in Knowledge Discovery in Databases (G. Piatetsky-Shapiro and W. J. Frawley, eds.), pp. 325--345, Cambridge, MA: AAAI/MIT, 1991.
....With the exception of these rather general studies, the KDD research has pretty much concentrated on the pattern discovery phase of the process. Something can be said, however, of the processes supported by various KDD systems, none of which supports the methodology proposed here. Explora [11, 13] finds interesting instances of statistical patterns. In Explora, the pattern discovery phase is focused by the user. The system selects and presents the best patterns to the user, and, based on the results, the user can change the focus and repeat the pattern discovery. The patterns discovered by ....
Peter Hoschka and Willi Klosgen. A support system for interpreting statistical data. In Gregory Piatetsky-Shapiro and William J. Frawley, editors, Knowledge Discovery in Databases, pages 325 -- 345. AAAI Press, Menlo Park, CA, 1991.
....discovery algorithm to apply and how to evaluate and interpret the result. KDW is ideal for exploratory data analysis by a user knowledgeable in both data and operation of discovery tools. However such heavy reliance on the user has given the system a low ranking on the autonomy scale. Explora [4, 71] is another KDD system that incorporates a variety of search strategies to adapt discovery processes to the requirements of applications. It operates by performing a graph search through a network of patterns, searching for instances of interesting patterns. Interestingness is evaluated locally by ....
P. Hoschka and W. Klosgen, "A support system for interpreting statistical data," in Knowledge Discovery in Databases (G. Piatetsky-Shapiro and W. J. Frawley, eds.), pp. 325--345, Cambridge, MA: AAAI/MIT, 1991.
....strength and (statistical) significance abound, it is much harder to know which of the discovered rules really interest the user. Of course, this problem is quite hard. The issue of interestingness of discovered knowledge has been discussed in general by Piatetsky-Shapiro [12]. Hoschka and Klosgen [6] have also used templates for defining interesting knowledge, and their ideas have strongly influenced our work. Their approach is based on few fixed statement types and partial ordering of attributes, whereas our approach is closer to regular expressions. Han et al. [5] present an ....
....The problem we consider in this paper is essentially how to find the most interesting rules. We suggest that this is done by giving the user a possibility to specify classes of both interesting and uninteresting rules. Hoschka and Klosgen deal with the problem of redundancy in their Explora system [6]. It uses partial orderings of attributes and attribute sets to avoid presenting several types of redundant knowledge. However, the two parameters of association rules, confidence and support, make it more difficult to define sensible limits and semantics for redundancy. We also consider the ....
Peter Hoschka and Willi Klosgen. A support system for interpreting statistical data. In Gregory Piatetsky-Shapiro and William J. Frawley, editors, Knowledge Discovery in Databases, pages 325 -- 345. AAAI Press / The MIT Press, Menlo Park, CA, 1991.
.... Data Mining algorithms are attribute based, Inductive Logic Programming approaches fall into the second category of relational approaches (Kietz 1996). Following the two orthogonal categorizations above the algorithms in Figure 1 can be classified as follows:

Tool            O  I  Reference
SIDOS, Explora  1  a  (Hoschka & Klosgen 1991)
MIDOS           1  b  (Wrobel 1997)
Claudien        1  b  (De Raedt & Bruynooghe 1993)
C4.5            2  a  (Quinlan 1993)
Ripper          2  a  (Cohen 1995)
Cilgg           2  b  (Kietz 1996)
Foil            2  b  (Quinlan & Cameron-Jones 1993)
Cobweb          3  a  (Fisher 1987)

Table content TableName.KeyName lifeinsurance policy vvert.vvid ....
Hoschka, P., and Klosgen, W. 1991. A support system for interpreting statistical data. In Piatetsky-Shapiro, and Frawley, eds., Knowledge Discovery in Databases.
R. Hoschka and W. Klösgen, A support system for interpreting statistical data, in: Knowledge Discovery in Databases, 1991, pp. 325--345.
P. Hoschka, W. Klösgen, A support system for interpreting statistical data, in: G. Piatetsky-Shapiro, W.J. Frawley (Eds.), Knowledge Discovery in Databases, AAAI Press, Menlo Park, CA, 1991, pp. 325.
P. Hoschka and W. Klosgen. A support system for interpreting statistical data. In Gregory Piatetsky-Shapiro and William J. Frawley, editors, Knowledge Discovery in Databases, pages 325 -- 345. AAAI Press, Menlo Park, CA, 1991.
Peter Hoschka and Willi Klosgen, 1991. A support system for interpreting statistical data. In Gregory Piatetsky-Shapiro and William J. Frawley, editors, Knowledge Discovery in Databases. MIT Press.