9 citations found.
C. Glymour, D. Madigan, D. Pregibon, and P. Smyth. Statistical Inference and Data Mining. Communications of the ACM, 39(11):35--41, 1996.


This paper is cited in the following contexts:
Statistics and Data Mining: Intersecting Disciplines - Hand (1999)   (2 citations)

....dealt with in the proceedings of the International Conference on Knowledge Discovery and Data Mining series (the two most recent proceedings being [12] and [1]) and the journal Data Mining and Knowledge Discovery. Papers discussing the relationship between statistics and data analysis include [8], [4] and [10] 5. ....

Glymour C., Madigan D., Pregibon D., and Smyth P. (1996) Statistical inference and data mining. Communications of the ACM, 39, 35-41.


Automatic Aggregation using Explicit Metadata - Grumbach, Tininini (2000)

....(OLAP) [GBLP96, HRU96, LS97, Sho97, CD97]. The real challenge of this sort of data is caused by the rather intricate semantics of summary values, that is not handled by classical database systems. The relationships between OLAP and data mining have also been investigated by several authors [GMPS96, Han97, Han98]. The research on statistical databases has been mainly concerned with conceptual modelization. The focus is on the macro data obtained by grouping and aggregating the original micro data. In SDBs, it is generally assumed that the micro data are not available, both for efficiency reasons (the ....

C. Glymour, D. Madigan, D. Pregibon, and P. Smyth. Statistical inference and data mining. Communications of the ACM, 39(11):35--41, 1996.


Clustering for Edge-Cost Minimization - Schulman   (14 citations)

....$= 0 \;\forall u \in T$; hence equivalently $\phi(S) = \frac{1}{2} \sum_{u,v \in S} w_u w_v \phi_{u,v}$. We will also assume throughout that $\phi$ is nonnegative. For general references in the field of clustering see [10, 38, 33, 69, 27, 36, 53, 54, 2, 7]; for discussions of a variety of interesting methods and application areas see [24, 68, 65, 58, 60, 67, 48, 26, 46]. A key role in our method is played by a random sampling process which, given $T$, picks a very small weighted collection of points. We show that for a range of cost functions, the cost of this collection is with high probability close to that of the original collection $T$. Moreover in the case $\phi$ ....

C. Glymour, D. Madigan, D. Pregibon, and P. Smyth. Statistical inference and data mining. Communications of the ACM, 39(11), November 1996.


Formal Logics of Discovery and Hypothesis Formation By Machine - Hájek, Holena

....in particular statistical methods, for the evaluation of hypotheses in that space. Moreover, that similarity goes even further, covering also the main kinds of statistical methods employed for the evaluation, namely statistical hypotheses testing, most often in the context of contingency tables [5, 9, 31, 32, 55, 56]. Scope. GUHA relates, in particular, to mining association rules. Indeed, if $A = \{A_1, \dots, A_m\}$ is the set of binary attributes in a database of size $k$, and if $X, Y \subset A$, $X \cap Y = \emptyset$, then the association rule $X \Rightarrow Y$ is significant in the database (according to [1, 2, 27, 28, 40, 47, 54]) if ....

Glymour, C., Madigan, D., Pregibon, D., and Smyth, P. Statistical inference and data mining. Communications of the ACM 39 (1996), 35--41.


An Evaluation of Data Mining Methods and Tools - Lidal, Dingsøyr

....missing, biased or not applicable? Is the system able to reason with noisy data, or must the data be cleaned? 2.2.2 Consistency Does the system discover inconsistency? Is it able to reason with inconsistency? Is the system trying to hide uncertainty or is it actively using and revealing it [4]? Will the system discover latent attributes? 2.2.3 Prior Knowledge Do we have any prior knowledge of the system that is to be analyzed? Is it in the form of a data dictionary or domain knowledge? Is it extracted automatically? Is the method using metadata to eliminate relations between data that ....

Clark Glymour et al. Statistical inference and data mining. Communications of the ACM, 39(11), 1996.


Data Mining At The Interface Of Computer Science And Statistics - Smyth (2001)   (1 citation)  Self-citation (Smyth)

....a bank for accounting purposes). Thus, issues such as experimental design (the construction of an experiment to collect data to test a specific hypothesis) are not typically within the vocabulary or tool set of a data miner. For other general discussions on statistical aspects of data mining see [EP96, GMPS96, GMPS97, HPS97, Han98, Lam00, Smy00]. 3. A Reductionist View of Data Mining Let us consider a very high level view of data mining and try to reduce a generic data mining algorithm into its component parts. The particular reductionist viewpoint proposed here is not necessarily unique, but it nonetheless does provide some insight ....

Glymour C., Madigan D., Pregibon D., Smyth P. (1996) Statistical inference and data mining, Communications of the ACM, 39(11), 35--41.


Minimum Message Length Inference: Theory and Applications - Baxter (1996)   (2 citations)

No context found.

C. Glymour, D. Madigan, D. Pregibon, and P. Smyth. Statistical Inference and Data Mining. Communications of the ACM, 39(11):35--41, 1996.


Managing Uncertainty and Quality in the Classification Process - Halkidi, Vazirgiannis

No context found.

Glymour C., Madigan D., Pregibon D., Smyth P., "Statistical Inference and Data Mining", in CACM v39 (11), 1996, pp. 35-42


Parallel and Distributed Computing for Data Mining - Zomaya et al. (1999)   (1 citation)

No context found.

C. Glymour et al., "Statistical Inference and Data Mining," Comm. ACM, Vol. 39, No. 11, Nov. 1996, pp. 35--41.

