20 citations found.
D. Heckerman. Bayesian networks for data mining. Data Mining and Knowledge Discovery, 1(1):79--119, 1997.

This paper is cited in the following contexts:
Bayesian Data Mining on the Web with B-Course - Myllymäki, Silander, Tirri..   (Correct)

....as a Bayesian network in Figure 1. An important property of Bayesian network models is that the joint probability distribution over the model variables factorizes to a product of conditional probability distributions, one for each variable (see, e.g. the tutorial on Bayesian networks in data mining [6]). This subset of models is interesting, but it has its limitations too. More specifically, if the variables of our model are in causal relationships with each other, and if in our domain there are no latent variables (i.e. variables that for some reason are not included in our data) that have ....
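The factorization property described in the context above can be illustrated with a small sketch. The three-variable network (Rain -> WetGrass <- Sprinkler), the variable names, and all probabilities below are hypothetical and chosen only for illustration; they do not come from the cited papers.

```python
# Hypothetical network: Rain -> WetGrass <- Sprinkler.
# Every number here is an illustrative assumption.
p_rain = {True: 0.2, False: 0.8}        # P(Rain)
p_sprinkler = {True: 0.1, False: 0.9}   # P(Sprinkler)
# P(WetGrass = True | Rain, Sprinkler), keyed by (rain, sprinkler)
p_wet_true = {
    (True, True): 0.99,
    (True, False): 0.9,
    (False, True): 0.8,
    (False, False): 0.0,
}


def joint(rain, sprinkler, wet):
    """Joint probability as a product of one conditional per variable."""
    p_wet = p_wet_true[(rain, sprinkler)]
    if not wet:
        p_wet = 1.0 - p_wet
    return p_rain[rain] * p_sprinkler[sprinkler] * p_wet


# Summing the factorized joint over all assignments should give 1.
values = (True, False)
total = sum(joint(r, s, w) for r in values for s in values for w in values)
```

Because each factor is a proper conditional distribution, the product is automatically a normalized joint distribution, which is what the sum over all assignments checks.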

D. Heckerman. Bayesian networks for data mining. Data Mining and Knowledge Discovery, 1(1):79--119, 1997.


B-Course: A Web Service for Bayesian Data Analysis - Myllymäki, Silander, Tirri..   (Correct)

....of probabilities of the form $\theta_{ijk} = P(X_i = x_k \mid \Pi_i = \pi_j)$, where $\pi_j$ denotes the $j$th value configuration of the parents $\Pi_i$. In the sequel we will assume that the reader is familiar with the basics of Bayesian networks, and refer to the introductory textbooks in the literature (see, e.g. [12]). This subset of models is interesting, but it has its limitations too. More specifically, if the variables of our model are in causal relationships with each other, and if in our domain there are no latent variables (i.e. variables that for some reason are not included in our data) that have ....

D. Heckerman. Bayesian networks for data mining. Data Mining and Knowledge Discovery, 1(1):79-119, 1997.


Bulletin of the Technical Committee on - Engineering March Vol   (Correct)

....on C and X. However, this joint density is rarely known and very difficult to estimate. Hence, one has to resort to various techniques for estimating. These techniques include: 1. Density estimation, e.g. kernel density estimators [7, 22] or graphical representations of the joint density [14]. 2. Metric space based methods: define a distance measure on data points and guess the class value based on proximity to data points in the training set. An example is the K nearest neighbor method [7]. 3. Projection into decision regions: divide the attribute space into decision regions and ....

.... some statement about the probability distribution governing the data) or they can be deterministic as in deriving functional dependencies between fields in the data [20]. Density estimation methods in general fall under this category, as do methods for explicit causal modeling (e.g. [13] and [14]). Change and Deviation Detection: These methods account for sequence information, be it time series or some other ordering (e.g. protein sequencing in genome mapping). The distinguishing feature of this class of methods is that ordering of observations is important. Scalable methods for finding ....

[Article contains additional citation context not shown here]

D. Heckerman. Bayesian networks for data mining. Data Mining and Knowledge Discovery, 1(1), 1997.


Data Mining in Schizophrenia Research - preliminary.. - Arnborg, Agartz, Hall, ..   (Correct)

....As distribution family we take piecewise constant functions, which translates to discretization of the variables. The prior distribution over the family is taken to be a Dirichlet distribution. Then the standard association tests of discrete distributions used e.g. in graphical model learning [5, 14] are applied. An empirical Bayes approach is used, where the granularity is chosen to give a sufficient number of points in each discretization level. Bayesian association models: For a chosen discretization, a variable will be described as an occurrence vector $(n_i)_{i=1}^{d}$, where $d$ is the number ....

David Heckerman. Bayesian networks for data mining. Data Mining and Knowledge Discovery, 1:79--119, 1997.


Statistical Inference and Data Mining - Glymour, Madigan, al. (1996)   (8 citations)  (Correct)

....the previous paragraph [11] However, the statistical decisions made by the algorithms are not really optimal, and the implementations are limited to the multinomial and multinormal families of probability distributions. A review of Bayesian search procedures for causal models is given in [2]. Prediction Sometimes one is interested in using a sample, or a database, to predict properties of a new sample, where it is assumed the two samples are obtained from the same probability distribution. As with estimation, prediction is interested in accuracy and uncertainty, and is often ....

Heckerman, D. Bayesian networks for data mining. Data Mining and Knowledge Discovery, submitted.


Rule Mining with Prior Knowledge - A Belief Networks Approach - Zhou, Liu, Li, Chua (2001)   (Correct)

....to assign to the belief network with structure B s . Currently, the most common way for belief networks to handle continuous variables is to discretize them. Many discretization techniques can be found in [16] There are also many techniques to handle the variables with missing values [1, 7, 8, 11, 12, 25]. For example, in [25] an approach called Bound and Collapse is proposed to deal with the missing data. Assumption 3 is reasonable for many real world databases due to the way in which data is obtained. Assumption 4 is that for the structure B s , the prior distribution density for the conditional ....

D. Heckerman. Bayesian networks for data mining. Data Mining and Knowledge Discovery, 1(1), 1997.


Scalable Techniques for Mining Causal Structures - Silverstein, Brin, Motwani (1998)   (34 citations)  (Correct)

....research in statistics and Bayesian learning communities provide some avenues of attack. Two classes of technique have arisen: Bayesian causal discovery, which focuses on learning complete causal models for small data sets (Balke and Pearl, 1994, Cooper and Herskovits, 1992, Heckerman, 1995, Heckerman, 1997, Heckerman et al. 1994, Heckerman et al. 1997, Pearl, 1994, Pearl, 1995, Spirtes et al. 1993) and an offshoot of the Bayesian learning method called constraint based causal discovery, which use the data to limit sometimes severely the possible causal models (Cooper, 1997, Spirtes et ....

....learning communities provide some avenues of attack. Two classes of technique have arisen: Bayesian causal discovery, which focuses on learning complete causal models for small data sets (Balke and Pearl, 1994, Cooper and Herskovits, 1992, Heckerman, 1995, Heckerman, 1997, Heckerman et al. 1994, Heckerman et al. 1997, Pearl, 1994, Pearl, 1995, Spirtes et al. 1993) and an offshoot of the Bayesian learning method called constraint based causal discovery, which use the data to limit sometimes severely the possible causal models (Cooper, 1997, Spirtes et al. 1993, Pearl and Verma, 1991) While ....

[Article contains additional citation context not shown here]

D. Heckerman. Bayesian networks for data mining. Data Mining and Knowledge Discovery, 1(1997): 79--119.


A Supra-Classifier Framework For Knowledge Reuse - Bollacker (1998)   (Correct)

....very closely. However, in practice it may be quite difficult to use this type of knowledge to select and tune a proper model. Also, standard assumptions (independence among variables, Gaussian distributions, etc.) that are used to make the problem tractable often result in a loss of accuracy [37, 48]. 1.2 Classifier Knowledge Reuse: While it may seem at first that there is no other possible source of knowledge about a classification task than the training set and other, external (a priori) information, consider a previously constructed classifier that has the same input domain (uses the same ....

D. Heckerman. Bayesian networks for data mining. Data Mining and Knowledge Discovery, 1(1):79--120, 1997.


A survey of Bayesian Data Mining - Part I: Discrete and.. - Arnborg (1999)   (Correct)

.... which can solve some complex evaluations required in Bayesian modeling, can be found in the book [17]. Books explaining theory and use of graphical models are Lauritzen [22], Cox and Wermuth [10] and Whittaker [35]. A tutorial on Bayesian network approaches to data mining is found in (Heckermann [18]). This present report describes data mining in a relational data structure with discrete data (discrete data matrix) and the simplest generalizations to numerical data. A second part will describe general real valued data matrices, raster data representing, e.g. scalar and or vector fields, as ....

David Heckerman. Bayesian networks for data mining. Data Mining and Knowledge Discovery, 1:79--119, 1997.


Two-Stage Machine Learning Model for Guideline Development - Mani, Shankle, Dick, Pazzani (1998)   (1 citation)  (Correct)

....will include feature selection to reduce the attribute space by removing any irrelevant attributes, plus examination of item responses instead of just the total scores. Another approach worth pursuing is to capture the interactions among the attributes using a Bayesian network architecture [36]. 4.3 Has the Two Stage Machine Learning method identified a clinically usable model to stage dementia severity The answer is a qualified yes. We have compared the value of the ML methods reported here to the published data on dementia experts scoring the global and category CDRS scores for ....

D. Heckerman. Bayesian networks for data mining. Data Mining and Knowledge Discovery, 1:79-119, 1997.


A Case-Based Retrieval System for Diabetic Patients.. - Montani, Bellazzi..   (Correct)

....given a certain class. Although this approach makes such a strong assumption, it is known to be quite robust in a variety of situations [10, 11]. Moreover, in the same Bayesian context it will be possible to improve the classification performances moving towards a Bayesian Network representation [12]. The probability that a case belongs to class $c_i$ given that the set of its features $\mathbf{f} = \{\mathbf{f}_1, \ldots, \mathbf{f}_M\}$ is $f$ may be calculated as: $P(c_i \mid \mathbf{f} = f) \propto p(c_i) \prod_{j=1}^{M} p(\mathbf{f}_j = f_j \mid c_i)$. The method classifies a case as belonging to the class that maximizes $P(c_i \mid \mathbf{f} = f)$. The ....
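The classification rule quoted above (score each class by its prior times the product of per-feature likelihoods, then pick the class with the highest score) can be sketched as follows. This is a minimal illustration under stated assumptions, not the cited authors' implementation: the function names, class labels, and probability tables are all hypothetical.

```python
from math import prod


def naive_bayes_scores(priors, likelihoods, features):
    """Return the unnormalized score p(c) * prod_j p(f_j | c) for each class.

    priors:      {class: p(class)}
    likelihoods: {class: [ {value: p(feature_j = value | class)}, ... ]}
    features:    observed feature values, one per feature position
    """
    return {
        c: priors[c] * prod(likelihoods[c][j][v] for j, v in enumerate(features))
        for c in priors
    }


def classify(priors, likelihoods, features):
    """Pick the class that maximizes the naive Bayes score."""
    scores = naive_bayes_scores(priors, likelihoods, features)
    return max(scores, key=scores.get)


# Hypothetical two-class, two-feature tables (illustrative values only).
priors = {"a": 0.6, "b": 0.4}
likelihoods = {
    "a": [{0: 0.9, 1: 0.1}, {0: 0.5, 1: 0.5}],
    "b": [{0: 0.2, 1: 0.8}, {0: 0.3, 1: 0.7}],
}
```

The scores are left unnormalized because the normalizing constant is the same for every class and does not change which class attains the maximum.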

D. Heckerman, 'Bayesian networks for data mining', Data Mining and Knowledge Discovery, 1, 79-119, (1997).


Scalable Techniques for Mining Causal Structures - Craig Silverstein (1998)   (34 citations)  (Correct)

....nor possible in most applications of data mining. Fortunately, recent research in statistics and Bayesian learning communities provide some avenues of attack. Two classes of technique have arisen: Bayesian causal discovery, which focuses on learning complete causal models for small data sets [BP94, CH92, H95, H97, HGC94, HMC97, P94, P95, SGS93], and an offshoot of the Bayesian learning method called constraint based causal discovery, which use the data to limit sometimes severely the possible causal models [C97, SGS93, PV91] While techniques in the first class are still not practical on very large data sets, a limited version ....

D. Heckerman. Bayesian networks for data mining. Data Mining and Knowledge Discovery, 1(1997): 79--119.


Scalable Techniques for Mining Causal Structures - Silverstein, Brin, Motwani.. (1998)   (34 citations)  (Correct)

....nor possible in most applications of data mining. Fortunately, recent research in statistics and Bayesian learning communities provide some avenues of attack. Two classes of technique have arisen: Bayesian causal discovery, which focuses on learning complete causal models for small data sets [8, 12, 14, 15, 16, 17, 21, 22, 23, 25, 26, 27], and an offshoot of the Bayesian learning method called constraint based causal discovery, which use the data to limit sometimes severely the possible causal models [11, 26, 24] While techniques in the first class are still not practical on very large data sets, a limited version of the ....

D. Heckerman. Bayesian networks for data mining. Data Mining and Knowledge Discovery, 1(1997): 79--119.


Mathematical Programming for Data Mining: Formulations.. - Bradley, Fayyad.. (1998)   (9 citations)  (Correct)

....corrupted by noise. In general, it should be noted that the problem of trading off the simplicity of a model with how well it fits the training data is a well studied problem. In statistics this is known as the bias variance tradeoff [54], in Bayesian inference it is known as penalized likelihood [13, 63], and in pattern recognition and machine learning it manifests itself as the minimum message length (MML) [123] problem. The MML framework, also called minimum description length (MDL) [103], dictates that the best model for a given data set is one that minimizes the coding length of the data and the ....

....the joint density on Y and X . However, this joint density is rarely known and difficult to estimate. Hence one has to resort to various techniques for estimating this density, including: 1. Density estimation, e.g. kernel density estimators [41] or graphical representations of the joint density [63]. 2. Metric space based methods: define a distance measure on data points and guess the class value based on proximity to data points in the training set. For example, the K nearest neighbor method [41] 3. Projection into decision regions: divide the attribute space into decision regions and ....

[Article contains additional citation context not shown here]

D. Heckerman. Bayesian networks for data mining. Data Mining and Knowledge Discovery, 1(1), 1997.


Bayesian Network Classifiers. - Friedman, Geiger, Goldszmidt (1997)   (124 citations)  Self-citation (Classifiers)   (Correct)

....relied only on five attributes for the class prediction. We base our definition of relevant attributes on the notion of a Markov blanket. The Markov blanket of a variable X consists of X's parents, X's children, and the parents of X's children in a given network structure G (Pearl, 1988). [Figure 2. Error curves (top) and scatter plot (bottom) comparing ....; axes: classification error per dataset for BN and NB (top), NB error against BN error (bottom)]
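The Markov blanket definition quoted above (parents, children, and the children's other parents) can be computed directly from a network structure. The sketch below assumes the structure is given as a dictionary mapping each node to a list of its parents; that representation, the function name, and the example graph are hypothetical.

```python
def markov_blanket(parents, x):
    """Return the Markov blanket of node x in a DAG.

    parents maps every node to an iterable of its parent nodes.
    """
    # Children of x are the nodes that list x among their parents.
    children = {node for node, ps in parents.items() if x in ps}
    # Co-parents ("spouses") are the other parents of those children.
    co_parents = {p for child in children for p in parents[child]}
    return (set(parents.get(x, ())) | children | co_parents) - {x}


# Hypothetical structure: A -> C <- B, C -> D, C -> E <- F.
example_parents = {
    "A": [],
    "B": [],
    "C": ["A", "B"],
    "D": ["C"],
    "E": ["C", "F"],
    "F": [],
}
```

In this example graph the blanket of "C" contains every other node, while the blanket of "A" contains only its child "C" and the co-parent "B".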

....of $\Theta$, weighted by their posterior probability. Thus, $\Pr(X_i = v_i \mid D) = \int \Pr(X_i = v_i \mid \Theta) \Pr(\Theta \mid D)\, d\Theta$. For a particular family of priors, called Dirichlet priors, there is a known closed-form solution for the integral. A Dirichlet prior is specified by two hyperparameters. [Figure 4. Error curves comparing smoothed TAN (solid, x axis) with naive ....; axes: classification error per dataset for TAN and NB, NB error against TAN error]
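The closed-form solution mentioned above can be illustrated for a single discrete variable without parents. The sketch below uses the standard Dirichlet-multinomial result, in which the posterior predictive probability of a value is its observed count plus its prior pseudo-count, divided by the total of both. Specifying the prior as one pseudo-count per value is an assumption made here for simplicity and may differ from the two-hyperparameter form the excerpt refers to; the function name and data are hypothetical.

```python
from collections import Counter


def dirichlet_predictive(observations, alpha):
    """Posterior predictive P(X = v | D) under a Dirichlet prior.

    observations: list of observed values (each must be a key of alpha)
    alpha:        {value: prior pseudo-count}, all positive
    """
    counts = Counter(observations)
    total = len(observations) + sum(alpha.values())
    return {v: (counts[v] + a) / total for v, a in alpha.items()}


# Hypothetical data: three observations, uniform pseudo-counts of 1.
predictive = dirichlet_predictive(["x", "x", "y"], {"x": 1, "y": 1, "z": 1})
```

The unseen value "z" still receives non-zero probability, which is the smoothing effect that the prior provides.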

[Article contains additional citation context not shown here]

Friedman, J. (1997a). On bias, variance, 0/1 - loss, and the curse-of-dimensionality. Data Mining and Knowledge Discovery 1. in press.


Probabilistic User Behavior Models - Eren Manavoglu Department   (Correct)

No context found.

D. Heckerman. Bayesian networks for data mining. Data Mining and Knowledge Discovery, 1(1):79--119, 1997.


Interactive Analysis of Gene Interactions Using Graphical.. - Wu, Ye, Subramanian   (Correct)

No context found.

D. Heckerman. Bayesian networks for data mining. Data Mining and Knowledge Discovery, 1:79--119, 1997.


A Hybrid Anytime Algorithm for the Construction of Causal.. - Dash, Druzdzel (1999)   (2 citations)  (Correct)

No context found.

David Heckerman. Bayesian networks for data mining. Data Mining and Knowledge Discovery, 1(1):79--119, 1998.


Interactive Analysis of Gene Interactions Using Graphical.. - Wu, Ye, Subramanian   (Correct)

No context found.

D. Heckerman. Bayesian networks for data mining. Data Mining and Knowledge Discovery, 1:79--119, 1997.


B-Course: A Web-Based Tool For Bayesian And Causal.. - Myllymäki, Silander.. (2002)   (Correct)

No context found.

D. Heckerman. Bayesian networks for data mining. Data Mining and Knowledge Discovery, 1(1):79--119, 1997.

CiteSeer.IST - Copyright Penn State and NEC