
## Applying General Bayesian Techniques to Improve TAN Induction (1999)

### Download Links

- [wai.maia.ub.es]
- [www.ubilab.org]
- DBLP

### Other Repositories/Bibliography

Venue: In Proceedings of the International Conference on Knowledge Discovery and Data Mining

Citations: 5 (3 self)

### Citations

3469 | UCI repository of machine learning databases - Blake, Merz - 1998

Citation Context: ...mber of observations increases, a good λ adjustment can seriously improve the quality of the prediction. 5.2 Experimental setting We tested five algorithms over 14 datasets from the Irvine repository [2] plus our own credit screening database. The dataset characteristics are described in Table 1. To discretize continuous attributes we tried maximum entropy [7] discretization and equal frequency discr...
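The equal-frequency discretization with 5 intervals mentioned in this context can be sketched as follows (an illustrative helper, not the authors' code; the function name and toy data are assumptions):

```python
def equal_frequency_bins(values, k=5):
    """Split numeric values into k bins holding roughly equal numbers of
    observations (equal-frequency discretization, as in the experiments)."""
    ordered = sorted(values)
    n = len(ordered)
    # Cut points are taken at the 1/k, 2/k, ... quantiles.
    cuts = [ordered[(i * n) // k] for i in range(1, k)]

    def discretize(x):
        # Index of the interval containing x.
        return sum(1 for c in cuts if x >= c)

    return discretize

disc = equal_frequency_bins([1.2, 3.4, 0.5, 7.8, 2.2, 9.9, 4.4, 5.0, 6.1, 8.3], k=5)
print(disc(0.5), disc(9.9))  # smallest value -> bin 0, largest -> bin 4
```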

2214 | On information and sufficiency - Kullback, Leibler - 1951

1567 | Wrappers for feature subset selection - Kohavi, John - 1997

Citation Context: ...n these facts, the general idea is that if we somehow relax the assumptions that are made and keep the “way of reasoning”, we can get a more accurate classifier. This has been tried in different ways [9, 13, 14, 15, 18, 21]. From our point of view TAN are the more coherent and best performing enhancement to Naive Bayes up to now. In this section we discuss the TAN induction algorithm presented at [9]. After that we appl...

832 | Multi-interval discretization of continuous-valued attributes for classification learning - Fayyad, Irani - 1993

Citation Context: ...r 14 datasets from the Irvine repository [2] plus our own credit screening database. The dataset characteristics are described in Table 1. To discretize continuous attributes we tried maximum entropy [7] discretization and equal frequency discretization with 5 intervals. We present the results for equal frequency because it provided better accuracy. For each dataset and algorithm we tested both accur...

818 | On the optimality of the simple Bayesian classifier under zero-one loss - Domingos, Pazzani - 1997

Citation Context: ... to learning bayesian networks can be applied to TAN induction. 3 Tree Augmented Naive Bayes Tree Augmented Naive Bayes (TAN) appears as a natural extension to the Naive Bayes classifier. Naive Bayes [16, 19, 6] is a very simple classifier that performs very well on small and not-so-small datasets. The assumption made by Naive Bayes is that all the attributes in the dataset are conditionally independent give...
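The conditional-independence assumption quoted in this context can be made concrete with a short sketch (a toy illustration, not the paper's implementation; the data, names, and add-λ smoothing details are assumptions):

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels, lam=1.0):
    """Estimate P(c) and P(x_i | c) from counts. Under the Naive Bayes
    assumption the attributes are treated as conditionally independent
    given the class, so the joint likelihood factorizes per attribute."""
    classes = Counter(labels)
    # cond[(i, c)] counts values of attribute i among rows of class c
    cond = defaultdict(Counter)
    for row, c in zip(rows, labels):
        for i, v in enumerate(row):
            cond[(i, c)][v] += 1

    def predict(row):
        best, best_score = None, float("-inf")
        for c, nc in classes.items():
            # log P(c) + sum_i log P(x_i | c), with add-lambda smoothing
            score = math.log(nc / len(labels))
            for i, v in enumerate(row):
                seen = cond[(i, c)]
                vocab = len(set(seen) | {v})
                score += math.log((seen[v] + lam) / (nc + lam * vocab))
            if score > best_score:
                best, best_score = c, score
        return best

    return predict

predict = train_naive_bayes(
    [("sunny", "hot"), ("sunny", "mild"), ("rain", "mild"), ("rain", "cool")],
    ["no", "no", "yes", "yes"],
)
print(predict(("rain", "mild")))  # -> "yes"
```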

796 | Bayesian network classifiers - Friedman, Geiger, et al. - 1997

Citation Context: ...Ubilab UBS AG Bahnhofstrasse 45 P.O. Box, CH-8098 Zürich Jesus.Cerquides@ubs.com Abstract Tree Augmented Naive Bayes (TAN) has shown to be competitive with state-of-the-art machine learning algorithms [9]. However, the TAN induction algorithm that appears in [9] can be improved in several ways. In this paper we identify three weak points in it and introduce two ideas to overcome those problems: the mu...

144 | Learning Classification Trees - Buntine - 1992

Citation Context: ... In [16] BMA is applied to Naive Bayes, and it is shown that it improves both classification accuracy and the quality of the probability estimates. In [1, 5], it is applied to rule induction and in [3] to decision tree induction, in both cases leading to good results. 4.2 Local Bayesian Model Averaging In practice, the usage of BMA presents some problems, coming from: • The computational cost of ca...

129 | The cognition of inductive methods - Carnap - 1952

Citation Context: ...on for P*_S (i.e. the posterior probability after having seen the sample S):

P*_S(x_1, ..., x_n) = (Count_S(x_1, ..., x_n) + λ / States(P*)) / (N + λ)

For readers familiar with the work of Rudolf Carnap [4], what we have done is just setting a Carnapian system as it is described in [8]. In the next section we will see how this approach to learning bayesian networks can be applied to TAN induction. 3 Tree...
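The λ-smoothed multinomial estimate described in this context, with λ virtual observations spread uniformly over the possible states, can be sketched as follows (illustrative; the function and variable names are not from the paper):

```python
def smoothed_estimate(counts, lam=1.0):
    """Posterior estimate P*_S(x) = (Count_S(x) + lam/States) / (N + lam):
    lam 'virtual' observations spread uniformly over the states, in the
    spirit of the Carnapian setting referred to in the context above."""
    n = sum(counts.values())        # N: number of observations
    states = len(counts)            # number of possible states
    return {x: (c + lam / states) / (n + lam) for x, c in counts.items()}

p = smoothed_estimate({"a": 3, "b": 1, "c": 0}, lam=1.0)
print(p)  # probabilities sum to 1; unseen state "c" still gets mass
```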

123 | Semi-naive Bayesian classifier - Kononenko - 1991

Citation Context: ...n these facts, the general idea is that if we somehow relax the assumptions that are made and keep the “way of reasoning”, we can get a more accurate classifier. This has been tried in different ways [9, 13, 14, 15, 18, 21]. From our point of view TAN are the more coherent and best performing enhancement to Naive Bayes up to now. In this section we discuss the TAN induction algorithm presented at [9]. After that we appl...

71 | Learning augmented Bayesian classifiers: A comparison of distribution-based and classification-based approaches - Keogh, Pazzani - 1999

Citation Context: ...n these facts, the general idea is that if we somehow relax the assumptions that are made and keep the “way of reasoning”, we can get a more accurate classifier. This has been tried in different ways [9, 13, 14, 15, 18, 21]. From our point of view TAN are the more coherent and best performing enhancement to Naive Bayes up to now. In this section we discuss the TAN induction algorithm presented at [9]. After that we appl...

66 | BMA: Bayesian model averaging - Raftery, Hoeting, et al. - 2005

Citation Context: ...explained in Section 5. 4 Local Bayesian Model Averaging The second weak point in the TAN induction algorithm of [9] is that they ignore uncertainty in model selection. Bayesian model averaging (BMA) [12] provides a coherent...

    procedure Learn-TAN (Dataset D)
    var ProbabilityDistribution P*_S
        DirectedGraph TAN;
    begin
        Calculate P*_S by using Equation 5
        TAN = Construct-TAN(P*_S)
        Set the weig...
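The idea referenced here, averaging over several candidate models weighted by their posterior probability rather than committing to one selected model, can be sketched generically (a toy illustration; the models, weights, and names are assumptions, not the paper's Learn-TAN):

```python
def bma_predict(models, weights, x):
    """Combine the class distributions of several candidate models,
    each weighted by its (approximate, unnormalized) posterior
    probability, instead of trusting a single selected model."""
    total = sum(weights)
    combined = {}
    for model, w in zip(models, weights):
        for c, p in model(x).items():
            combined[c] = combined.get(c, 0.0) + (w / total) * p
    return combined

# Two toy "models" returning fixed class distributions for any input:
m1 = lambda x: {"yes": 0.9, "no": 0.1}
m2 = lambda x: {"yes": 0.2, "no": 0.8}
print(bma_predict([m1, m2], [3.0, 1.0], x=None))
```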

52 | Model selection and accounting for model uncertainty in linear regression models - Raftery, Madigan, et al. - 1997

38 | Efficient Bayesian parameter estimation in large discrete domains - Friedman, Singer - 1999

Citation Context: ... approach estimation. There are some other interesting choices for priors. Concretely it will be interesting to see how a prior designed to deal with a large number of states, as the one developed in [10], affects the performance of the algorithm. It should also be studied how the different values of λ affect the performance and whether one can develop methods to adjust λ automatically. 7 Acknowledgem...

33 | Two algorithms for generating weighted spanning trees in order - Gabow - 1977

Citation Context: ...ures will be given by the algorithm Construct-TAN, just modifying the step where a maximum spanning tree is induced, to generate a set containing the K maximum spanning trees by using Gabow's algorithm [11]. In order to calculate P′(M), we set a prior over tree structures that assigns the same probability to each possible tree structure (since they can be considered of a similar complexity). We also h...
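The maximum-spanning-tree step that Construct-TAN modifies can be illustrated with Kruskal's method run in decreasing weight order (a sketch; in the actual algorithm the edge weights are conditional mutual informations between attribute pairs, and Gabow's algorithm generalizes this step to the K best trees). The node names and weights below are invented:

```python
def maximum_spanning_tree(nodes, edges):
    """Kruskal's algorithm for a MAXIMUM spanning tree: take edges in
    decreasing weight order, skipping any edge that would close a cycle
    (tracked with a simple union-find over components)."""
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path compression
            v = parent[v]
        return v

    tree = []
    for u, v, w in sorted(edges, key=lambda e: -e[2]):
        ru, rv = find(u), find(v)
        if ru != rv:          # joins two components, so no cycle
            parent[ru] = rv
            tree.append((u, v, w))
    return tree

edges = [("A", "B", 0.9), ("B", "C", 0.8), ("A", "C", 0.7), ("C", "D", 0.4)]
print(maximum_spanning_tree(["A", "B", "C", "D"], edges))
```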

9 | Bayes Optimal Instance-Based Learning - Kontkanen, Myllymaki, et al. - 1998

Citation Context: ... to learning bayesian networks can be applied to TAN induction. 3 Tree Augmented Naive Bayes Tree Augmented Naive Bayes (TAN) appears as a natural extension to the Naive Bayes classifier. Naive Bayes [16, 19, 6] is a very simple classifier that performs very well on small and not-so-small datasets. The assumption made by Naive Bayes is that all the attributes in the dataset are conditionally independent give...

6 | Learning multiple relational rule-based models - Ali, Pazzani - 1995

Citation Context: ...m or another in the machine learning community. In [16] BMA is applied to Naive Bayes, and it is shown that it improves both classification accuracy and the quality of the probability estimates. In [1, 5], it is applied to rule induction and in [3] to decision tree induction, in both cases leading to good results. 4.2 Local Bayesian Model Averaging In practice, the usage of BMA presents some problems,...

3 | Optimum Inductive Methods: A Study in Inductive Probability Theory, Bayesian Statistics and Verisimilitude - Festa - 1993

Citation Context: ... P*_S and will allow us to predict for unseen examples. We need a way to calculate P*_S. There is a lot of literature dedicated to the problem of multinomial sampling. A good reference for this is [8]. For the purposes of this paper we adhere to the principle of indifference, which says that if we lack better information, we should assign an equal probability to each possible success. Our prior is...

1 | Bayesian model averaging in rule induction - Domingos - 1997

Citation Context: ...m or another in the machine learning community. In [16] BMA is applied to Naive Bayes, and it is shown that it improves both classification accuracy and the quality of the probability estimates. In [1, 5], it is applied to rule induction and in [3] to decision tree induction, in both cases leading to good results. 4.2 Local Bayesian Model Averaging In practice, the usage of BMA presents some problems,...