27 citations found.
Langley, P. (1993). Induction of recursive bayesian classifiers. Proceedings of the European Conference on Machine Learning. Lecture Notes in Computer Science, 667:153--164.


This paper is cited in the following contexts:


Finite Mixture Model of Bounded Semi-Naive Bayesian Networks .. - Huang, King, Lyu

....WORKS Since the invention of NB, many BN classifiers have been developed, including restricted types and unrestricted types. Among the restricted BN classifiers are Semi-Naive Bayesian networks classifiers [17, 25], Selective Naive Bayesian network classifiers [21], Recursive Bayesian classifier [19], Tree Augmented Naive Bayesian networks classifiers [11], Limited Bayesian network classifiers [30], and Adjusted probability Naive Bayesian networks classifiers [37]. And for inducing unrestricted Bayesian network classifiers, K2 [5] is a popular algorithm. Since our focus is in the restricted ....

....Classifier to eliminate dependent attributes from NB [21]. Different with Pazzani's approach, they threw away one of the two dependent attributes while Pazzani joined these two dependent attributes. This model's performance is shown to be slightly worse than the Pazzani's approach. Langley [19] proposed a Recursive Bayesian classifiers to adapt NB into some non-linearly separable problems. However it did not provide a significant benefit on naturally occurring databases [25]. Friedman et al. [11] developed so-called Tree Augmented Naive Bayesian (TAN) network classifier which ....

P. Langley. Induction of recursive bayesian classifiers. In Proceedings of the 1993.


Lazy Learning of Bayesian Rules - Zheng, Webb   (5 citations)

.... 1991), attribute deletion (Langley & Sage, 1994), the constructive Bayesian classifier (Bsej) (Pazzani, 1996), probability adjustment (Webb & Pazzani, 1998), Bayesian networks (Friedman & Goldszmidt, 1996; Sahami, 1996; Singh & Provan, 1995; 1996), the recursive Bayesian classifier (Rbc) (Langley, 1993), and the naive Bayesian tree learner (NBTree) (Kohavi, 1996). These studies have shown that it is possible to improve upon the general error performance of the naive Bayesian classifier, although Domingos and Pazzani (1996) argue that the naive Bayesian classifier is still in fact optimal when ....

....without attribute independencies. Singh and Provan (1995; 1996) investigate the forward sequential attribute subset selection method for learning Bayesian networks and compare it with conventional Bayesian networks, the naive Bayesian classifier, and the selective naive Bayesian classifier. Rbc (Langley, 1993) alleviates the attribute inter-dependence problem of naive Bayesian classification by identifying regions of the instance space in which the independence assumption holds. It recursively splits the instance space into subspaces using a tree structure. Each internal node of the tree is a naive ....

[Article contains additional citation context not shown here]

Langley, P. (1993). Induction of recursive Bayesian classifiers. Proceedings of the European Conference on Machine Learning (pp. 153-164). Berlin: Springer-Verlag.


On the Optimality of the Simple Bayesian Classifier under.. - Pazzani (1997)   (107 citations)

....is substantially less accurate than decision tree learners. The simple Bayesian classifier is limited in expressiveness in that it can only create linear frontiers (Duda & Hart, 1973). Therefore, even with many training examples and no noise, it does not approach 100% accuracy on some problems. Langley (1993) proposed the use of recursive Bayesian classifiers to address this limitation. In his approach, the instance space is recursively divided into subregions by a hierarchical clustering process, and a Bayesian classifier is induced for each region. Although the algorithm worked on an artificial ....

Langley, P. (1993). Induction of recursive Bayesian classifiers. Proceedings of the Eighth European Conference on Machine Learning (pp. 153--164). Vienna, Austria: Springer-Verlag.
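
As a hedged illustration of the limitation described in the Pazzani (1997) excerpt above, the following minimal sketch trains a count-based naive Bayesian classifier on the four noise-free XOR examples. The function names `train_naive_bayes` and `class_scores` are hypothetical, and the XOR data set is an assumed toy example, not one taken from the cited papers. Every estimated conditional probability comes out as 0.5, so both classes receive the same score on every instance and the classifier cannot separate them.

```python
from collections import Counter, defaultdict

def train_naive_bayes(examples):
    """Collect class counts and per-attribute value counts from (attributes, label) pairs."""
    class_counts = Counter()
    value_counts = defaultdict(Counter)  # (attribute index, label) -> Counter of values
    for attributes, label in examples:
        class_counts[label] += 1
        for index, value in enumerate(attributes):
            value_counts[(index, label)][value] += 1
    return class_counts, value_counts

def class_scores(model, attributes):
    """Return P(c) * prod_k P(a_k | c) for every class c, using raw frequencies."""
    class_counts, value_counts = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total
        for index, value in enumerate(attributes):
            score *= value_counts[(index, label)][value] / count
        scores[label] = score
    return scores

# Assumed toy data: XOR of two Boolean attributes, which is not linearly separable.
xor_examples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
model = train_naive_bayes(xor_examples)
for attributes, _ in xor_examples:
    print(attributes, class_scores(model, attributes))
```

Because the two class scores tie on all four instances, any tie-breaking rule gets at most half of them right, no matter how many copies of the data are supplied.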


Cascade Generalization - Gama, Brazdil (2000)   (4 citations)

....27 is that it provides a single framework, for a collection of different methods. Our method can be related to several paradigms of machine learning. For example there are similarities with multivariate trees (Brodley & Utgoff, 1995), neural networks (Fahlman & Lebiere, 1990), recursive Bayes (Langley, 1993), and multiple models, namely Stacked Generalization (Wolpert, 1992). In our previous work (Gama & Brazdil, 1999) we have presented system Ltree that combines a decision tree with a discriminant function by means of constructive induction. Local Cascade combinations extend this work. In Ltree the ....

P. Langley. Induction of recursive bayesian classifiers. In P.Brazdil, editor, Machine Learning: ECML-93. LNAI 667, Springer Verlag, 1993.


Local Cascade Generalization - Gama (1998)   (3 citations)

.... trees [5, 15]. Any multivariate tree is topologically equivalent to a three-layer inference network [18]. The constructive ability of our system is similar to the Cascade Correlation Learning architecture [11]. Also the final model of CGBtree is related with the recursive naive Bayes presented in [17]. In a previous work [13] we have compared system Ltree, similar to CGLtree, with Oc1 [19] and LMDT [5]. The focus of this paper is on methodologies for combining classifiers. As such, we review other methods that generate and combine multiple models. 4.1 Combining Classifications We can ....

....training set: k = max(1; 2 log(nr. different values)). This heuristic was used in [10] and elsewhere with good overall results. Missing values were treated as another possible value for the attribute. In order to classify a query point, a naive Bayes uses all of the available attributes. Langley [17] refers that naive Bayes relies on an important assumption that the variability of the dataset can be summarized by a single probabilistic description, and that these are sufficient to distinguish between classes. From an analysis of Bias-Variance, this implies that naive Bayes uses a reduced set ....

P. Langley. Induction of recursive bayesian classifiers. In P.Brazdil, editor, Machine Learning: ECML-93. LNAI 667, Springer Verlag, 1993.


Efficient Learning of Selective Bayesian Network Classifiers - Singh, Provan (1995)   (24 citations)

....1 Introduction One of the simplest Bayesian induction methods is the naive Bayesian classifier [8, 9]. Despite its simplicity and the strong assumption that the attributes are conditionally independent given the class variable, the naive Bayesian classifier has been shown to perform remarkably well in some domains. The simplest naive classifier is one which consists of all attributes ....

....probabilities P(c_i) and P(v_j | c_i). We refer to this naive Bayesian classifier (that models all attributes) as naive ALL. Although, its performance is remarkably good given its simplicity, it is typically limited to learning classes that can be separated by a single decision boundary [9], and in domains in which the attributes are correlated given the class variable, its performance can be worse than other approaches which can account for such correlations. Bayesian networks can account for correlations among attributes, so they are a natural extension of the naive approach. The ....

P. Langley. Induction of Recursive Bayesian Classifiers In Proc. European Conf. on Machine Learning, pages 153--164. Springer Verlag, 1993.


Learning Non-Linearly Separable Boolean Functions With Linear.. - Mehran Sahami   (7 citations)

.... irrelevant attribute elimination (Brodley & Utgoff 1992), producing several hyperplanes at each node using different weight updating procedures and selecting the hyperplane which causes the fewest number of incorrect classifications, using Bayesian analysis to determine instance separations (Langley 1992), post-processing of the tree to reduce its size, etc. These modifications are beyond the scope of this paper however, and generally are only fine tunings to the underlying learning architecture which is not changed by them. Creating Networks From LTU Trees The trees which are produced by the ....

Langley, P. 1992. Induction of Recursive Bayesian Classifiers. Forthcoming.


Naive Bayes for Regression - Frank, Trigg, Holmes, Witten (1998)   (6 citations)

....is not that there are no attribute dependences in the data. Despite this, several researchers have tried to improve naive Bayes by deleting redundant attributes (Langley & Sage, 1994; John & Kohavi, 1997) or by extending it to incorporate simple high-order dependencies (Kononenko, 1991; Langley, 1993; Pazzani, 1996; Sahami, 1996; Friedman, Geiger & Goldszmidt, 1997). Domingos and Pazzani (1997) review these approaches in some detail, and conclude that "...attempts to build on [naive Bayes'] success by relaxing the independence assumption have had mixed results." This paper is organized as ....

Langley, P. (1993). Induction of recursive bayesian classifiers. In Proceedings of the 8th European Conference on Machine Learning (pp. 153--164). Vienna: Springer-Verlag.


Learning Bayesian Prototype Trees by Simulated Annealing - Myllymäki, Tirri (1994)

....of a latent hidden random clustering variable, which forms the root of the prototype tree model, and the actual problem domain random variables are the leaves of the tree. The approach resembles the discrete version of the AutoClass model [4], the simple Bayesian classifier model in [14] and [15], and the finite mixture model of multivariate Bernoulli distributions in [10], but unlike these models, our network can be used also for general regression tasks, not only for classification. It is important to notice that by introducing the artificial clustering variable, we do not have to limit ....

Langley, P., Induction of Recursive Bayesian Classifiers. Pp. 153--164 in P.B. Brazdil (ed.), Proc. of ECML-93, European Conference on Machine Learning, Vienna, Austria, April 5--7, 1993 (Springer-Verlag).


MDL Learning of Probabilistic Neural Networks for Discrete.. - Tirri, Myllymäki (1994)

....Figure 2: The probabilistic neural network implementing Bayesian CBR. In our approach we restrict ourselves to discrete problem domains, which allows us to drop all assumptions about the form of the underlying probability distribution. The approach resembles the simple Bayesian classifier in [6] and [7], but unlike these models, our network can be used also for general regression tasks (case adaptation) and not only for classification tasks (case matching). In principle the simple Bayesian classifier model corresponds to the first three layers of the network shown in Fig. 2. In [9] we left ....

P. Langley, Induction of Recursive Bayesian Classifiers. Pp. 153--164 in P.B. Brazdil (ed.), Proc. of ECML-93, European Conference on Machine Learning, Vienna, Austria, April 5--7, 1993 (Springer-Verlag).


Recognition and Exploitation of Contextual Clues via Incremental.. - Widmer (1996)   (13 citations)

....task. A simple Bayesian classifier is a probabilistic classification scheme that uses Bayes' theorem to determine the probability that an instance belongs to a particular class, given the instance's description. A very readable introduction to simple Bayesian classifiers can be found, e.g., in (Langley, 1993). In the following, we assume that examples are described in terms of (discrete) attributes a_i; we will use the term feature for a specific attribute-value combination, notated as a_i : v_ij. Examples are assumed to belong to mutually exclusive classes c_i. Bayes' theorem defines the posterior ....

....two mechanisms are thus quite different, both in intent and in effect. Finally, the capability of quasi-contextual learning discussed in section 4.6 also places MetaL(B) in the vicinity of systems that try to increase the representational power of Bayesian classifiers (e.g., Kononenko, 1991; Langley, 1993). 6 Discussion and directions for further research The main contribution of this paper is to have shown that it is indeed possible for an incremental learner to autonomously detect, during on-line object-level learning, contextual clues in the data if such exist. The key is an operational ....

Langley, P. (1993). Induction of Recursive Bayesian Classifiers. In Proceedings of the European Conference on Machine Learning (ECML-93), Vienna, Austria.


Combining Classifiers by Constructive Induction - Gama (1998)   (3 citations)

....k = min(10; nr. different values). This heuristic was used in [8] and elsewhere with good overall results. Missing values were treated as another possible value for the attribute, both on the training and test data. Naive Bayes uses all the attributes in order to classify a query point. Langley [14] refers that Naive Bayes relies on an important assumption: that the variability of the dataset can be summarized by a single probabilistic description, and that these is sufficient to distinguish between classes. From an analysis of Bias-Variance, this implies that Naive Bayes uses a reduced set ....

Langley P. (1993) "Induction of recursive Bayesian Classifiers", in Machine Learning: ECML-93 Ed. P.Brazdil, LNAI n667, Springer Verlag


Prototype Selection for Composite Nearest Neighbor Classifiers - Skalak (1995)   (10 citations)

.... leaves). Buntine argues that combining multiple trees can be interpreted as an approximate method to compute theoretical values that arise from a Bayesian decision-theoretic model of classification [Buntine, 1991]. Langley has provided a framework for creating a tree of Bayesian classifiers [Langley, 1993]. While recursive partitioning has clear strengths, the research proposed here will focus on systems that have global scope. We discuss the reasons for this design choice in Section 2.5, where some of the disadvantages of recursive partitioning algorithms are discussed. These disadvantages ....

Langley, P. 1993. Induction of Recursive Bayesian Classifiers. In Proceedings of the 1993 European Conference on Machine Learning, Berlin. Springer Verlag. 153--164.


Adjusted Probability Naive Bayesian Induction - Webb, Pazzani (1998)   (2 citations)

.... manipulate the attributes to be employed prior to application of naive Bayesian induction (Kononenko, 1991; Langley & Sage, 1994; Pazzani, 1996) and those that select subsets of the training examples prior to the application of naive Bayesian classification to an individual case (Kohavi, 1996; Langley, 1993). This paper presents an alternative approach that seeks instead to adjust the probabilities produced by a standard naive Bayesian classifier in order to accommodate violations of the assumptions on which it is founded. 2 Adjusted Probability Semi-Naive Bayesian Induction The naive Bayesian ....

Langley, P. (1993). Induction of recursive Bayesian classifiers. In Proceedings of the 1993 European Conference on Machine Learning, pp. 153--164, Vienna. Springer-Verlag.


Searching for Dependencies in Bayesian Classifiers - Pazzani (1996)   (26 citations)

....from data, since the probabilities P(C_i) and P(A_k = V_kj | C_i) may be estimated from the training data. To determine the most likely class of a test example, the probability of each class is computed with Equation 1. A classifier created in this manner is sometimes called a simple (Langley, 1993) or naive (Kononenko, 1990) Bayesian classifier. One important evaluation metric for machine learning methods is the predictive accuracy on unseen examples. This is measured by randomly selecting a subset of the examples in a database to use as training examples and reserving the remainder to be ....

....the training data using the attributes joined as indicated. Leave-one-out evaluation is used because it allows a single Bayesian classifier to be constructed on the entire training set. To classify a training example, the contribution of that example to the probability estimates is subtracted out (Langley, 1993). Each algorithm considers a set of possible operations (such as joining two attributes) and selects the operation that most improves the accuracy of the classifier as measured by leave-one-out cross validation. The two algorithms differ in how they create an initial Bayesian classifier and the ....

[Article contains additional citation context not shown here]

Langley, P. (1993). Induction of recursive Bayesian classifiers. Proceedings of the 1993 European Conference on Machine Learning. (pp. 153-164). Vienna: Springer-Verlag.
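
The leave-one-out shortcut mentioned in the Pazzani (1996) excerpt above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the cited authors' implementation: the names `build_counts`, `class_scores`, and `leave_one_out_accuracy` are hypothetical, probabilities are raw frequencies without smoothing, ties are broken by class insertion order, and the toy data is invented. The counts are built once on the whole training set, and each training example is then scored after its own contribution is subtracted.

```python
from collections import Counter, defaultdict

def build_counts(examples):
    """Count classes and attribute values over all (attributes, label) pairs."""
    class_counts = Counter()
    value_counts = defaultdict(Counter)  # (attribute index, label) -> Counter of values
    for attributes, label in examples:
        class_counts[label] += 1
        for index, value in enumerate(attributes):
            value_counts[(index, label)][value] += 1
    return class_counts, value_counts

def class_scores(class_counts, value_counts, attributes, held_out_label=None):
    """Score each class as P(c) * prod_k P(a_k | c).

    If held_out_label is given, the instance is treated as a training example
    of that class and its own contribution is subtracted from every count.
    """
    removed_total = 0 if held_out_label is None else 1
    total = sum(class_counts.values()) - removed_total
    scores = {}
    for label, count in class_counts.items():
        removed = 1 if label == held_out_label else 0
        remaining = count - removed
        if remaining == 0:
            scores[label] = 0.0  # the class has no other training examples
            continue
        score = remaining / total
        for index, value in enumerate(attributes):
            score *= (value_counts[(index, label)][value] - removed) / remaining
        scores[label] = score
    return scores

def leave_one_out_accuracy(examples):
    """Estimate accuracy with a single set of counts built on all the data."""
    class_counts, value_counts = build_counts(examples)
    correct = 0
    for attributes, label in examples:
        scores = class_scores(class_counts, value_counts, attributes, held_out_label=label)
        if max(scores, key=scores.get) == label:
            correct += 1
    return correct / len(examples)

# Assumed toy data with two Boolean attributes and two classes.
toy_examples = [
    ((1, 1), "a"), ((1, 1), "a"), ((1, 0), "a"),
    ((0, 0), "b"), ((0, 0), "b"), ((0, 1), "b"),
]
print(leave_one_out_accuracy(toy_examples))
```

Subtracting counts gives the same scores as retraining on the remaining examples, but it avoids rebuilding the classifier once per training example.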


Local Cascade Generalization - Gama (1998)   (3 citations)

....allow the construction of a CGTree whose internal nodes are trees generated by C4.5. 4 Related Work With respect to the final model, there are clear similarities between CGLtree and Multivariate trees [5]. Also the final model of CGBtree is related with the recursive naive Bayes presented in [15]. In a previous work [12] we have compared system Ltree, similar to CGLtree, with Oc1 [16] and LMDT [5]. The focus of this paper is on methodologies for combining classifiers. As such, we review other methods that generate and combine multiple models. 4.1 Combining Classifications We can consider ....

....on the training set: k = min(10; nr. different values). This heuristic was used in [9] and elsewhere with good overall results. Missing values were treated as another possible value for the attribute. In order to classify a query point, a Naive Bayes uses all of the available attributes. Langley [15] refers that Naive Bayes relies on an important assumption: that the variability of the dataset can be summarized by a single probabilistic description, and that these are sufficient to distinguish between classes. [Footnote 4: See [10] for a good discussion on the theme.] From an analysis of ....

P. Langley. Induction of recursive bayesian classifiers. In P.Brazdil, editor, Machine Learning: ECML-93. LNAI 667, Springer Verlag, 1993.


A Comparison of Induction Algorithms for Selective and.. - Singh, al. (1995)   (16 citations)

....to be an important class of algorithms that perform competitively with other well-known induction techniques such as decision trees and neural networks. Within the machine learning community, the most widely studied Bayesian induction method is the naive Bayesian classifier (Kononenko, 1990; Langley, 1993). Despite its simplicity and the strong conditional independence assumptions it makes, the naive Bayesian classifier performs remarkably well. Within the Bayesian Artificial Intelligence community, the best known Bayesian representation is the Bayesian network (Pearl, 1988). Induction algorithms ....

....estimate P(c_k) and P(z_j | c_k). We refer to this naive Bayesian classifier (that models all attributes) as naive ALL. Although its performance is remarkably good given its simplicity, this classifier is typically limited to learning classes that can be separated by a single decision boundary (Langley, 1993), and in domains in which the attributes are correlated given the class variable, its performance can be worse than other approaches which can account for such correlations. Bayesian networks can account for correlations among attributes, so they are a natural extension of the naive approach. The ....

Langley, P. (1993). Induction of Recursive Bayesian Classifiers In Proc. European Conf. on Machine Learning, 153--164. Springer Verlag.


Learning in neural networks with Bayesian prototypes - Myllymäki, Tirri (1994)

....Modeling the probability distribution by a set of Bayesian prototype vectors, allows us to use the theory of Bayesian networks [14, 12] to construct a tree-structured Bayesian network representation of the probability distribution. The approach resembles the simple Bayesian model of [5] and [6], but unlike these models, our network can be used also for general regression tasks (adaptation) and not only for classification tasks (prototype matching). What is more, these tasks can be performed efficiently on a massively parallel neural network architecture, as there is a direct mapping ....

.... vectors and prior probabilities, can be used for constructing a multi-layer feedforward neural network which models the problem domain probability distribution (see Figure 3). The first three layers of the network perform prototype matching, corresponding to the simple Bayesian classifier model in [5, 6], and the last three layers perform Bayesian adaptation. For more details of the functioning of the model, see [10, 11]. III. Evaluating prototype trees Let M denote a Bayesian prototype tree constructed as described in the previous section using a data set D. Assuming that the training set D ....

P. Langley, Induction of Recursive Bayesian Classifiers. Pp. 153--164 in P.B. Brazdil (ed.), Proc. of ECML-93, European Conference on Machine Learning, Vienna, Austria, April 5--7, 1993 (Springer-Verlag).


Improving Simple Bayes - Kohavi, Becker, Sommerfield (1997)   (17 citations)

....here) showed insignificant differences in accuracy between the two methods. Many researchers have noted the good performance of SBC, including Clark and Niblett [4], Kononenko [17], Langley and Sage [19], and Domingos and Pazzani [5]. Proposed extensions generally resulted in little improvements [16, 18, 22], although some recent proposals seem promising [9, 13]. 5 Summary We studied different options for handling unknowns, estimating probabilities, and discretizing. Through a large-scale comparison of 37 datasets, we were able to pinpoint interesting datasets where error differences were ....

Pat Langley. Induction of recursive bayesian classifiers. In Proceedings of the European Conference on Machine Learning, pages 153--164, April 1993.


Relevance and Insight in Experimental Studies - Pat Langley (1996)   (2 citations)  Self-citation (Langley)

No context found.

P. Langley, "Induction of Recursive Bayesian Classifiers," Proc. 1993 European Conf. on Machine Learning, Springer-Verlag, Vienna, 1993, pp. 153-164.


Induction of Selective Bayesian Classifiers - Langley, Sage (1994)   (74 citations)  Self-citation (Langley)

....one assume some general form or model, with a common choice being the normal distribution, which can be conveniently represented entirely in terms of its mean and variance. [Footnote 1: We have borrowed this term from Kononenko (1990); other common names for the method include the simple Bayesian classifier (Langley, 1993) and idiot Bayes (Buntine, 1990).] To classify a new instance I, a naive Bayesian classifier applies Bayes' theorem to determine the probability of each description given the instance, p(C_i | I) = p(C_i) p(I | C_i) / p(I). However, since I is a conjunction of ....

....complex story, but the effect is nearly the same. Nevertheless, like perceptrons, Bayesian classifiers are typically limited to learning classes that can be separated by a single decision boundary. Although we have addressed this limitation in other work (Langley, 1993), we will not focus on it here. Another important assumption that the naive Bayesian classifier makes is that, within each class, the probability distributions for attributes are independent of each other. One can model attribute dependence within the Bayesian framework (Pearl, 1988), but ....

[Article contains additional citation context not shown here]

Langley, P. (1993). Induction of recursive Bayesian classifiers. Proceedings of the 1993 European Conference on Machine Learning (pp. 153--164). Vienna: Springer-Verlag.


Relevance and Insight in Experimental Studies - Pat Langley (1996)   (2 citations)  Self-citation (Langley)

....training data to estimate the conditional probabilities of attribute values given the class. Because naive Bayes assumes that each attribute is conditionally independent, given the class, it would seem easy to improve upon by using more sophisticated methods. However, both Kononenko (1991) and Langley (1993) report little or no improvement with extensions to naive Bayes on a number of real-world data sets. Their studies, although giving negative results, were relevant in that they tested their intuitions on natural domains. However, experimental studies on natural domains alone do not satisfy our ....

....for results, and thus do not provide the understanding about causes that we expect in science. Insight is best obtained by running experiments on synthetic domains that have been designed to test explicit hypotheses, typically motivated by the intuitions behind the original extension. For example, Langley (1993) reports experiments on synthetic domains that involve target concepts with disjoint decision regions, which violate another assumption made by naive Bayes. The importance of using synthetic domains is not because they let one generate some new task, but because they let one vary systematically ....

Langley, P. (1993). Induction of recursive Bayesian classifiers. Proceedings of the 1993 European Conference on Machine Learning (pp. 153--164). Vienna: Springer-Verlag.


Induction of Selective Bayesian Network Classifiers - Singh, al. (1996)   (1 citation)  Self-citation (Langley)

....that are simpler to evaluate, but that still have high predictive accuracy relative to networks induced with all features. In addition to comparing our new algorithm against existing methods for learning Bayesian networks, we compare it against the naive Bayesian classifier (Kononenko, 1990; Langley, 1993), one of the most widely studied Bayesian methods within the machine learning community. Despite its simplicity and its strong assumption of conditional independence, the naive Bayesian classifier performs remarkably well, so it is an important reference point for the network classifier. Since a ....

.... assumption that attributes are independent within each class, the naive classifier has been shown to give remarkably high accuracies in many natural domains (Langley et al. 1992). However, this approach is typically limited to learning classes that can be separated by a single decision boundary (Langley, 1993), and it can suffer in domains in which the features are correlated given the class variable. The selective naive Bayesian classifier (Langley & Sage, 1994a) is an extension to the naive Bayesian classifier, and is designed to perform better in domains with redundant features. The intuition is ....

Langley, P. (1993). Induction of recursive Bayesian classifiers. In Proc. European Conf. on Machine Learning, pages 153--164. Springer Verlag.


Edited Naive Bayes - Martínez-Otzeta, Sierra, Lazkano

No context found.

Langley, P. (1993). Induction of recursive bayesian classifiers. Proceedings of the European Conference on Machine Learning. Lecture Notes in Computer Science, 667:153--164.


Averaged One-Dependence Estimators: Preliminary Results - Webb, Boughton, Wang (2002)

No context found.

P. Langley. Induction of recursive Bayesian classifiers. In Proceedings of the 1993.


CiteSeer.IST - Copyright Penn State and NEC