Results 1  10
of
13
Supervised modelbased visualization of highdimensional data
, 2000
"... When highdimensional data vectors are visualized on a two or threedimensional display, the goal is that two vectors close to each other in the multidimensional space should also be close to each other in the lowdimensional space. Traditionally, closeness is defined in terms of some standard ge ..."
Abstract

Cited by 19 (9 self)
 Add to MetaCart
When highdimensional data vectors are visualized on a two or threedimensional display, the goal is that two vectors close to each other in the multidimensional space should also be close to each other in the lowdimensional space. Traditionally, closeness is defined in terms of some standard geometric distance measure, such as the Euclidean distance, based on a more or less straightforward comparison between the contents of the data vectors. However, such distances do not generally reflect properly the properties of complex problem domains, where changing one bit in a vector may completely change the relevance of the vector. What is more, in realworld situations the similarity of two vectors is not a universal property: even if two vectors can be regarded as similar from one point of view, from another point of view they may appear quite dissimilar. In order to capture these requirements for building a pragmatic and flexible similarity measure, we propose a data visualization scheme where the similarity of two vectors is determined indirectly by using a formal model of the problem domain; in our case, a Bayesian network model. In this scheme, two vectors are considered similar if they lead to similar predictions, when given as input to a Bayesian network model. The scheme is supervised in the sense that different perspectives can be taken into account by using different predictive distributions, i.e., by changing what is to be predicted. In addition, the modeling framework can also be used for validating the rationality of the resulting visualization. This modelbased visualization scheme has been implemented and tested on realworld domains with encouraging results.
NML Computation Algorithms for TreeStructured Multinomial Bayesian Networks
, 2007
"... Typical problems in bioinformatics involve large discrete datasets. Therefore, in order to apply statistical methods in such domains, it is important to develop efficient algorithms suitable for discrete data. The minimum description length (MDL) principle is a theoretically wellfounded, general fr ..."
Abstract

Cited by 6 (5 self)
 Add to MetaCart
Typical problems in bioinformatics involve large discrete datasets. Therefore, in order to apply statistical methods in such domains, it is important to develop efficient algorithms suitable for discrete data. The minimum description length (MDL) principle is a theoretically wellfounded, general framework for performing statistical inference. The mathematical formalization of MDL is based on the normalized maximum likelihood (NML) distribution, which has several desirable theoretical properties. In the case of discrete data, straightforward computation of the NML distribution requires exponential time with respect to the sample size, since the definition involves a sum over all the possible data samples of a fixed size. In this paper, we first review some existing algorithms for efficient NML computation in the case of multinomial and naive Bayes model families. Then we proceed by extending these algorithms to more complex, treestructured Bayesian networks.
Computing the Regret Table for Multinomial Data
, 2005
"... Stochastic complexity of a data set is defined as the shortest possible code length for the data obtainable by using some fixed set of models. This measure is of great theoretical and practical importance as a tool for tasks such as model selection or data clustering. In the case ..."
Abstract

Cited by 6 (2 self)
 Add to MetaCart
Stochastic complexity of a data set is defined as the shortest possible code length for the data obtainable by using some fixed set of models. This measure is of great theoretical and practical importance as a tool for tasks such as model selection or data clustering. In the case
A Fast Normalized Maximum Likelihood Algorithm for Multinomial Data
 In Proceedings of the Nineteenth International Joint Conference on Artificial Intelligence (IJCAI05
, 2005
"... Stochastic complexity of a data set is defined as the shortest possible code length for the data obtainable by using some fixed set of models. This measure is of great theoretical and practical importance as a tool for tasks such as model selection or data clustering. In the case of multinomial data ..."
Abstract

Cited by 5 (3 self)
 Add to MetaCart
Stochastic complexity of a data set is defined as the shortest possible code length for the data obtainable by using some fixed set of models. This measure is of great theoretical and practical importance as a tool for tasks such as model selection or data clustering. In the case of multinomial data, computing the modern version of stochastic complexity, defined as the Normalized Maximum Likelihood (NML) criterion, requires computing a sum with an exponential number of terms. Furthermore, in order to apply NML in practice, one often needs to compute a whole table of these exponential sums. In our previous work, we were able to compute this table by a recursive algorithm. The purpose of this paper is to significantly improve the time complexity of this algorithm. The techniques used here are based on the discrete Fourier transform and the convolution theorem.
Using Bayesian Networks For Visualizing HighDimensional Data
, 1999
"... A Bayesian (belief) network is a representation of a probability distribution over a set of random variables. One of the main advantages of this model family is that it offers a theoretically solid machine learning framework for constructing accurate domain models from sample data efficiently and re ..."
Abstract

Cited by 4 (2 self)
 Add to MetaCart
A Bayesian (belief) network is a representation of a probability distribution over a set of random variables. One of the main advantages of this model family is that it offers a theoretically solid machine learning framework for constructing accurate domain models from sample data efficiently and reliably. As the parameters of a Bayesian network have a precise semantic interpretation, the learned models can be used for data mining purposes, i.e., for examining regularities found in the data. In addition to this type of direct examination of the model, we suggest that the learned Bayesian networks can also be used for indirect data mining purposes through a visualization scheme which can be used for producing 2D or 3D representations of highdimensional problem domains. Our visualization scheme is based on the predictive distributions produced by the Bayesian network model, which means that the resulting visualizations can also be used as a postprocessing tool for visual inspection of ...
Assessing Case Value in CaseBased Reasoning with Adaptation
 World Multiconference on Systemics, Cybernetics and Informatics (IIIS99
, 1999
"... This paper explores new experimental evidence examining the relationship between cases and result quality in CaseBased Reasoning systems where adaptation is involved. This evidence comes from two domains; a Travelling Salesman Problem (TSP) solver and a system devising Nurse Rosters. It concludes t ..."
Abstract

Cited by 3 (0 self)
 Add to MetaCart
(Show Context)
This paper explores new experimental evidence examining the relationship between cases and result quality in CaseBased Reasoning systems where adaptation is involved. This evidence comes from two domains; a Travelling Salesman Problem (TSP) solver and a system devising Nurse Rosters. It concludes that there is no simple relationship between cases and problems that can guarantee solution quality. Instead, case/adaptation pairs need to be tested in relation to particular problems to choose the best. We believe that this result has implications for case selection and adaptation in other domains. Keywords: CaseBased Reasoning, Adaptation, Case Selection, TSP, Rostering 1. Introduction All CBR systems attempt to find the 'best' case(s) in the casebase, to suit a new problem, using similarity measures in one way or another [9,11]. Because of the bias towards retrieval in CBR systems implemented up to the present (see [16] p34 for instance), the decision on which is the best is usually m...
Unsupervised Bayesian Visualization of HighDimensional Data
 In
, 2000
"... We propose a data reduction method based on a probabilistic similarity framework where two vectors are considered similar if they lead to similar predictions. We show how this type of a probabilistic similarity metric can be defined both in a supervised and unsupervised manner. As a concrete applica ..."
Abstract

Cited by 3 (0 self)
 Add to MetaCart
(Show Context)
We propose a data reduction method based on a probabilistic similarity framework where two vectors are considered similar if they lead to similar predictions. We show how this type of a probabilistic similarity metric can be defined both in a supervised and unsupervised manner. As a concrete application of the suggested multidimensional scaling scheme, we describe how the method can be used for producing visual images of highdimensional data, and give several examples of visualizations obtained by using the suggested scheme with probabilistic Bayesian network models. 1. INTRODUCTION Multidimensional scaling (see, e.g., [3, 2]) is a data compression or data reduction task where the goal is to replace the original highdimensional data vectors with much shorter vectors, while losing as little information as possible. Intuitively speaking, it can be argued that a pragmatically sensible data reduction scheme is such that two vectors close to each other in the original multidimensional s...
Polychotomiser for Casebased Reasoning beyond the Traditional Bayesian Classification Approach
"... This work implements an enhanced Bayesian classifier with better performance as compared to the ordinary naïve Bayes classifier when used with domains and datasets of varying characteristics. Text classification is an active and ongoing research field of Artificial Intelligence (AI). Text classific ..."
Abstract

Cited by 1 (0 self)
 Add to MetaCart
This work implements an enhanced Bayesian classifier with better performance as compared to the ordinary naïve Bayes classifier when used with domains and datasets of varying characteristics. Text classification is an active and ongoing research field of Artificial Intelligence (AI). Text classification is defined as the task of learning methods for categorising collections of electronic text documents into their annotated classes, based on its contents. An increasing number of statistical approaches have been developed for text classification, including knearest neighbor classification, naïve Bayes classification, decision tree, rules induction, and the algorithm implementing the structural risk minimisation theory called the support vector machine. Among the approaches used in these applications, naïve Bayes classifiers have been widely used because of its simplicity. However this generative method has been reported to be less accurate than the discriminative methods such as SVM. Some researches have proven that the naïve Bayes classifier performs surprisingly well in many other domains with certain specialised characteristics. The main aim of this work is to quantify the weakness of traditional naïve Bayes classification and introduce an enhance Bayesian classification approach with additional innovative techniques to perform better than the traditional naïve Bayes classifier. Our research goal is to develop an
1. NORMALIZED MAXIMUM LIKELIHOOD Let
"... Bayesian networks are parametric models for multidimensional domains exhibiting complex dependencies between the dimensions (domain variables). A central problem in learning such models is how to regularize the number of parameters; in other words, how to determine which dependencies are significant ..."
Abstract
 Add to MetaCart
(Show Context)
Bayesian networks are parametric models for multidimensional domains exhibiting complex dependencies between the dimensions (domain variables). A central problem in learning such models is how to regularize the number of parameters; in other words, how to determine which dependencies are significant and which are not. The normalized maximum likelihood (NML) distribution or code offers an informationtheoretic solution to this problem. Unfortunately, computing it for arbitrary Bayesian network models appears to be computationally infeasible, but recent results have showed that it can be computed efficiently for certain restricted type of Bayesian network models. In this review paper we summarize the main results.