Results 1–10 of 22
Bayes in the sky: Bayesian inference and model selection in cosmology
 Contemp. Phys.
"... The application of Bayesian methods in cosmology and astrophysics has flourished over the past decade, spurred by data sets of increasing size and complexity. In many respects, Bayesian methods have proven to be vastly superior to more traditional statistical tools, offering the advantage of higher ..."
Abstract

Cited by 58 (7 self)
The application of Bayesian methods in cosmology and astrophysics has flourished over the past decade, spurred by data sets of increasing size and complexity. In many respects, Bayesian methods have proven to be vastly superior to more traditional statistical tools, offering the advantage of higher efficiency and of a consistent conceptual basis for dealing with the problem of induction in the presence of uncertainty. This trend is likely to continue in the future, when the way we collect, manipulate and analyse observations and compare them with theoretical models will assume an even more central role in cosmology. This review is an introduction to Bayesian methods in cosmology and astrophysics and recent results in the field. I first present Bayesian probability theory and its conceptual underpinnings, Bayes' Theorem and the role of priors. I discuss the problem of parameter inference and its general solution, along with numerical techniques such as Markov Chain Monte Carlo methods. I then review the theory and application of Bayesian model comparison, discussing the notions of Bayesian evidence and effective model complexity, and how to compute and interpret those quantities. Recent developments in cosmological parameter extraction and Bayesian cosmological model building are summarized, highlighting the challenges that lie ahead.
Lattice duality: The origin of probability and entropy
 In press: Neurocomputing
, 2005
"... Bayesian probability theory is an inference calculus, which originates from a generalization of inclusion on the Boolean lattice of logical assertions to a degree of inclusion represented by a real number. Dual to this lattice is the distributive lattice of questions constructed from the ordered set ..."
Abstract

Cited by 31 (10 self)
Bayesian probability theory is an inference calculus, which originates from a generalization of inclusion on the Boolean lattice of logical assertions to a degree of inclusion represented by a real number. Dual to this lattice is the distributive lattice of questions constructed from the ordered set of downsets of assertions, which forms the foundation of the calculus of inquiry—a generalization of information theory. In this paper we introduce this novel perspective on these spaces in which machine learning is performed and discuss the relationship between these results and several proposed generalizations of information theory in the literature.
Simulation-based optimal Bayesian experimental design for nonlinear systems
 Journal of Computational Physics
, 2012
"... iv ..."
(Show Context)
SMC SAMPLERS FOR BAYESIAN OPTIMAL NONLINEAR DESIGN
"... Experimental design is a fundamental problem in science. It arises in the planning of medical trials, sensor network deployment and control as well as in costly data gathering in physics, chemistry and biology. Bayesian decision theory provides a principled way of treating this problem, but leads to ..."
Abstract

Cited by 13 (3 self)
Experimental design is a fundamental problem in science. It arises in the planning of medical trials, sensor network deployment and control, as well as in costly data gathering in physics, chemistry and biology. Bayesian decision theory provides a principled way of treating this problem, but leads to an intractable joint optimization and integration problem. Here, we propose a viable solution to this hard computational problem using sequential Monte Carlo samplers.

1. PROBLEM FORMULATION
We assume that we have a measurement model p(y|θ, d) of experimental outcomes y ∈ Y given a design d, as well as a prior p(θ) on the model parameters θ ∈ Θ. The prior could be based on expert knowledge or previous experiments. The goal is then to choose the optimal design d* ∈ R^p, which maximizes the expected utility

U(d) = ∫∫ p(θ) p(y|θ, d) u(y, d, θ) dy dθ   (1)

with respect to some measure of utility u(y, d, θ). When the model parameters are the objects of interest, the negative posterior entropy is commonly chosen as the utility function. That is, one aims to maximize

U(d) = ∫∫∫ p(θ) p(y|θ, d) [p(θ′|y, d) log p(θ′|y, d)] dθ′ dy dθ.

As shown in [1], under the assumptions of stationarity and standard bounds on distributions, this criterion is equivalent to maximizing the marginal entropy of the outcome y:

U(d) = C − ∫ p(y|d) log p(y|d) dy,   (2)

where C is an arbitrary constant. This transformation reduces the complexity by eliminating one parameter-space integral.

2. PREVIOUS WORK
The joint optimization and nested integration problem in equation (2) is computationally challenging. For this reason, most research has focused on the simple linear-normal model, for
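The marginal-entropy criterion in Eq. (2) can be estimated by brute-force nested Monte Carlo, which illustrates exactly why the joint optimization/integration problem is expensive and why SMC samplers are attractive. A minimal sketch for a hypothetical linear-Gaussian toy model (the model, the sample sizes, and the function name are all illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def marginal_entropy(d, n_outer=2000, n_inner=2000):
    """Nested Monte Carlo estimate of the marginal outcome entropy
    H[y|d] = -int p(y|d) log p(y|d) dy from Eq. (2), for a toy model:
    theta ~ N(0, 1) and y = d*theta + eps with eps ~ N(0, 1),
    so that analytically p(y|d) = N(0, d^2 + 1)."""
    theta = rng.standard_normal(n_outer)
    y = d * theta + rng.standard_normal(n_outer)       # outer draws y ~ p(y|d)
    theta_in = rng.standard_normal(n_inner)            # inner prior draws for the marginal
    # p(y_i|d) ~= (1/N) * sum_j N(y_i; d*theta_j, 1)
    dens = np.exp(-0.5 * (y[:, None] - d * theta_in[None, :]) ** 2) / np.sqrt(2 * np.pi)
    return -np.log(dens.mean(axis=1)).mean()

# Analytically H[y|d] = 0.5 * log(2*pi*e*(d**2 + 1)), so under this
# criterion a larger |d| is the more informative design.
h_small = marginal_entropy(0.1)
h_large = marginal_entropy(3.0)
```

Note the cost: every outer sample requires a full inner integral over the prior, and the whole estimate must be recomputed at each candidate design — the nesting the abstract calls intractable in general.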
Adaptive design optimization: A mutual information based approach to model discrimination in cognitive science
 Neural Computation
, 2010
"... Discriminating among competing statistical models is a pressing issue for many experimentalists in the field of cognitive science. Resolving this issue begins with designing maximally informative experiments. To this end, the problem to be solved in adaptive design optimization is identifying experi ..."
Abstract

Cited by 9 (4 self)
Discriminating among competing statistical models is a pressing issue for many experimentalists in the field of cognitive science. Resolving this issue begins with designing maximally informative experiments. To this end, the problem to be solved in adaptive design optimization is identifying experimental designs under which one can infer the underlying model in the fewest possible steps. When the models under consideration are nonlinear, as is often the case in cognitive science, this problem can be impossible to solve analytically without simplifying assumptions. However, as we show in this paper, a full solution can be found numerically with the help of a Bayesian computational trick derived from the statistics literature, which recasts the problem as a probability density simulation in which the optimal design is the mode of the density. We use a utility function based on mutual information, and give three intuitive interpretations of the utility function in terms of Bayesian posterior estimates. As a proof of concept, we offer a simple example application to an experiment on memory retention.
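A mutual-information utility for model discrimination can be sketched directly. Assuming two hypothetical memory-retention models (a power law and an exponential law for the probability of binary recall, with a uniform prior on the decay rate — illustrative stand-ins, not the paper's actual models), the design variable is the retention interval t:

```python
import numpy as np

rng = np.random.default_rng(1)

def model_discrimination_utility(t, n=5000):
    """Monte Carlo mutual information I(M; y) between a model indicator M and
    a binary recall outcome y at retention interval t, for two toy models:
    power, P(y=1) = (t+1)^(-a), and exponential, P(y=1) = exp(-a*t),
    with decay rate a ~ Uniform(0.1, 1) under each model and equal model
    priors.  Larger I(M; y) means t is better at telling the models apart."""
    a = rng.uniform(0.1, 1.0, size=n)
    p_pow = (t + 1.0) ** (-a)              # P(y=1 | POW, a, t)
    p_exp = np.exp(-a * t)                 # P(y=1 | EXP, a, t)
    q_pow, q_exp = p_pow.mean(), p_exp.mean()   # model-wise predictive P(y=1 | M)
    q = 0.5 * (q_pow + q_exp)                   # overall predictive P(y=1)

    def h(p):                              # binary entropy in nats
        p = np.clip(np.asarray(p), 1e-12, 1 - 1e-12)
        return float(-p * np.log(p) - (1 - p) * np.log(1 - p))

    return h(q) - 0.5 * (h(q_pow) + h(q_exp))
```

At t = 0 both models predict certain recall, so a single trial carries no discriminating information; intermediate-to-long intervals, where the two decay curves separate, score higher.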
Robust Online Hamiltonian Learning
"... In this work we combine two distinct machine learning methodologies, sequential Monte Carlo and Bayesian experimental design, and apply them to the problem of inferring the dynamical parameters of a quantum system. The algorithm can be implemented online (during experimental data collection), avoidi ..."
Abstract

Cited by 2 (2 self)
In this work we combine two distinct machine learning methodologies, sequential Monte Carlo and Bayesian experimental design, and apply them to the problem of inferring the dynamical parameters of a quantum system. The algorithm can be implemented online (during experimental data collection), avoiding the need for storage and post-processing. Most importantly, our algorithm is capable of learning Hamiltonian parameters even when the parameters change from experiment to experiment, and also when additional noise processes are present and unknown. The algorithm also numerically estimates the Cramér-Rao lower bound, certifying its own performance. We further illustrate the practicality of our algorithm by applying it to two test problems: (1) learning an unknown frequency and the decoherence time for a single-qubit quantum system and (2) learning couplings in a many-qubit Ising model Hamiltonian with no external magnetic field.
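The sequential Monte Carlo half of this combination can be sketched on the single-qubit frequency problem: a particle approximation of the posterior over the frequency is reweighted after each binary measurement and resampled when it degenerates. A fixed linear schedule of evolution times stands in for the paper's Bayesian experimental-design step, and the noiseless likelihood cos²(ωt/2), particle count, and jitter scale are illustrative choices, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(2)

def smc_frequency_learning(omega_true=0.7, n_particles=2000, n_exps=30):
    """SMC sketch: learn an unknown frequency omega from single-shot qubit
    measurements, where outcome 1 occurs with probability cos^2(omega*t/2)."""
    particles = rng.uniform(0.0, 1.0, n_particles)        # prior: omega ~ U(0, 1)
    weights = np.full(n_particles, 1.0 / n_particles)
    for t in range(1, n_exps + 1):                        # fixed time schedule
        outcome = rng.random() < np.cos(omega_true * t / 2) ** 2   # simulated data
        lik = np.cos(particles * t / 2) ** 2              # P(outcome=1 | omega, t)
        weights *= lik if outcome else 1.0 - lik
        weights /= weights.sum()
        if 1.0 / np.sum(weights ** 2) < n_particles / 2:  # resample on low ESS
            idx = rng.choice(n_particles, n_particles, p=weights)
            particles = particles[idx] + rng.normal(0.0, 0.01, n_particles)
            weights = np.full(n_particles, 1.0 / n_particles)
    return float(np.average(particles, weights=weights))  # posterior mean of omega
```

Because the posterior is carried forward after every shot, the estimate is available online, with no stored dataset to post-process.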
Autonomous Science Platforms and Question-Asking Machines
"... Abstract—As we become increasingly reliant on remote science platforms, the ability to autonomously and intelligently perform data collection becomes critical. In this paper we view these platforms as questionasking machines and introduce a paradigm based on the scientific method, which couples the ..."
Abstract

Cited by 2 (1 self)
As we become increasingly reliant on remote science platforms, the ability to autonomously and intelligently perform data collection becomes critical. In this paper we view these platforms as question-asking machines and introduce a paradigm based on the scientific method, which couples the processes of inference and inquiry to form a model-based learning cycle. Unlike modern autonomous instrumentation, the system is not programmed to collect data directly, but instead is programmed to learn based on a set of models. Computationally, this learning cycle is implemented in software consisting of a Bayesian probability-based inference engine coupled to an entropy-based inquiry engine. Operationally, a given experiment is viewed as a question, whose relevance is computed using the inquiry calculus, which is a natural order-theoretic generalization of information theory. In simple cases, the relevance is proportional to the entropy. The resulting data are then analyzed by the inference engine, which updates the state of knowledge of the instrument. This new state of knowledge is then used as a basis for future inquiry as the system continues to learn. This paper will introduce the learning methodology, describe its implementation in software, and demonstrate the process with a robotic explorer that autonomously and intelligently performs data collection to solve a search-and-characterize problem.
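The inquiry step of this learning cycle, in the simple case where relevance reduces to entropy, amounts to asking the question whose answer is currently most uncertain. A minimal sketch (function and argument names are hypothetical, for a binary light/dark measurement):

```python
import numpy as np

def next_measurement(candidates, predictive_probs):
    """Score each candidate measurement location by the Shannon entropy of its
    predicted binary outcome and return the highest-entropy 'question'.
    predictive_probs[i] is the current P(outcome = bright) at candidates[i]."""
    p = np.clip(np.asarray(predictive_probs, dtype=float), 1e-12, 1 - 1e-12)
    relevance = -p * np.log(p) - (1 - p) * np.log(1 - p)   # entropy in nats
    return candidates[int(np.argmax(relevance))]

# The location whose outcome is least predictable (p closest to 0.5) is chosen;
# its measured answer then feeds the inference engine, closing the cycle.
best = next_measurement(["A", "B", "C"], [0.95, 0.50, 0.10])
```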
The Spatial Sensitivity Function of a Light Sensor
"... Abstract. The Spatial Sensitivity Function (SSF) is used to quantify a detector’s sensitivity to a spatiallydistributed input signal. By weighting the incoming signal with the SSF and integrating, the overall scalar response of the detector can be estimated. This project focuses on estimating the S ..."
Abstract

Cited by 2 (1 self)
The Spatial Sensitivity Function (SSF) is used to quantify a detector's sensitivity to a spatially distributed input signal. By weighting the incoming signal with the SSF and integrating, the overall scalar response of the detector can be estimated. This project focuses on estimating the SSF of a light intensity sensor consisting of a photodiode. This light sensor has been used previously in the Knuth Cyberphysics Laboratory on a robotic arm that performs its own experiments to locate a white circle in a dark field (Knuth et al., 2007). To use the light sensor to learn about its surroundings, the robot's inference software must be able to model and predict the light sensor's response to a hypothesized stimulus. Previous models of the light sensor treated it as a point sensor and ignored its spatial characteristics. Here we propose a parametric approach where the SSF is described by a mixture of Gaussians (MOG). By performing controlled calibration experiments with known stimulus inputs, we used nested sampling to estimate the SSF of the light sensor using an MOG model with the number of Gaussians ranging from one to five. By comparing the evidence computed for each MOG model, we found that one Gaussian is sufficient to describe the SSF to the accuracy we require. Future work will involve incorporating this more accurate SSF into the Bayesian machine learning software for the robotic system and studying how this detailed information about the properties of the light sensor will improve the robot's ability to learn.
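The forward model here — weight the stimulus by the SSF and integrate to predict the scalar sensor output — is easy to sketch on a grid. The single-Gaussian SSF below reflects the model the paper found sufficient, but its width, the grid, and the circle stimuli are illustrative numbers, not calibration results:

```python
import numpy as np

def ssf_response(stim, ssf):
    """Predicted scalar sensor output: the stimulus weighted by the SSF and
    integrated (summed) over a discretized sensor plane."""
    return float(np.sum(stim * ssf))

# Hypothetical setup: a single-Gaussian SSF centred on the detector axis,
# and a bright circle on a dark field as the stimulus.
x, y = np.meshgrid(np.linspace(-1, 1, 101), np.linspace(-1, 1, 101))
ssf = np.exp(-(x**2 + y**2) / (2 * 0.2**2))
ssf /= ssf.sum()                                   # normalize: response in [0, 1]

near_circle = ((x - 0.3)**2 + y**2 < 0.25**2).astype(float)
far_circle = ((x + 0.8)**2 + (y + 0.8)**2 < 0.25**2).astype(float)

resp_near = ssf_response(near_circle, ssf)         # overlaps the sensitive region
resp_far = ssf_response(far_circle, ssf)           # mostly outside it
```

A point-sensor model would predict 0 or 1 depending only on the pixel under the detector; the SSF model instead yields the graded responses the robot's inference software needs.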
Toward Question-Asking Machines: The Logic of Questions and the Inquiry Calculus
"... For over a century, the study of logic has focused on the algebra of logical statements. This work, first performed by George Boole, has led to the development of modern computers, and was shown by Richard T. Cox to be the foundation of Bayesian inference. Meanwhile the logic of questions has been m ..."
Abstract

Cited by 1 (0 self)
For over a century, the study of logic has focused on the algebra of logical statements. This work, first performed by George Boole, has led to the development of modern computers, and was shown by Richard T. Cox to be the foundation of Bayesian inference. Meanwhile, the logic of questions has been much neglected. For our computing machines to be truly intelligent, they need to be able to ask relevant questions. In this paper I will show how the Boolean lattice of logical statements gives rise to the free distributive lattice of questions, thus defining their algebra. Furthermore, there exists a quantity analogous to probability, called relevance, which quantifies the degree to which one question answers another. I will show that relevance is not only a natural generalization of information theory, but also forms its foundation.
Gradient-based Stochastic Optimization Methods in Bayesian Experimental Design
, 2012
"... Optimal experimental design (OED) seeks experiments expected to yield the most useful data for some purpose. In practical circumstances where experiments are timeconsuming or resourceintensive, OED can yield enormous savings. We pursue OED for nonlinear systems from a Bayesian perspective, with th ..."
Abstract

Cited by 1 (0 self)
Optimal experimental design (OED) seeks experiments expected to yield the most useful data for some purpose. In practical circumstances where experiments are time-consuming or resource-intensive, OED can yield enormous savings. We pursue OED for nonlinear systems from a Bayesian perspective, with the goal of choosing experiments that are optimal for parameter inference. Our objective in this context is the expected information gain in model parameters, which in general can only be estimated using Monte Carlo methods. Maximizing this objective thus becomes a stochastic optimization problem. This paper develops gradient-based stochastic optimization methods for the design of experiments on a continuous parameter space. Given a Monte Carlo estimator of expected information gain, we use infinitesimal perturbation analysis to derive gradients of this estimator. We are then able to formulate two gradient-based stochastic optimization approaches: (i) Robbins-Monro stochastic approximation, and (ii) sample average approximation combined with a deterministic quasi-Newton method. A polynomial chaos approximation of the forward model accelerates objective and gradient evaluations in both cases. We discuss the implementation of these optimization methods, then conduct an empirical comparison of their performance. To demonstrate design in a nonlinear setting with partial differential equation forward models, we use the problem of sensor placement for source inversion. Numerical results yield useful guidelines on the choice of algorithm and sample sizes, assess the impact of estimator bias, and quantify tradeoffs of computational cost versus solution quality and robustness.
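The Robbins-Monro half of this recipe — ascend a noisy objective using unbiased per-sample gradients and diminishing step sizes — can be sketched on a toy objective where the answer is known. The objective below stands in for expected information gain, and the per-sample gradient plays the role of the infinitesimal-perturbation-analysis estimator; everything else is illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)

def robbins_monro(d0=5.0, n_iters=2000):
    """Robbins-Monro stochastic approximation on the toy objective
    U(d) = -E[(d - Z)^2] with Z ~ N(1, 1), whose maximizer is d* = 1.
    Each iteration uses one noisy draw of the gradient, never the exact
    (integrated) objective."""
    d = d0
    for k in range(1, n_iters + 1):
        z = rng.normal(1.0, 1.0)
        grad = -2.0 * (d - z)      # unbiased single-sample estimate of dU/dd
        d += (1.0 / k) * grad      # steps a_k = 1/k: sum a_k = inf, sum a_k^2 < inf
    return d
```

The step-size conditions (divergent sum, convergent sum of squares) are what let the iterates average out the Monte Carlo noise while still reaching the optimum; sample average approximation instead fixes the random draws up front and hands a now-deterministic objective to quasi-Newton.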