Results 1 - 10 of 16
Uncertain<T>: A first-order type for uncertain data
- In ASPLOS
, 2014
"... Abstract Sampled data from sensors, the web, and people is inherently probabilistic. Because programming languages use discrete types (floats, integers, and booleans), applications, ranging from GPS navigation to web search to polling, express and reason about uncertainty in idiosyncratic ways. Thi ..."
Sampled data from sensors, the web, and people is inherently probabilistic. Because programming languages use discrete types (floats, integers, and booleans), applications, ranging from GPS navigation to web search to polling, express and reason about uncertainty in idiosyncratic ways. This mismatch causes three problems. (1) Using an estimate as a fact introduces errors (walking through walls). (2) Computation on estimates compounds errors (walking at 59 mph). (3) Inference asks questions incorrectly when the data can only answer probabilistic questions (e.g., "are you speeding?" versus "are you speeding with high probability?"). This paper introduces the uncertain type (Uncertain<T>), an abstraction that expresses, propagates, and exposes uncertainty to solve these problems. We present its semantics and a recipe for (a) identifying distributions, (b) computing, (c) inferring, and (d) leveraging domain knowledge in uncertain data. Because Uncertain<T> computations express an algebra over probabilities, Bayesian statistics ease inference over disparate information (physics, calendars, and maps). Uncertain<T> leverages statistics, learning algorithms, and domain expertise for experts and abstracts them for nonexpert developers. We demonstrate Uncertain<T> on two applications. The result is improved correctness, productivity, and expressiveness for probabilistic data.
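The paper's abstraction is a C# type; purely as an illustration of the idea in the abstract, here is a minimal Python sketch (the `Uncertain` class and `lift` helper are hypothetical, not the paper's API) in which an uncertain value carries a sampler, arithmetic propagates uncertainty, and a comparison returns evidence rather than a boolean:

```python
import random

class Uncertain:
    """A value known only as a distribution, represented by a sampler."""

    def __init__(self, sampler):
        self.sample = sampler                     # () -> float

    def __truediv__(self, other):
        return Uncertain(lambda: self.sample() / lift(other).sample())

    def __gt__(self, threshold):
        # Estimate P(self > threshold) from samples instead of
        # collapsing the uncertain value to a single boolean.
        n = 10_000
        return sum(self.sample() > threshold for _ in range(n)) / n

def lift(x):
    return x if isinstance(x, Uncertain) else Uncertain(lambda: x)

# A noisy GPS distance estimate over half an hour gives a noisy speed.
distance = Uncertain(lambda: random.gauss(26.0, 2.5))   # miles, with GPS error
speed = distance / 0.5                                  # mph

p_speeding = speed > 55        # a probability, not a bool
print(f"P(speed > 55 mph) ~= {p_speeding:.2f}")
if p_speeding > 0.95:          # act only on high-probability evidence
    print("issue speeding alert")
```

Acting only when the estimated probability clears a threshold is what turns "are you speeding?" into "are you speeding with high probability?".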
Deriving Probability Density Functions from Probabilistic Functional Programs
"... Abstract. The probability density function of a probability distribution is a fundamental concept in probability theory and a key ingredient in various widely used machine learning methods. However, the necessary framework for compiling probabilistic functional programs to density functions has only ..."
The probability density function of a probability distribution is a fundamental concept in probability theory and a key ingredient in various widely used machine learning methods. However, the necessary framework for compiling probabilistic functional programs to density functions has only recently been developed. In this work, we present a density compiler for a probabilistic language with discrete and continuous distributions and discrete observations, and provide a proof of its soundness. The compiler greatly reduces the development effort of domain experts, which we demonstrate by solving inference problems from various scientific applications, such as modelling the global carbon cycle, using a standard Markov chain Monte Carlo framework.
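The paper's compiler covers a full probabilistic functional language; the Python fragment below sketches only the core idea on a one-rule toy language (the AST encoding and `compile_density` are invented for this example), deriving a density via the change-of-variables rule:

```python
import math

def stdnormal_pdf(x):
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def compile_density(expr):
    """expr ::= ('stdnormal',) | ('affine', a, b, expr)  with a != 0."""
    tag = expr[0]
    if tag == 'stdnormal':
        return stdnormal_pdf
    if tag == 'affine':                      # y = a * x + b
        _, a, b, inner = expr
        inner_pdf = compile_density(inner)
        # Change of variables: pdf_Y(y) = pdf_X((y - b) / a) / |a|
        return lambda y: inner_pdf((y - b) / a) / abs(a)
    raise ValueError(f"unknown expression: {tag}")

# Program: return 2 * stdnormal() + 1, i.e. Gaussian(mean=1, sd=2).
prog = ('affine', 2.0, 1.0, ('stdnormal',))
pdf = compile_density(prog)
print(pdf(1.0))                              # density at the mean
print(1 / (2 * math.sqrt(2 * math.pi)))      # matches the closed form
```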
Probabilistic relational verification for cryptographic implementations
- Unpublished manuscript
, 2013
"... Relational program logics have been used for mechanizing for-mal proofs of various cryptographic constructions. With an eye to-wards scaling these successes towards end-to-end security proofs for implementations of distributed systems, we present RF⋆, a rela-tional extension of F⋆, a general-purpose ..."
Relational program logics have been used for mechanizing formal proofs of various cryptographic constructions. With an eye towards scaling these successes towards end-to-end security proofs for implementations of distributed systems, we present RF⋆, a relational extension of F⋆, a general-purpose higher-order stateful programming language with a verification system based on refinement types. The distinguishing feature of RF⋆ is a relational Hoare logic for a higher-order, stateful, probabilistic language. Through careful language design, we adapt the F⋆ typechecker to generate both classic and relational verification conditions, and to automatically discharge their proofs using an SMT solver. Thus, we are able to benefit from the existing features of F⋆, including its abstraction facilities for modular reasoning about program fragments. We evaluate RF⋆ experimentally by programming a series of cryptographic constructions and protocols, and by verifying their security properties, ranging from information flow to unlinkability, integrity, and privacy. Moreover, we validate the design of RF⋆ by formalizing in Coq a core probabilistic λ-calculus and a relational refinement type system and proving the soundness of the latter against a denotational semantics of the probabilistic λ-calculus.
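RF⋆ establishes such relational properties statically, by typing; that machinery cannot be reproduced in a few lines, so the sketch below only tests one relational property by sampling, with code that is entirely illustrative and not from the paper. It checks, empirically, the perfect-secrecy statement for a one-bit one-time pad: two runs with different messages yield identically distributed ciphertexts.

```python
import random
from collections import Counter

def otp_encrypt(message_bit):
    key = random.randrange(2)          # fresh uniform key per run
    return message_bit ^ key

def ciphertext_histogram(message_bit, trials=100_000):
    return Counter(otp_encrypt(message_bit) for _ in range(trials))

h0 = ciphertext_histogram(0)
h1 = ciphertext_histogram(1)
print("m=0:", dict(h0))   # both histograms should be ~50/50,
print("m=1:", dict(h1))   # so the ciphertext leaks nothing about m
```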
Slicing probabilistic programs
- In PLDI
, 2014
"... Abstract Probabilistic programs use familiar notation of programming languages to specify probabilistic models. Suppose we are interested in estimating the distribution of the return expression r of a probabilistic program P . We are interested in slicing the probabilistic program P and obtaining a ..."
Probabilistic programs use familiar notation of programming languages to specify probabilistic models. Suppose we are interested in estimating the distribution of the return expression r of a probabilistic program P. We are interested in slicing the probabilistic program P and obtaining a simpler program SLI(P) which retains only those parts of P that are relevant to estimating r, and elides those parts of P that are not relevant to estimating r. We desire that the SLI transformation be both correct and efficient. By correct, we mean that P and SLI(P) have identical estimates on r. By efficient, we mean that estimation over SLI(P) be as fast as possible. We show that the usual notion of program slicing, which traverses control and data dependencies backward from the return expression r, is unsatisfactory for probabilistic programs, since it produces incorrect slices on some programs and sub-optimal ones on others. Our key insight is that in addition to the usual notions of control dependence and data dependence that are used to slice nonprobabilistic programs, a new kind of dependence called observe dependence arises naturally due to observe statements in probabilistic programs. We propose a new definition of SLI(P) which is both correct and efficient for probabilistic programs, by including observe dependence in addition to control and data dependences for computing slices. We prove correctness mathematically, and we demonstrate efficiency empirically. We show that by applying the SLI transformation as a pre-pass, we can improve the efficiency of probabilistic inference, not only in our own inference tool R2, but also in other systems for performing inference such as Church and Infer.NET.
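The observe-dependence point is easy to reproduce concretely. In the hypothetical Python sketch below, a naive backward slice from the return expression would drop both b2 and the observe statement, yet the observe changes the posterior of b1 from 0.5 to about 2/3:

```python
import random

# Toy program:
#   b1 ~ Bernoulli(0.5); b2 ~ Bernoulli(0.5); observe(b1 or b2); return b1

def run_full():
    """Rejection sampling for the full program."""
    while True:
        b1 = random.random() < 0.5
        b2 = random.random() < 0.5
        if b1 or b2:          # observe(b1 or b2): reject violating runs
            return b1

def run_naive_slice():
    """A slice from data/control dependence only: b2 and the observe
    look irrelevant to the return value, so they get dropped."""
    return random.random() < 0.5

n = 100_000
print("full program  P(b1) ~=", sum(run_full() for _ in range(n)) / n)        # ~0.667
print("naive slice   P(b1) ~=", sum(run_naive_slice() for _ in range(n)) / n) # ~0.500
```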
WOLFE: Strength Reduction and Approximate Programming for Probabilistic Programming
"... Existing modeling languages lack the expressiveness or ef-ficiency to support many modern and successful machine learning (ML) models such as structured prediction or ma-trix factorization. We present WOLFE, a probabilistic pro-gramming language that enables practitioners to develop such models. Mos ..."
Existing modeling languages lack the expressiveness or efficiency to support many modern and successful machine learning (ML) models such as structured prediction or matrix factorization. We present WOLFE, a probabilistic programming language that enables practitioners to develop such models. Most ML approaches can be formulated in terms of scalar objectives or scoring functions (such as distributions) and a small set of mathematical operations such as maximization and summation. In WOLFE, the user works within a functional host language to declare scalar functions and invoke mathematical operators. The WOLFE compiler then replaces the operators with equivalent, but more efficient (strength reduction) and/or approximate (approximate programming) versions to generate low-level inference or learning code. This approach can yield very concise programs, high expressiveness and efficient execution.
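As a loose illustration of strength reduction (the function names are invented, and WOLFE itself is embedded in a functional host language, not Python): when a declared objective factorizes over coordinates, an exponential argmax can be replaced by independent per-coordinate maximizations:

```python
import itertools

def brute_force_argmax(length, score):
    """What the user wrote: search all 2^length bit vectors."""
    return max(itertools.product([0, 1], repeat=length), key=score)

def factorized_argmax(length, local_scores):
    """Strength-reduced version: if score(x) = sum_i local_scores[i][x_i],
    maximize each coordinate independently (O(n) instead of O(2^n))."""
    return tuple(max((0, 1), key=lambda v: local_scores[i][v])
                 for i in range(length))

local = [{0: 0.1, 1: 0.9}, {0: 0.7, 1: 0.3}, {0: 0.4, 1: 0.6}]
score = lambda x: sum(local[i][v] for i, v in enumerate(x))

print(brute_force_argmax(3, score))       # (1, 0, 1)
print(factorized_argmax(3, local))        # same result, no exponential search
```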
Bayesian inference using data flow analysis
- In ESEC/SIGSOFT FSE
, 2013
"... ABSTRACT We present a new algorithm for Bayesian inference over probabilistic programs, based on data flow analysis techniques from the program analysis community. Unlike existing techniques for Bayesian inference on probabilistic programs, our data flow analysis algorithm is able to perform infere ..."
We present a new algorithm for Bayesian inference over probabilistic programs, based on data flow analysis techniques from the program analysis community. Unlike existing techniques for Bayesian inference on probabilistic programs, our data flow analysis algorithm is able to perform inference directly on probabilistic programs with loops. Even for loop-free programs, we show that data flow analysis offers better precision and performance than existing techniques. We also describe heuristics that are crucial for our inference to scale, and present an empirical evaluation of our algorithm over a range of benchmarks.
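The algorithm itself manipulates symbolic representations of distributions; the sketch below shows only the underlying dataflow idea in Python (all names invented), propagating a concrete distribution over a two-state loop to a fixpoint, as a loop-aware transfer-function analysis would:

```python
def transfer(dist):
    """One pass through the loop body: flip state with probability 0.3."""
    out = {0: 0.0, 1: 0.0}
    for state, p in dist.items():
        out[1 - state] += 0.3 * p   # branch: flip
        out[state] += 0.7 * p       # branch: stay; the join merges both
    return out

def loop_fixpoint(dist, tol=1e-12):
    """Iterate the loop's transfer function until the distribution is stable."""
    while True:
        nxt = transfer(dist)
        if all(abs(nxt[s] - dist[s]) < tol for s in dist):
            return nxt
        dist = nxt

entry = {0: 1.0, 1: 0.0}            # program starts in state 0
print(loop_fixpoint(entry))          # converges to the uniform fixpoint
```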
Modeling, Quantifying, and Limiting Adversary Knowledge
, 2015
"... Users participating in online services are required to relinquish control over potentially sensitive personal information, exposing them to intentional or unintentional miss-use of said information by the service providers. Users wishing to avoid this must either abstain from often extremely useful ..."
Users participating in online services are required to relinquish control over potentially sensitive personal information, exposing them to intentional or unintentional misuse of said information by the service providers. Users wishing to avoid this must either abstain from often extremely useful services, or provide false information, which is usually contrary to the terms of service they must abide by. An attractive middle-ground alternative is to maintain control in the hands of the users and provide a mechanism with which information that is necessary for useful services can be queried. Users need not trust any external party in the management of their information but are now faced with the problem of judging when queries by service providers should be answered or when they should be refused due to revealing too much sensitive information. Judging query safety is difficult. Two queries may be benign in isolation but might reveal more than a user is comfortable with in combination. Additionally, malicious adversaries who wish to learn more than allowed might query in a manner that attempts to hide the flows of sensitive information. Finally, users cannot rely on ...
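A minimal Python sketch of the knowledge-tracking idea described above (the secret space, threshold, and helper names are all invented for illustration): track the querier's belief about the secret, and refuse any query whose worst-case answer would concentrate that belief too much:

```python
from fractions import Fraction

secret_space = range(64)                      # e.g. a 6-bit secret
belief = {s: Fraction(1, 64) for s in secret_space}
THRESHOLD = Fraction(1, 8)                    # max allowed P(guess secret)

def worst_case_vulnerability(belief, query):
    """Max over possible answers of the revised belief's peak probability."""
    worst = Fraction(0)
    for answer in {query(s) for s in belief}:
        mass = sum(p for s, p in belief.items() if query(s) == answer)
        if mass == 0:
            continue
        peak = max(p / mass for s, p in belief.items() if query(s) == answer)
        worst = max(worst, peak)
    return worst

def answer(query, secret):
    global belief
    if worst_case_vulnerability(belief, query) > THRESHOLD:
        return "REFUSED"                       # answering could reveal too much
    result = query(secret)                     # safe: answer and revise belief
    mass = sum(p for s, p in belief.items() if query(s) == result)
    belief = {s: (p / mass if query(s) == result else Fraction(0))
              for s, p in belief.items()}
    return result

secret = 42
print(answer(lambda s: s % 2, secret))         # coarse query: answered
print(answer(lambda s: s == 42, secret))       # near-identity query: refused
```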
Model Selection in Compositional Spaces
, 2014
"... We often build complex probabilistic models by composing simpler models—using one model to generate parameters or latent variables for another model. This allows us to express complex distributions over the observed data and to share statistical structure between different parts of a model. In this ..."
We often build complex probabilistic models by composing simpler models—using one model to generate parameters or latent variables for another model. This allows us to express complex distributions over the observed data and to share statistical structure between different parts of a model. In this thesis, we present a space of matrix decomposition models defined by the composition of a small number of motifs of probabilistic modeling, including clustering, low-rank factorizations, and binary latent factor models. This compositional structure can be represented by a context-free grammar whose production rules correspond to these motifs. By exploiting the structure of this grammar, we can generically and efficiently infer latent components and estimate predictive likelihood for nearly 2500 model structures using a small toolbox of reusable algorithms. Using a greedy search over this grammar, we automatically choose the decomposition structure from raw data by evaluating only a small fraction of all models. The proposed method typically finds the correct structure for synthetic data and backs off gracefully to simpler models under heavy noise.
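As a rough sketch of the grammar view (the production strings here are informal stand-ins for the thesis's motifs, not its exact grammar), one can enumerate candidate model structures by repeatedly expanding a nonterminal:

```python
PRODUCTIONS = {           # nonterminal G expands by refining a factor
    'G': ['GG+G',         # low-rank factorization with noise
          'MG+G',         # clustering of rows (M: multinomial)
          'BG+G'],        # binary latent features
}

def expand(structure, depth):
    """Yield all structures reachable by leftmost expansion of 'G'."""
    yield structure
    if depth == 0:
        return
    i = structure.find('G')
    if i == -1:
        return
    for rhs in PRODUCTIONS['G']:
        yield from expand(structure[:i] + '(' + rhs + ')' + structure[i+1:],
                          depth - 1)

models = set(expand('G', depth=2))
print(len(models), "candidate structures, e.g.:")
for m in sorted(models)[:5]:
    print(" ", m)
```

The thesis's greedy search scores each candidate by predictive likelihood; this sketch only generates the space.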
Bayesian Machine Learning
, 2011
"... The Bayesian approach to machine learning amounts to inferring posterior distributions of random variables from a probabilistic model of how the variables are related (that is, a prior distribution) and a set of observations of variables. There is a trend in machine learning towards expressing Bay ..."
The Bayesian approach to machine learning amounts to inferring posterior distributions of random variables from a probabilistic model of how the variables are related (that is, a prior distribution) and a set of observations of variables. There is a trend in machine learning towards expressing Bayesian models as probabilistic programs. As a foundation for this kind of programming, we propose a core functional calculus with primitives for sampling prior distributions and observing variables. We define combinators for measure transformers, based on theorems in measure theory, and use these to give a rigorous semantics to our core calculus. The original features of our semantics include its support for discrete, continuous, and hybrid measures, and, in particular, for observations of zero-probability events. We compile our core language to a small imperative language that in addition to the measure transformer semantics also has a straightforward semantics via factor graphs, data structures that enable many efficient inference algorithms. We use an existing inference engine for efficient approximate inference of posterior marginal distributions, treating thousands of observations per second for large instances of realistic models.
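For purely discrete measures, the measure-transformer combinators have a very concrete shadow, sketched below in Python (names invented; the paper's real contribution is handling continuous and hybrid measures and zero-probability observations, which this toy does not attempt):

```python
def dirac(x):
    """The point measure at x."""
    return [(x, 1.0)]

def bind(measure, k):
    """Sequence a measure with a kernel k: value -> measure."""
    return [(y, w * v) for x, w in measure for y, v in k(x)]

def observe(measure, predicate):
    """Condition: keep outcomes satisfying the predicate, renormalize."""
    kept = [(x, w) for x, w in measure if predicate(x)]
    total = sum(w for _, w in kept)
    return [(x, w / total) for x, w in kept]

coin = [(0, 0.5), (1, 0.5)]
two_coins = bind(coin, lambda a: bind(coin, lambda b: dirac((a, b))))
posterior = observe(two_coins, lambda ab: ab[0] + ab[1] >= 1)  # at least one head
print(posterior)   # each of the three surviving outcomes has weight 1/3
```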
Practical Probabilistic Programming with Monads
"... The machine learning community has recently shown a lot of inter-est in practical probabilistic programming systems that target the problem of Bayesian inference. Such systems come in different forms, but they all express probabilistic models as computational processes using syntax resembling progra ..."
The machine learning community has recently shown a lot of interest in practical probabilistic programming systems that target the problem of Bayesian inference. Such systems come in different forms, but they all express probabilistic models as computational processes using syntax resembling programming languages. In the functional programming community, monads are known to offer a convenient and elegant abstraction for programming with probability distributions, but their use is often limited to very simple inference problems. We show that it is possible to use the monad abstraction to construct probabilistic models for machine learning, while still offering good performance of inference in challenging models. We use a GADT as an underlying representation of a probability distribution and apply Sequential Monte Carlo-based methods to achieve efficient inference. We define a formal semantics via measure theory. We demonstrate a clean and elegant implementation that achieves performance comparable with Anglican, a state-of-the-art probabilistic programming system.
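The paper's implementation is a Haskell GADT; the Python sketch below (all names hypothetical) mimics only the monadic interface, with weighted thunks for return/bind, a soft-conditioning `score`, and the resampling step used in Sequential Monte Carlo:

```python
import random

def return_(x):
    return lambda: (x, 1.0)                     # (value, weight)

def bind(model, k):
    """Monadic sequencing: run model, feed its value to kernel k."""
    def run():
        x, w = model()
        y, v = k(x)()
        return y, w * v
    return run

def score(factor, x):
    return lambda: (x, factor)                  # soft observe

def resample(particles):
    """SMC resampling: draw particles proportionally to weight."""
    values = [x for x, _ in particles]
    weights = [w for _, w in particles]
    return [(x, 1.0) for x in random.choices(values, weights=weights,
                                              k=len(particles))]

# Model: bias ~ Uniform(0,1), then observe one head (weight = bias).
model = bind(lambda: (random.random(), 1.0),
             lambda bias: score(bias, bias))

particles = resample([model() for _ in range(100_000)])
print(sum(x for x, _ in particles) / len(particles))   # ~2/3, the posterior mean
```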