Results 1-10 of 148
Dynamic Bayesian Networks: Representation, Inference and Learning
, 2002
Abstract
Cited by 770 (3 self)
Modelling sequential data is important in many areas of science and engineering. Hidden Markov models (HMMs) and Kalman filter models (KFMs) are popular for this because they are simple and flexible. For example, HMMs have been used for speech recognition and biosequence analysis, and KFMs have been used for problems ranging from tracking planes and missiles to predicting the economy. However, HMMs
and KFMs are limited in their “expressive power”. Dynamic Bayesian Networks (DBNs) generalize HMMs by allowing the state space to be represented in factored form, instead of as a single discrete random variable. DBNs generalize KFMs by allowing arbitrary probability distributions, not just (unimodal) linear-Gaussian. In this thesis, I will discuss how to represent many different kinds of models as DBNs, how to perform exact and approximate inference in DBNs, and how to learn DBN models from sequential data.
In particular, the main novel technical contributions of this thesis are as follows: a way of representing Hierarchical HMMs as DBNs, which enables inference to be done in O(T) time instead of O(T^3), where T is the length of the sequence; an exact smoothing algorithm that takes O(log T) space instead of O(T); a simple way of using the junction tree algorithm for online inference in DBNs; new complexity bounds on exact online inference in DBNs; a new deterministic approximate inference algorithm called factored frontier; an analysis of the relationship between the BK algorithm and loopy belief propagation; a way of applying Rao-Blackwellised particle filtering to DBNs in general, and the SLAM (simultaneous localization and mapping) problem in particular; a way of extending the structural EM algorithm to DBNs; and a variety of different applications of DBNs. However, perhaps the main value of the thesis is its catholic presentation of the field of sequential data modelling.
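As a point of reference for the monolithic-state models this thesis generalizes, a minimal HMM forward pass can be sketched as follows (all matrix values are illustrative toy numbers, not taken from the thesis):

```python
# Toy HMM with 2 hidden states and 3 observation symbols (all values
# illustrative, not from the thesis).
A  = [[0.7, 0.3], [0.4, 0.6]]            # transition matrix P(z_t | z_{t-1})
B  = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]  # emission matrix P(x_t | z_t)
pi = [0.6, 0.4]                          # initial state distribution

def forward(obs):
    """Forward algorithm: returns P(x_1..x_T) in O(T * K^2) time."""
    alpha = [pi[k] * B[k][obs[0]] for k in range(2)]
    for x in obs[1:]:
        alpha = [sum(alpha[j] * A[j][k] for j in range(2)) * B[k][x]
                 for k in range(2)]
    return sum(alpha)

print(forward([0, 1, 2]))  # likelihood of a short observation sequence
```

A DBN replaces the single state variable z_t with a set of factored variables per time slice; the forward recursion above is the K-state special case it subsumes.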
Approximate Policy Iteration with a Policy Language Bias
 Journal of Artificial Intelligence Research
, 2003
Abstract
Cited by 140 (18 self)
We explore approximate policy iteration (API), replacing the usual cost-function learning step with a learning step in policy space. We give policy-language biases that enable solution of very large relational Markov decision processes (MDPs) that no previous technique can solve.
Generalizing plans to new environments in relational MDPs
 In International Joint Conference on Artificial Intelligence (IJCAI-03)
, 2003
Abstract
Cited by 113 (2 self)
A longstanding goal in planning research is the ability to generalize plans developed for some set of environments to a new but similar environment, with minimal or no replanning. Such generalization can both reduce planning time and allow us to tackle larger domains than the ones tractable for direct planning. In this paper, we present an approach to the generalization problem based on a new framework of relational Markov Decision Processes (RMDPs). An RMDP can model a set of similar environments by representing objects as instances of different classes. In order to generalize plans to multiple environments, we define an approximate value function specified in terms of classes of objects and, in a multiagent setting, by classes of agents. This class-based approximate value function is optimized relative to a sampled subset of environments, and computed using an efficient linear programming method. We prove that a polynomial number of sampled environments suffices to achieve performance close to the performance achievable when optimizing over the entire space. Our experimental results show that our method generalizes plans successfully to new, significantly larger, environments, with minimal loss of performance relative to environment-specific planning. We demonstrate our approach on a real strategic computer war game.
Exploiting Structure to Efficiently Solve Large Scale Partially Observable Markov Decision Processes
, 2005
Abstract
Cited by 91 (6 self)
Partially observable Markov decision processes (POMDPs) provide a natural and principled framework to model a wide range of sequential decision making problems under uncertainty. To date, the use of POMDPs in real-world problems has been limited by the poor scalability of existing solution algorithms, which can only solve problems with up to ten thousand states. In fact, the complexity of finding an optimal policy for a finite-horizon discrete POMDP is PSPACE-complete. In practice, two important sources of intractability plague most solution algorithms: large policy spaces and large state spaces. On the other hand, ...
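For orientation on why state-space size bites so hard, the belief-state update at the heart of POMDP solving can be sketched as follows (toy two-state numbers, not from the thesis; each update touches every state pair, and planning must reason over the whole belief simplex):

```python
# Toy POMDP fragment (illustrative values): 2 states, transition and
# observation models for one fixed action a.
T = [[0.9, 0.1], [0.2, 0.8]]   # T[s][s2] = P(s2 | s, a)
O = [[0.8, 0.2], [0.3, 0.7]]   # O[s2][o] = P(o | s2, a)

def belief_update(b, obs):
    """Bayes filter on the belief simplex:
    b'(s') is proportional to O[s'][obs] * sum_s T[s][s'] * b(s)."""
    pred = [sum(b[s] * T[s][s2] for s in range(2)) for s2 in range(2)]
    post = [pred[s2] * O[s2][obs] for s2 in range(2)]
    z = sum(post)                    # normalizer = P(obs | b, a)
    return [p / z for p in post]

print(belief_update([0.5, 0.5], 0))
```

The update is O(|S|^2) per step, and the policy must be defined over the continuous space of such belief vectors, which is what limits exact solvers to small state spaces.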
Probabilistic reasoning with answer sets
 In Proceedings of LPNMR-7
, 2004
Abstract
Cited by 91 (11 self)
We give a logic programming based account of probability and describe a declarative language P-log capable of reasoning which combines both logical and probabilistic arguments. Several non-trivial examples illustrate the use of P-log for knowledge representation.
Learning symbolic models of stochastic domains
 Journal of Artificial Intelligence Research
Abstract
Cited by 86 (3 self)
In this article, we work towards the goal of developing agents that can learn to act in complex worlds. We develop a probabilistic, relational planning rule representation that compactly models noisy, nondeterministic action effects, and show how such rules can be effectively learned. Through experiments in simple planning domains and a 3D simulated blocks world with realistic physics, we demonstrate that this learning algorithm allows agents to effectively model world dynamics.
Learning partially observable deterministic action models
 In Proc. Nineteenth International Joint Conference on Artificial Intelligence (IJCAI ’05)
, 2005
Abstract
Cited by 55 (2 self)
We present exact algorithms for identifying deterministic actions’ effects and preconditions in dynamic partially observable domains. They apply when one does not know the action model (the way actions affect the world) of a domain and must learn it from partial observations over time. Such scenarios are common in real-world applications. They are challenging for AI tasks because traditional domain structures that underlie tractability (e.g., conditional independence) fail there (e.g., world features become correlated). Our work departs from traditional assumptions about partial observations and action models. In particular, it focuses on problems in which actions are deterministic and of simple logical structure, and observation models have all features observed with some frequency. We yield tractable algorithms for the modified problem for such domains. Our algorithms take sequences of partial observations over time as input, and output deterministic action models that could have led to those observations. The algorithms output all or one of those models (depending on our choice), and are exact in that no model is misclassified given the observations. Our algorithms take polynomial time in the number of time steps and state features for some traditional action classes examined in the AI-planning literature, e.g., STRIPS actions. In contrast, traditional approaches for HMMs and Reinforcement Learning are inexact and exponentially intractable for such domains. Our experiments verify the theoretical tractability guarantees, and show that we identify action models exactly. Several applications in planning, autonomous exploration, and adventure-game playing already use these results. They are also promising for probabilistic settings, partially observable reinforcement learning, and diagnosis.
SMDP Homomorphisms: An Algebraic Approach to Abstraction in Semi-Markov Decision Processes
, 2003
Abstract
Cited by 52 (9 self)
To operate effectively in complex environments, learning agents require the ability to selectively ignore irrelevant details and form useful abstractions.
Dynamic Probabilistic Relational Models
, 2003
Abstract
Cited by 51 (5 self)
Intelligent agents must function in an uncertain world, containing multiple objects and relations that change over time. Unfortunately, no representation is currently available that can handle all these issues, while allowing for principled and efficient inference. This paper addresses this need by introducing dynamic probabilistic relational models (DPRMs). DPRMs are an extension of dynamic Bayesian networks (DBNs) where each time slice (and its dependences on previous slices) is represented by a probabilistic relational model (PRM). Particle filtering, the standard method for inference in DBNs, has severe limitations when applied to DPRMs, but we are able to greatly improve its performance through a form of relational Rao-Blackwellisation. Further gains in efficiency are obtained through the use of abstraction trees, a novel data structure. We successfully apply DPRMs to execution monitoring and fault diagnosis of an assembly plan, in which a complex product is gradually constructed from subparts.
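For context on the inference method this abstract builds on, a plain bootstrap particle filter can be sketched for a toy 1-D Gaussian random-walk model (all numbers illustrative; the paper's relational Rao-Blackwellised variant is far more structured than this baseline):

```python
import math
import random

random.seed(0)

def particle_filter(observations, n=2000):
    """Bootstrap particle filter for a toy 1-D model
    (x_t = x_{t-1} + noise, y_t = x_t + noise), both noises Gaussian.
    Illustrative only; not the relational variant the paper develops."""
    particles = [random.gauss(0.0, 1.0) for _ in range(n)]  # sample prior
    for y in observations:
        # Propagate each particle through the transition model.
        particles = [x + random.gauss(0.0, 0.5) for x in particles]
        # Weight by the observation likelihood (unnormalized Gaussian).
        weights = [math.exp(-0.5 * ((y - x) / 0.5) ** 2) for x in particles]
        # Resample in proportion to weight to fight degeneracy.
        particles = random.choices(particles, weights=weights, k=n)
    return sum(particles) / n      # posterior-mean estimate of x_T

print(particle_filter([0.1, 0.2, 0.3]))
```

The limitation the paper attacks is visible here: each particle must carry a full world state, which blows up when states contain many objects and relations; Rao-Blackwellisation marginalizes part of that state analytically instead of sampling it.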
Inductive policy selection for first-order MDPs
 In Proceedings of the Eighteenth Conference on Uncertainty in Artificial Intelligence.
, 2002
Abstract
Cited by 49 (17 self)
We select policies for large Markov Decision Processes (MDPs) with compact first-order representations. We find policies that generalize well as the number of objects in the domain grows, potentially without bound. Existing dynamic-programming approaches based on flat, propositional, or first-order representations either are impractical here or do not naturally scale as the number of objects grows without bound. We implement and evaluate an alternative approach that induces first-order policies using training data constructed by solving small problem instances using PGraphplan ...