Results 1-10 of 81
Point-based value iteration: An anytime algorithm for POMDPs
2003
Abstract

Cited by 346 (25 self)
This paper introduces the Point-Based Value Iteration (PBVI) algorithm for POMDP planning. PBVI approximates an exact value iteration solution by selecting a small set of representative belief points, and planning for those only. By using stochastic trajectories to choose belief points, and by maintaining only one value hyperplane per point, it is able to successfully solve large problems, including the robotic Tag domain, a POMDP version of the popular game of laser tag.
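The idea of keeping one value hyperplane per belief point can be illustrated with a minimal sketch of a point-based backup for a small discrete POMDP. This is not the authors' implementation: the array layout (`T[a][s][s2]` for transitions, `O[a][s2][o]` for observations, `R[a][s]` for rewards) and the function names are assumptions chosen for illustration only.

```python
from math import inf

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))

def point_backup(b, alphas, T, O, R, gamma):
    """Return the single best hyperplane (alpha vector) at belief b.

    Assumed layout (hypothetical): T[a][s][s2], O[a][s2][o], R[a][s].
    """
    n = len(b)
    best_vec, best_val = None, -inf
    for a in range(len(T)):
        vec = list(R[a])
        for o in range(len(O[a][0])):
            # Project every current alpha vector back through (a, o).
            projected = [
                [sum(T[a][s][s2] * O[a][s2][o] * alpha[s2] for s2 in range(n))
                 for s in range(n)]
                for alpha in alphas
            ]
            # Keep only the projection that scores highest at this belief.
            g = max(projected, key=lambda p: dot(p, b))
            vec = [v + gamma * gi for v, gi in zip(vec, g)]
        val = dot(vec, b)
        if val > best_val:
            best_vec, best_val = vec, val
    return best_vec

def pbvi_sweep(beliefs, alphas, T, O, R, gamma):
    """One sweep: exactly one new hyperplane per belief point."""
    return [point_backup(b, alphas, T, O, R, gamma) for b in beliefs]
```

Because each sweep returns one vector per belief point, the size of the value function representation stays bounded by the size of the belief set, whatever the number of sweeps.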
Partially observable Markov decision processes with continuous observations for dialogue management
Computer Speech and Language, 2005
Abstract

Cited by 217 (52 self)
This work shows how a dialogue model can be represented as a Partially Observable Markov Decision Process (POMDP) with observations composed of a discrete and continuous component. The continuous component enables the model to directly incorporate a confidence score for automated planning. Using a testbed simulated dialogue management problem, we show how recent optimization techniques are able to find a policy for this continuous POMDP which outperforms a traditional MDP approach. Further, we present a method for automatically improving handcrafted dialogue managers by incorporating POMDP belief state monitoring, including confidence score information. Experiments on the testbed system show significant improvements for several example handcrafted dialogue managers across a range of operating conditions.
Perseus: Randomized point-based value iteration for POMDPs
Journal of Artificial Intelligence Research, 2005
Abstract

Cited by 204 (17 self)
Partially observable Markov decision processes (POMDPs) form an attractive and principled framework for agent planning under uncertainty. Point-based approximate techniques for POMDPs compute a policy based on a finite set of points collected in advance from the agent’s belief space. We present a randomized point-based value iteration algorithm called Perseus. The algorithm performs approximate value backup stages, ensuring that in each backup stage the value of each point in the belief set is improved; the key observation is that a single backup may improve the value of many belief points. Contrary to other point-based methods, Perseus backs up only a (randomly selected) subset of points in the belief set, sufficient for improving the value of each belief point in the set. We show how the same idea can be extended to dealing with continuous action spaces. Experimental results show the potential of Perseus in large-scale POMDP problems.
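The randomized backup stage described above can be sketched roughly as follows. This is a simplified illustration rather than the published implementation: `backup` is a hypothetical callable that returns a new alpha vector for a belief, and all other names are likewise assumptions.

```python
import random

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))

def value(b, alphas):
    """Value of belief b under a set of alpha vectors."""
    return max(dot(a, b) for a in alphas)

def perseus_stage(beliefs, alphas, backup, rng=random):
    """One randomized backup stage over a fixed belief set.

    backup(b, alphas) is an assumed callable returning a new alpha vector.
    """
    old = [value(b, alphas) for b in beliefs]
    todo = list(range(len(beliefs)))  # points whose value is not yet improved
    new_alphas = []
    while todo:
        i = rng.choice(todo)          # back up a randomly chosen point
        b = beliefs[i]
        alpha = backup(b, alphas)
        if dot(alpha, b) < old[i]:
            # The backup did not help here: reuse the old best vector.
            alpha = max(alphas, key=lambda v: dot(v, b))
        new_alphas.append(alpha)
        # Drop every point that the new vector set already covers.
        todo = [j for j in todo if value(beliefs[j], new_alphas) < old[j]]
    return new_alphas
```

When a single new vector improves many points at once, the loop ends after far fewer backups than there are belief points, which is the source of the savings the abstract mentions.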
Point-based POMDP algorithms: Improved analysis and implementation
In Proceedings of Uncertainty in Artificial Intelligence
Abstract

Cited by 157 (3 self)
Existing complexity bounds for point-based POMDP value iteration algorithms focus either on the curse of dimensionality or the curse of history. We derive a new bound that relies on both and uses the concept of discounted reachability; our conclusions may help guide future algorithm design. We also discuss recent improvements to our (point-based) heuristic search value iteration algorithm. Our new implementation calculates tighter initial bounds, avoids solving linear programs, and makes more effective use of sparsity. Empirical results show speedups of more than two orders of magnitude.
Perspectives on standardization in mobile robot programming: The Carnegie Mellon Navigation (CARMEN) Toolkit
In Proc. of the IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS), 2003
Abstract

Cited by 124 (5 self)
In this paper we describe our open-source ...
Anytime point-based approximations for large POMDPs
Journal of Artificial Intelligence Research, 2006
Abstract

Cited by 104 (7 self)
The Partially Observable Markov Decision Process has long been recognized as a rich framework for real-world planning and control problems, especially in robotics. However, exact solutions in this framework are typically computationally intractable for all but the smallest problems. A well-known technique for speeding up POMDP solving involves performing value backups at specific belief points, rather than over the entire belief simplex. The efficiency of this approach, however, depends greatly on the selection of points. This paper presents a set of novel techniques for selecting informative belief points which work well in practice. The point selection procedure is combined with point-based value backups to form an effective anytime POMDP algorithm called Point-Based Value Iteration (PBVI). The first aim of this paper is to introduce this algorithm and present a theoretical analysis justifying the choice of belief selection technique. The second aim of this paper is to provide a thorough empirical comparison between PBVI and other state-of-the-art POMDP methods, in particular the Perseus algorithm, in an effort to highlight their similarities and differences. Evaluation is performed using both standard POMDP domains and realistic robotic tasks.
Visibility-based pursuit-evasion with limited field of view
The International Journal of Robotics Research, 2006
Abstract

Cited by 93 (1 self)
We study a form of the pursuit-evasion problem, in which one or more searchers must move through a given environment so as to guarantee detection of any and all evaders, which can move arbitrarily fast. Our goal is to develop techniques for coordinating teams of robots to execute this task in application domains such as clearing a building, for reasons of security or safety. To this end, we introduce a new class of searcher, the φ-searcher, which can be readily instantiated as a physical mobile robot. We present a detailed analysis of the pursuit-evasion problem using φ-searchers. We show that computing the minimum number of φ-searchers required to search a given environment is NP-hard, and present the first complete search algorithm for a single φ-searcher. We show how this algorithm can be extended to handle multiple searchers, and give examples of computed trajectories.
Exploiting Structure to Efficiently Solve Large Scale Partially Observable Markov Decision Processes
2005
Abstract

Cited by 91 (6 self)
Partially observable Markov decision processes (POMDPs) provide a natural and principled framework to model a wide range of sequential decision making problems under uncertainty. To date, the use of POMDPs in real-world problems has been limited by the poor scalability of existing solution algorithms, which can only solve problems with up to ten thousand states. In fact, the complexity of finding an optimal policy for a finite-horizon discrete POMDP is PSPACE-complete. In practice, two important sources of intractability plague most solution algorithms: large policy spaces and large state spaces. On the other hand,
Point-Based Value Iteration for Continuous POMDPs
Journal of Machine Learning Research, 2006
Abstract

Cited by 74 (4 self)
We propose a novel approach to optimize Partially Observable Markov Decision Processes (POMDPs) defined on continuous spaces. To date, most algorithms for model-based POMDPs are restricted to discrete states, actions, and observations, but many real-world problems, such as robot navigation, are naturally defined on continuous spaces. In this work, we demonstrate that the value function for continuous POMDPs is convex in the beliefs over continuous state spaces, and piecewise-linear convex for the particular case of discrete observations and actions but still continuous states. We also demonstrate that continuous Bellman backups are contracting and isotonic, ensuring the monotonic convergence of value-iteration algorithms. Relying on those properties, we extend the PERSEUS algorithm, originally developed for discrete POMDPs, to work in continuous state spaces by representing the observation, transition, and reward models using Gaussian mixtures, and the beliefs using Gaussian mixtures or particle sets. With these representations, the integrals that appear in the Bellman backup can be computed in closed form and, therefore, the algorithm is computationally feasible. Finally, we further extend PERSEUS to deal with continuous action and observation sets by designing effective sampling approaches.
Global A-optimal robot exploration in SLAM
In ICRA, 2005
Abstract

Cited by 58 (4 self)
It is well-known that the Kalman filter for simultaneous localization and mapping (SLAM) converges to a fully correlated map in the limit of infinite time and data [1]. However, the rate of convergence of the map has a strong dependence on the order of the observations. We show that conventional exploration algorithms for collecting map data are suboptimal in both the objective function and choice of optimization procedure. We show that optimizing the a-optimal information measure results in a more accurate map than existing approaches, using a greedy, closed-loop strategy. Secondly, we demonstrate that by restricting the planning to an appropriate policy class, we can tractably find non-greedy, global planning trajectories that produce more accurate maps, explicitly planning to close loops even in open-loop scenarios.
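As a rough illustration of the greedy strategy mentioned above, the a-optimality criterion from experimental design scores an estimate by the trace of its covariance matrix, with smaller values indicating less total uncertainty. The sketch below assumes that each candidate action comes with a predicted posterior covariance; the function names and the dictionary layout are hypothetical and not taken from the paper.

```python
def a_optimality(cov):
    """Trace of a square covariance matrix given as a list of rows."""
    return sum(cov[i][i] for i in range(len(cov)))

def greedy_a_optimal_action(predicted_covs):
    """Pick the action whose predicted covariance has the smallest trace.

    predicted_covs maps an action label to its predicted covariance
    (an assumed input; computing it would require a SLAM filter).
    """
    return min(predicted_covs, key=lambda a: a_optimality(predicted_covs[a]))

# Hypothetical usage with two candidate actions.
candidates = {
    "left": [[2.0, 0.0], [0.0, 2.0]],
    "right": [[1.0, 0.5], [0.5, 1.0]],
}
best = greedy_a_optimal_action(candidates)
```

A one-step greedy choice like this one looks only at the immediate covariance, which is what the non-greedy, global planning in the abstract is meant to improve on.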