Online planning algorithms for POMDPs
Journal of Artificial Intelligence Research, 2008
"... Partially Observable Markov Decision Processes (POMDPs) provide a rich framework for sequential decision-making under uncertainty in stochastic domains. However, solving a POMDP is often intractable except for small problems due to their complexity. Here, we focus on online approaches that alleviate ..."
Cited by 109 (3 self)
Partially Observable Markov Decision Processes (POMDPs) provide a rich framework for sequential decision-making under uncertainty in stochastic domains. However, solving a POMDP exactly is often intractable except for small problems. Here, we focus on online approaches that alleviate the computational complexity by computing good local policies at each decision step during execution. Online algorithms generally consist of a lookahead search to find the best action to execute at each time step in an environment. Our objectives here are to survey the various existing online POMDP methods, analyze their properties, and discuss their advantages and disadvantages; and to thoroughly evaluate these online approaches in different environments under various metrics (return, error bound reduction, lower bound improvement). Our experimental results indicate that state-of-the-art online heuristic search methods can handle large POMDP domains efficiently.
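The core object this survey covers, a depth-limited lookahead from the current belief, can be sketched as below. This is a generic illustration of the family of methods surveyed, not any single surveyed algorithm; the `model` interface (states, actions, observations, `T`, `Z`, `obs_prob`, `expected_reward`, `heuristic_value`) is an assumed placeholder for a concrete POMDP.

```python
# A minimal sketch of online lookahead search in belief space, assuming a
# hypothetical `model` object that exposes the POMDP's components.

def belief_update(model, belief, action, obs):
    """Bayes filter: b'(s') is proportional to Z(o|s',a) * sum_s T(s'|s,a) b(s)."""
    new_b = {}
    for s2 in model.states:
        p = model.Z(obs, s2, action) * sum(
            model.T(s2, s, action) * pb for s, pb in belief.items())
        if p > 0.0:
            new_b[s2] = p
    norm = sum(new_b.values())
    return {s: p / norm for s, p in new_b.items()}

def lookahead(model, belief, depth, gamma=0.95):
    """Return (value, action) from a depth-limited expectimax over beliefs."""
    if depth == 0:
        return model.heuristic_value(belief), None  # e.g. an offline bound
    best_v, best_a = float("-inf"), None
    for a in model.actions:
        v = model.expected_reward(belief, a)
        for o in model.observations:
            p_o = model.obs_prob(belief, a, o)
            if p_o > 0.0:
                b2 = belief_update(model, belief, a, o)
                v += gamma * p_o * lookahead(model, b2, depth - 1, gamma)[0]
        if v > best_v:
            best_v, best_a = v, a
    return best_v, best_a
```

At each real time step one would call `lookahead(model, current_belief, depth)`, execute the returned action, update the belief with the received observation, and re-plan.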
AEMS: an anytime online search algorithm for approximate policy refinement in large POMDPs
In: Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI), 2007
"... Solving large Partially Observable Markov Decision Processes (POMDPs) is a complex task which is often intractable. A lot of effort has been made to develop approximate offline algorithms to solve ever larger POMDPs. However, even stateof-the-art approaches fail to solve large POMDPs in reasonable t ..."
Cited by 27 (9 self)
Solving large Partially Observable Markov Decision Processes (POMDPs) is a complex task which is often intractable. A lot of effort has been made to develop approximate offline algorithms to solve ever larger POMDPs. However, even state-of-the-art approaches fail to solve large POMDPs in reasonable time. Recent developments in online POMDP search suggest that combining offline computations with online computations is often more efficient and can also considerably reduce the error made by approximate policies computed offline. In the same vein, we propose a new anytime online search algorithm which seeks to minimize, as efficiently as possible, the error made by an approximate value function computed offline. In addition, we show how previous online computations can be reused in subsequent time steps in order to prevent redundant computations. Our preliminary results indicate that our approach is able to tackle large state and observation spaces efficiently and under real-time constraints.
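The anytime, error-minimizing flavor of this search can be sketched roughly as follows. The node fields (`upper`, `lower`, `depth`, `path_prob`) and the `expand`, `backup_bounds`, and `best_action` hooks are assumptions standing in for the paper's actual data structures; this is a sketch of the general idea, not AEMS itself.

```python
import heapq
import time

def anytime_search(root, expand, time_budget, gamma=0.95):
    """Repeatedly expand the fringe node contributing most to the root's error bound."""
    # heapq pops the smallest item, so store the negated contribution;
    # the integer counter breaks ties between nodes with equal priority.
    fringe = [(-(root.upper - root.lower), 0, root)]
    tie = 1
    deadline = time.monotonic() + time_budget
    while fringe and time.monotonic() < deadline:
        _, _, node = heapq.heappop(fringe)
        for child in expand(node):      # one belief node per (action, obs) pair
            # Discounting and path probability shrink the impact a deep,
            # unlikely node can have on the error bound at the root.
            contrib = (gamma ** child.depth) * child.path_prob \
                      * (child.upper - child.lower)
            heapq.heappush(fringe, (-contrib, tie, child))
            tie += 1
        node.backup_bounds()            # propagate tighter bounds toward the root
    return root.best_action()           # e.g. the action maximizing the lower bound
```

Because the loop only ever tightens bounds, it can be interrupted at the deadline and still return the best action found so far, which is what makes the search anytime.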
Efficient Planning under Uncertainty with Macro-actions
2014
"... Terms of Use Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. Detailed Terms The MIT Faculty has made this article openly available. Please share how this access benefits you. ..."
Cited by 19 (1 self)
Interactive Learning of Independent Experts' Criteria for Rescue Simulations
"... Abstract: Efficient response to natural disasters has an increasingly important role in limiting the toll on human life and property. The work we have undertaken seeks to improve existing models by building a Decision Support System (DSS) of resource allocation and planning for natural disaster emer ..."
Cited by 3 (1 self)
Efficient response to natural disasters has an increasingly important role in limiting the toll on human life and property. The work we have undertaken seeks to improve existing models by building a Decision Support System (DSS) for resource allocation and planning in natural disaster emergencies in urban areas. A multi-agent environment is used to simulate disaster response activities, taking into account geospatial, temporal and rescue organizational information. The problem we address is the acquisition of situated expert knowledge that is used to organize rescue missions. We propose an approach based on participatory design and interactive learning which incrementally elicits experts' preferences by online analysis of their interventions in rescue simulations. Additive utility functions, which assume mutual preferential independence between decision criteria, are used as the preference model for the elicitation process. The proposed learning algorithm refines the coefficients of the utility function by solving incremental linear programs. To test our algorithm, we run rescue scenarios of ambulances saving victims. This experiment makes use of geographical data for the Ba-Dinh district of Hanoi and damage parameters from well-regarded local statistical and geographical resources. The preliminary results show that our approach is initially confident in solving this …
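The incremental linear-programming idea can be illustrated with a small sketch: each expert intervention "option a was preferred to option b" adds one linear constraint on the weights w of an additive utility u(x) = w · x, and we then look for weights satisfying all constraints with the largest common margin. This is a generic reconstruction of the technique, not the paper's implementation, and the criterion values below are illustrative, not from the Hanoi experiment.

```python
import numpy as np
from scipy.optimize import linprog

def fit_weights(preferences, n_criteria):
    """preferences: list of (x_preferred, x_rejected) criterion-value vectors."""
    # Variables: [w_1 .. w_n, m]; maximize the margin m, i.e. minimize -m.
    c = np.zeros(n_criteria + 1)
    c[-1] = -1.0
    # Each preference requires w . x_pref >= w . x_rej + m, rewritten as
    # w . (x_rej - x_pref) + m <= 0 for linprog's A_ub x <= b_ub form.
    A_ub = [np.append(np.asarray(xr) - np.asarray(xp), 1.0)
            for xp, xr in preferences]
    b_ub = np.zeros(len(preferences))
    # Normalization: weights sum to 1, each weight in [0, 1], margin free.
    A_eq = [np.append(np.ones(n_criteria), 0.0)]
    b_eq = [1.0]
    bounds = [(0.0, 1.0)] * n_criteria + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[:-1] if res.success else None

# Illustrative example with two criteria (e.g. victim severity, negated travel
# time): the expert twice preferred the first option of a pair.
w = fit_weights([([0.9, 0.2], [0.4, 0.6]), ([0.7, 0.8], [0.8, 0.3])], 2)
```

Each new intervention appends one row to `A_ub`, so the weight estimate can be refined online as the expert interacts with the simulation.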
Improving the efficiency of online POMDPs by using belief similarity measures
In: Proc. International Conference on Robotics and Automation (ICRA), 2013
"... Abstract — In this paper, we introduce an approach called FSBS (Forward Search in Belief Space) for online planning in POMDPs. The approach is based on the RTBSS (Real-Time Belief Space Search) algorithm of [1]. The main departure from the algorithm is the introduction of similarity measures in the ..."
Cited by 1 (1 self)
In this paper, we introduce an approach called FSBS (Forward Search in Belief Space) for online planning in POMDPs. The approach is based on the RTBSS (Real-Time Belief Space Search) algorithm of [1]. The main departure from that algorithm is the introduction of similarity measures in the belief space. By considering statistical divergence measures, the similarity between belief points in the forward search tree can be computed. Therefore, it is possible to determine whether a certain belief point (or one very similar) has already been visited. This makes it possible to reduce the complexity of the search by not expanding nodes similar to ones already visited at the same depth. This reduction in complexity makes the real-time implementation of more complex problems on robots possible. The paper describes the algorithm and analyzes different divergence measures. Benchmark problems are used to show how the approach can obtain a ten-fold reduction in computation time for similar rewards when compared to the original RTBSS. The paper also presents experiments with a quadrotor in a search application.
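The pruning idea can be sketched as a similarity check before node expansion. KL divergence is used here as one of the several statistical divergences the paper says it analyzes; the threshold value is an assumed tuning parameter, not taken from the paper.

```python
import math

def kl_divergence(b1, b2, eps=1e-12):
    """KL(b1 || b2) between two discrete beliefs stored as state -> prob dicts."""
    states = set(b1) | set(b2)
    return sum(b1.get(s, 0.0) * math.log((b1.get(s, 0.0) + eps) /
                                         (b2.get(s, 0.0) + eps))
               for s in states if b1.get(s, 0.0) > 0.0)

def should_expand(belief, depth, visited, threshold=0.01):
    """visited: dict mapping tree depth -> beliefs already expanded there."""
    for seen in visited.get(depth, []):
        if kl_divergence(belief, seen) < threshold:
            return False            # close enough to a visited belief: prune
    visited.setdefault(depth, []).append(belief)
    return True
```

Since the check only compares beliefs at the same depth, the lists stay short and the test adds little overhead relative to the cost of expanding a redundant subtree.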
Efficient Planning under Uncertainty with Macro-actions
Journal of Artificial Intelligence Research 40 (2011) 523-570; submitted 9/10, published 2/11
"... Deciding how to act in partially observable environments remains an active area of research. Identifyinggoodsequencesofdecisionsisparticularlychallengingwhengoodcontrolperformance requires planning multiple steps into the future in domains with many states. Towards addressing this challenge, we pres ..."
Deciding how to act in partially observable environments remains an active area of research. Identifying good sequences of decisions is particularly challenging when good control performance requires planning multiple steps into the future in domains with many states. Towards addressing this challenge, we present an online, forward-search algorithm called the Posterior Belief Distribution (PBD). PBD leverages a novel method for calculating the posterior distribution over beliefs that result after a sequence of actions is taken, given the set of observation sequences that could be received during this process. This method allows us to efficiently evaluate the expected reward of a sequence of primitive actions, which we refer to as macro-actions. We present a formal analysis of our approach, and examine its performance on two very large simulation experiments: scientific exploration and a target monitoring domain. We also demonstrate our algorithm being used to control a real robotic helicopter in a target monitoring experiment, which suggests that our approach has practical potential for planning in real-world, large partially observable domains where a multi-step lookahead is required to achieve good performance.
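The quantity PBD needs, the expected discounted reward of a macro-action (a fixed primitive-action sequence) marginalized over the observation sequences that could arrive while it runs, can be sketched as below. It reuses the hypothetical `model` and `belief_update` from the first sketch above; note that a real PBD implementation summarizes the resulting distribution over beliefs analytically rather than enumerating observation sequences as done here.

```python
def macro_action_value(model, belief, actions, gamma=0.95, t=0):
    """Expected discounted reward of executing the fixed sequence `actions`."""
    if not actions:
        return 0.0
    a, rest = actions[0], actions[1:]
    value = (gamma ** t) * model.expected_reward(belief, a)
    for o in model.observations:        # branch on each possible observation
        p_o = model.obs_prob(belief, a, o)
        if p_o > 0.0:
            b2 = belief_update(model, belief, a, o)
            value += p_o * macro_action_value(model, b2, rest, gamma, t + 1)
    return value
```

The enumeration grows with the number of observations raised to the macro-action length, which is exactly the blow-up PBD's posterior-over-beliefs computation is designed to avoid.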
Online Policy Improvement in Large POMDPs via an Error Minimization Search
"... Partially Observable Markov Decision Processes (POMDPs) provide a rich mathematical framework for planning under uncertainty. However, most real world systems are modelled by huge POMDPs that cannot be solved due to their high complexity. To palliate to this difficulty, we propose combining existing ..."
Partially Observable Markov Decision Processes (POMDPs) provide a rich mathematical framework for planning under uncertainty. However, most real-world systems are modelled by huge POMDPs that cannot be solved due to their high complexity. To mitigate this difficulty, we propose combining existing offline approaches with an online search process, called AEMS, that can locally improve an approximate policy computed offline, by reducing its error and providing better performance guarantees. We propose different heuristics to guide this search process, and provide theoretical guarantees on convergence to ε-optimal solutions. Our experimental results show that our approach can provide better solution quality within a smaller overall time than state-of-the-art algorithms, and allows for an interesting tradeoff between online and offline computation.
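The "heuristics to guide this search process" can be made concrete with the kind of error-contribution term used in the AEMS entry above; the following is a hedged reconstruction of that idea, not necessarily one of the exact heuristics studied in this paper:

```latex
% Expand the fringe belief b maximizing its discounted, probability-weighted
% bound gap, where d(b) is its depth in the search tree, h_b the
% action-observation history reaching it from the root belief b_0, and
% U, L the offline upper and lower bounds on the value function.
\tilde{e}(b) = \gamma^{d(b)} \, P(h_b \mid b_0) \, \bigl( U(b) - L(b) \bigr)
```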