Explanation-Based Learning and Reinforcement Learning: A Unified View
Machine Learning, 1995
Cited by 42 (2 self)
In speedup-learning problems, where full descriptions of operators are always known, both explanation-based learning (EBL) and reinforcement learning (RL) can be applied. This paper shows that both methods involve fundamentally the same process of propagating information backward from the goal toward the starting state. RL performs this propagation on a state-by-state basis, while EBL computes the weakest preconditions of operators, and hence, performs this propagation on a region-by-region basis. Based on the observation that RL is a form of asynchronous dynamic programming, this paper shows how to develop a dynamic programming version of EBL, which we call Explanation-Based Reinforcement Learning (EBRL). The paper compares batch and online versions of EBRL to batch and online versions of RL and to standard EBL. The results show that EBRL combines the strengths of EBL (fast learning and the ability to scale to large state spaces) with the strengths of RL (learning of optimal policies)...
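The state-by-state backward propagation that the abstract attributes to RL can be illustrated with a minimal sketch. This is not the paper's EBRL algorithm; the five-state line world, the unit step cost, and all names below are assumptions chosen only to show cost-to-go values spreading backward from the goal under in-place (asynchronous-style) dynamic programming updates.

```python
# Hypothetical toy problem: states 0..4 on a line, goal is state 4.
# Every move costs 1; value[s] estimates the cost-to-go from state s.
GOAL = 4
N_STATES = 5


def neighbors(s):
    """States reachable from s in one step (left or right along the line)."""
    return [t for t in (s - 1, s + 1) if 0 <= t < N_STATES]


def backward_sweeps(max_sweeps=100):
    """Repeat in-place Bellman backups until no state's value changes."""
    value = [float("inf")] * N_STATES
    value[GOAL] = 0.0
    for _ in range(max_sweeps):
        changed = False
        for s in range(N_STATES):
            if s == GOAL:
                continue
            # One-step lookahead: step cost plus the best successor value.
            best = 1.0 + min(value[t] for t in neighbors(s))
            if best < value[s]:
                value[s] = best  # updated in place, one state at a time
                changed = True
        if not changed:
            break
    return value


values = backward_sweeps()
print(values)
```

Each sweep moves the goal information one state further back, which is the per-state behavior the abstract contrasts with EBL's region-by-region propagation through weakest preconditions.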
Threat-removal strategies for partial-order planning
In Proceedings of the Eleventh National Conference on Artificial Intelligence, 1993
Cited by 41 (4 self)
McAllester and Rosenblitt's (1991) systematic nonlinear planner (SNLP) removes threats as they are discovered. In other planners, such as SIPE (Wilkins, 1988) and NOAH (Sacerdoti, 1977), threat resolution is partially or completely delayed. In this paper, we demonstrate that planner efficiency may be vastly improved by the use of alternatives to these threat removal strategies. We discuss five threat removal strategies and prove that two of these strategies dominate the other three, resulting in a provably smaller search space. Furthermore, the systematicity of the planning algorithm is preserved for each of the threat removal strategies. Finally, we confirm our results experimentally using a large number of planning examples, including examples from the literature.
Continuous Case-Based Reasoning
1996
Cited by 40 (5 self)
Case-based reasoning systems have traditionally been used to perform high-level reasoning in problem domains that can be adequately described using discrete, symbolic representations. However, many real-world problem domains, such as autonomous robotic navigation, are better characterized using continuous representations. Such problem domains also require continuous performance, such as online sensorimotor interaction with the environment, and continuous adaptation and learning during the performance task. This article introduces a new method for continuous case-based reasoning, and discusses its application to the dynamic selection, modification, and acquisition of robot behaviors in an autonomous navigation system, SINS (Self-Improving Navigation System). The computer program and the underlying method are systematically evaluated through statistical analysis of results from several empirical studies. The article concludes with a general discussion of case-based reasoning issues addr...
Learning Approximate Control Rules Of High Utility
In Proceedings of the Seventh International Conference on Machine Learning, 1990
Cited by 40 (1 self)
One of the difficult problems in the area of explanation based learning is the utility problem; learning too many rules of low utility can lead to swamping, or degradation of performance. This paper introduces two new techniques for improving the utility of learned rules. The first technique is to combine EBL with inductive learning techniques to learn a better set of control rules; the second technique is to use these inductive techniques to learn approximate control rules. The two techniques are synthesized in an algorithm called approximating abductive explanation based learning (AxA-EBL). AxA-EBL is shown to improve substantially over standard EBL in several domains. The utility of a rule is its contribution to performance improvement; the utility is directly proportional to the coverage of a rule and inversely proportional to the match cost of a rule, where coverage is de...
Acquiring Recursive and Iterative Concepts with Explanation-Based Learning
Machine Learning, 1989
Cited by 39 (1 self)
In explanation-based learning, a specific problem's solution is generalized into a form that can be later used to solve conceptually similar problems. Most research in explanation-based learning involves relaxing constraints on the variables in the explanation of a specific example, rather than generalizing the graphical structure of the explanation itself. However, this precludes the acquisition of concepts where an iterative or recursive process is implicitly represented in the explanation by a fixed number of applications. This paper presents an algorithm that generalizes explanation structures and reports empirical results that demonstrate the value of acquiring recursive and iterative concepts. The BAGGER2 algorithm learns recursive and iterative concepts, integrates results from multiple examples, and extracts useful subconcepts during generalization. On problems where learning a recursive rule is not appropriate, the system produces the same result as standard explanation-based ...
The Use of Explicit Goals for Knowledge to Guide Inference and Learning
Applied Intelligence, 1992
Cited by 36 (21 self)
Combinatorial explosion of inferences has always been a central problem in artificial intelligence. Although the number of inferences that can be drawn from a reasoner's knowledge and from available inputs is very large (potentially infinite), the inferential resources available to any reasoning system are limited. With limited inferential capacity and very many potential inferences, reasoners must somehow control the process of inference. Not all inferences are equally useful to a given reasoning system. Any reasoning system that has goals (or any form of a utility function) and acts based on its beliefs indirectly assigns utility to its beliefs. Given limits on the process of inference, and variation in the utility of inferences, it is clear that a reasoner ought to draw the inferences that will be most valuable to it. This paper presents an approach to this problem that makes the utility of a (potential) belief an explicit part of the inference process. The method is to generate exp...
Learning Explanation-Based Search Control Rules for Partial Order Planning
1994
Cited by 32 (1 self)
This paper presents the first implementation of explanation-based learning techniques for a partial order planner. We describe the basic learning framework of SNLP+EBL, including regression, explanation propagation, and rule generation. We then concentrate on SNLP+EBL's ability to learn from failures and present a novel approach that uses stronger domain- and planner-specific consistency checks to detect, explain, and learn from the failures of plans at depth limits. We end with an empirical evaluation of the efficacy of this approach in improving planning performance.
Nonlinear Planning with Parallel Resource Allocation
In Proceedings of the DARPA Workshop on Innovative Approaches to Planning, Scheduling, and Control, 1990
Cited by 32 (5 self)
Most nonlinear problem solvers use a least-commitment search strategy, reasoning about partially ordered plans. Although partial orders are useful for exploiting parallelism in execution, least-commitment is NP-hard for complex domain descriptions with conditional effects. Instead, a casual-commitment strategy is developed, as a natural framework to reason and learn about control decisions in planning. This paper describes how NOLIMIT reasons about totally ordered plans using a casual-commitment strategy, how it generates a partially ordered solution from a totally ordered one by analyzing the dependencies among the plan steps, and finally how resources are allocated by exploiting the parallelism embedded in the partial order. We illustrate our claims with the implemented algorithms and several examples. This work has been done in the context of the PRODIGY architecture that incorporates NOLIMIT, a nonlinear problem solver.
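The idea of deriving a partial order from a totally ordered plan by analyzing step dependencies can be sketched as follows. This is not NOLIMIT's actual procedure; it is a conservative illustration that keeps the original ordering only between steps that interfere, and the STRIPS-style step format (`pre`, `add`, `delete` sets) and the coffee-making steps are assumptions made up for the example.

```python
from itertools import combinations


def effects(step):
    """All literals a step changes, whether added or deleted."""
    return step["add"] | step["delete"]


def interferes(a, b):
    """Conservative check: True if swapping a and b could change the outcome."""
    return bool(
        effects(a) & (b["pre"] | effects(b))  # a touches what b needs or changes
        or effects(b) & a["pre"]              # b touches what a needs
    )


def deorder(plan):
    """Return pairs (i, j), meaning step i must stay before step j."""
    return {
        (i, j)
        for i, j in combinations(range(len(plan)), 2)
        if interferes(plan[i], plan[j])
    }


# Hypothetical totally ordered plan.
plan = [
    {"name": "boil_water", "pre": set(), "add": {"hot_water"}, "delete": set()},
    {"name": "grind_beans", "pre": set(), "add": {"ground"}, "delete": set()},
    {"name": "brew", "pre": {"hot_water", "ground"}, "add": {"coffee"}, "delete": set()},
]
constraints = deorder(plan)
print(sorted(constraints))
```

Steps left unordered by the result (here, boiling water and grinding beans) are candidates for parallel execution, which is the kind of parallelism the abstract says is then used for resource allocation.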
Failure Driven Dynamic Search Control for Partial Order Planners: An Explanation based approach
Artificial Intelligence, 1996
Cited by 30 (11 self)
Given the intractability of domain-independent planning, the ability to control the search of a planner is vitally important. One way of doing this involves learning from search failures. This paper describes SNLP+EBL, the first implementation of an explanation-based search control rule learning framework for a partial order (plan-space) planner. We start by describing the basic learning framework of SNLP+EBL. We then concentrate on SNLP+EBL's ability to learn from failures, and describe the results of empirical studies which demonstrate the effectiveness of the search-control rules SNLP+EBL learns using our method. We then ...
Planning by Rewriting
Journal of Artificial Intelligence Research, 2001
Cited by 28 (4 self)
Domain-independent planning is a hard combinatorial problem. Taking into account plan quality makes the task even more difficult. This article introduces Planning by Rewriting (PbR), a new paradigm for efficient high-quality domain-independent planning. PbR exploits declarative plan-rewriting rules and efficient local search techniques to transform an easy-to-generate, but possibly suboptimal, initial plan into a high-quality plan. In addition to addressing the issues of planning efficiency and plan quality, this framework offers a new anytime planning algorithm. We have implemented this planner and applied it to several existing domains. The experimental results show that the PbR approach provides significant savings in planning effort while generating high-quality plans.
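The general shape of improving an initial plan through rewriting rules and local search can be sketched as below. This is not the PbR implementation; the rule, the blocks-world-style action tuples, and the use of plan length as the quality measure are all assumptions chosen for illustration, and the rule is assumed to be quality-improving and validity-preserving in this hypothetical domain.

```python
def cancel_inverse_pairs(plan):
    """Hypothetical rewrite rule: yield plans with one adjacent
    (pickup X, putdown X) pair removed, assumed to be a no-op here."""
    for i in range(len(plan) - 1):
        a, b = plan[i], plan[i + 1]
        if a[0] == "pickup" and b[0] == "putdown" and a[1] == b[1]:
            yield plan[:i] + plan[i + 2:]


def rewrite_search(plan, rules, cost=len):
    """First-improvement local search: apply any rule that lowers the cost,
    and stop when no rule yields a cheaper plan."""
    current = list(plan)
    improved = True
    while improved:
        improved = False
        for rule in rules:
            for candidate in rule(current):
                if cost(candidate) < cost(current):
                    current = candidate
                    improved = True
                    break
            if improved:
                break
    return current


# Easy-to-generate but suboptimal initial plan (hypothetical).
initial = [
    ("pickup", "A"),
    ("putdown", "A"),
    ("pickup", "B"),
    ("stack", "B", "C"),
]
better = rewrite_search(initial, [cancel_inverse_pairs])
print(better)
```

Because `current` is a complete plan at every iteration, the loop can be interrupted at any point and still return a usable answer, which mirrors the anytime behavior the abstract mentions.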

