Jonathan Gratch and Gerald DeJong. COMPOSER: A probabilistic solution to the utility problem in speed-up learning. In Proceedings of the Tenth National Conference on Artificial Intelligence, pages 235--240, 1992.
....lengths to validate it, but it is worth notice in passing. Much previous work on the expensive rule problem has investigated how to improve the utility of EBL through restructuring and/or filtering learned rules based on experimentation with those rules (Minton, 1988; Greiner & Jurisica, 1992; Gratch & DeJong, 1992; Markovitch & Scott, 1993). Heuristic approaches to generating learned rules have also been proposed that provide improved efficiency over straightforward EBL, such as (Prieditis & Mostow, 1987; Minton, 1988; Shell & Carbonell, 1991; Shavlik, 1990; Etzioni, 1990). However, none of these ....
....transformations that allow increased utility; for example, PRODIGY/EBL's utility evaluation (Minton, 1988), where it measures the utility in terms of the savings and cost of a rule, and rules are deactivated if their utility is estimated as negative. PALO (Greiner & Jurisica, 1992) and Composer (Gratch & DeJong, 1992) navigate through the space of performance elements, and select rules only if they show incremental utility. The information filtering model (Markovitch & Scott, 1993) proposes a more general framework for selective learning, and defines various methods for eliminating harmful knowledge from the ....
Gratch, J. & DeJong, G. (1992). COMPOSER: A probabilistic solution to the utility problem in speed-up learning. In Proceedings of the Tenth National Conference on Artificial Intelligence, pages 235-240.
.... paper focuses on parametric ranking problems, a general class of statistical machine learning problems in which the goal is to rank a set of alternative hypotheses where the goodness of a hypothesis is a function of a set of parameters whose values are unknown (e.g., Chien, Stechert, & Mutz, 1998; Gratch, 1992; Greiner & Jurisica, 1992; Kaelbling, 1993; Moore & Lee, 1994; Musick et al., 1993). The learning system determines and refines estimates of these parameters by using training examples, with a secondary goal of minimizing learning cost. The principal contributions of this paper are: We define ....
....must select and rank the top M out of N hypotheses. .... problem where a single problem solving heuristic or strategy is chosen from a larger set of candidates. In this case, the expected utility is typically defined as the average time to solve a problem (Gratch, 1992; Greiner & Jurisica, 1992; Minton, 1988). The attribute selection problem in machine learning can also be viewed as a hypothesis selection problem in which one must select the best attribute split from a set of possible attribute splits and utility is often measured by information gain (Musick et ....
Gratch, J. (1992). COMPOSER: A Probabilistic Solution to the Utility Problem in Speed-up Learning. In Proceedings of the Tenth National Conference on Artificial Intelligence, pp. 235-240, San Jose, CA. AAAI.
.... component after some desired or peak performance level has been reached [16], learning only certain types of rules (non-recursive) that are expected to have low match cost [12], and employing statistical approaches to ensure that only rules that improve performance are added to the knowledge base [14,15]. Unfortunately, this approach alone is inadequate because it forces the system to learn only a small number of rules and reduces the gain of learning. However, it can be complemented by another approach to reducing match cost, enabling the system to learn more rules before reaching its maximum. ....
J. Gratch, G. DeJong. COMPOSER: A probabilistic solution to the utility problem in speed-up learning, AAAI-92, pp 235-240, 1992.
....problems (e.g., by pruning the search space) but if the slowdown in the matcher increases the time per step, then this can outweigh the reduction in the number of steps. This has been observed in several machine learning systems (Minton, 1988; Etzioni, 1990a; Tambe et al., 1990; Cohen, 1990; Gratch and DeJong, 1992). To avoid this slowdown, previous research on the utility problem from match cost has taken three general approaches. One approach is simply to reduce the number of rules in the system's knowledge base, by being selective about when to learn or which rules or types of rules to learn, or by ....
.... the number of rules in the system's knowledge base, by being selective about when to learn or which rules or types of rules to learn, or by forgetting previously learned rules if they slow down the matcher enough to cause an overall system slowdown (Minton, 1988; Etzioni, 1990b; Holder, 1992; Gratch and DeJong, 1992; Greiner and Jurisica, 1992; Markovitch and Scott, 1993). Unfortunately, this approach is inadequate for the long term goals of AI because, given the current state of match technology, it precludes learning a vast amount of knowledge. Moreover, it is intuitively desirable to have AI systems that ....
Gratch, J. and DeJong, G. (1992). COMPOSER: A probabilistic solution to the utility problem in speed-up learning. In Proceedings of the Tenth National Conference on Artificial Intelligence, pages 235--240.
....to be an anytime system, as described in section 3. Since we do not make any fundamental changes to the planner, we are able to take advantage of the body of work that has been done with classical planners, such as the use of abstraction [18], machine learning to improve planning performance [21, 11, 15], and derivational analogy [27, 16]. To give a feel for the type of behavior we have been able to get from our architecture, in section 4 we provide two traces of the system controlling a simulated household robot built in the Oz system [1]. In section 5 we present the results of some experiments ....
John Gratch and Gerry DeJong. Composer: A probabilistic solution to the utility problem in speed-up learning. In AAAI 92, 1992.
.... the utility problem in speedup learning using empirical evaluations of control rules met with limited success due to a limited understanding of the problem solver's behavior [EM92]. More recent work applies statistical measures to learn control rules for which there is a high certainty of utility [GD92, GJ92]. However, these approaches require a large number of training problems to estimate the distribution and ensure utile control rules. Preliminary results in [Hol92a, Hol92b] and more recent results reported here suggest that a simple intermediate approach may yield sufficient speedup with fewer ....
....knowledge based on the explanation's probability of correctness, expected cost to match the search control rule, and expected degradation in solution quality. Several examples are needed to support an explanation with high confidence and adopt the corresponding control rules. The Composer system [GD92] embodies a probabilistic solution to the utility problem. Composer defines the utility of a planner as the sum of the utility of each problem in the distribution weighted by its probability of occurrence. A candidate control rule is evaluated in the context of the existing planner. If there is ....
J. Gratch and G. DeJong. COMPOSER: A probabilistic solution to the utility problem in speed-up learning. In Proceedings of the Tenth National Conference on Artificial Intelligence, pages 235--240, 1992.
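The definition quoted in the context above (the utility of a planner as the sum of per-problem utilities weighted by probability of occurrence) can be illustrated with a small sketch. This is not code from the COMPOSER paper: the function names, the choice of negative solve time as the utility measure, and all numbers are assumptions made for the example.

```python
def expected_utility(distribution):
    """Sum of per-problem utilities weighted by probability of occurrence.

    `distribution` is a list of (probability, utility) pairs; the pair
    layout is a hypothetical representation chosen for this sketch.
    """
    return sum(p * u for p, u in distribution)


def incremental_utility(with_rule, without_rule):
    """Change in expected utility when a candidate rule is added."""
    return expected_utility(with_rule) - expected_utility(without_rule)


# Utility here is assumed to be negative solve time (higher is better).
baseline = [(0.5, -2.0), (0.3, -4.0), (0.2, -10.0)]
candidate = [(0.5, -1.0), (0.3, -5.0), (0.2, -4.0)]

gain = incremental_utility(candidate, baseline)
print(f"incremental utility: {gain:.2f}")
```

In practice the distribution is unknown, so the expectation would be estimated from sampled training problems rather than computed exactly as it is here.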
....to another. Most systems do this implicitly, and heuristically. By contrast, our palo system performs an explicit statistical test to determine, with prescribed confidence, when one element is superior to another. As such, it is very similar to the composer system of Gratch, DeJong and Chien [28, 29, 27]. composer differs from palo in two significant ways. First, composer will use all available samples when hill climbing. By contrast, palo will stop and return the currently best performance element Theta if none of Theta's neighbors appears significantly better than Theta, which means palo ....
J. Gratch and G. DeJong. COMPOSER: A probabilistic solution to the utility problem in speed-up learning. In Proceedings of AAAI-92, 1992.
....in highly recursive problem spaces. The heuristics are attractive because they constitute an a priori basis for generating effective control knowledge. A number of researchers have developed post hoc mechanisms for evaluating control knowledge by measuring its effectiveness on a sample of problems [19, 20, 32, 54]. However, as argued in Section 5, these post hoc mechanisms are heuristic as well. In addition, generating the large samples typically required by these mechanisms is very costly. Experimental Validation When applied to prodigy, the structural theory helps explain prodigy/ebl's success in ....
....of each block. Determining whether one block is above another, while recursive in the problem space, merely requires comparing X-coordinates in the domain theory. .... evaluating control knowledge by measuring its effectiveness on a sample of problems [19, 20, 32, 54]. The advantage of post hoc approaches is that they consider problem distribution, whereas a priori approaches do not. Sample based approaches have several disadvantages, though. First, the complex interactions between different control rules mean that individual control rules cannot be tested ....
Jonathan Gratch and Gerald DeJong. Composer: A probabilistic solution to the utility problem in speed-up learning. In Proceedings of AAAI-92, 1992.
....amount than is saved during planning through compiled macro operators. This is an example of the utility problem [Minton, 1990]. The extra rules may not be used sufficiently often to produce an overall performance improvement. Considerable research has been directed at examining this problem (e.g., [Gratch and DeJong, 1992; Wray et al., 1996; Minton, 1996]). In this research project we have not directly examined this trade-off. However, on certain problems, the Soar matcher has been shown to maintain a constant speed as up to 100,000 new rules are learned [Doorenbos, 1993]. This result suggests that knowledge ....
Jonathan Gratch and Gerald DeJong. Composer: A probabilistic solution to the utility problem in speed-up learning. In Proceedings of the Tenth National Conference on Artificial Intelligence, pages 235--240, 1992.
....laborious task. Research in learning and planning attempts to address this important problem by developing methods that automatically acquire search control knowledge from experience. However, most work in learning and planning has been in the context of linear, state based planners (Minton, 1989; Gratch & DeJong, 1992; Leckie & Zuckerman, 1993). More recently, the problem of learning search control for a nonlinear planner has been introduced (Katukam & Kambhampati, 1994; Borrajo & Veloso, 1994b). Nonlinear planners have been generally accepted as superior to linear .... This research was supported by the NASA ....
Gratch, J., & DeJong, G. (1992). COMPOSER: A probabilistic solution to the utility problem in speed-up learning. In Proceedings of the Tenth National Conference on Artificial Intelligence, pp. 235--240, San Jose, CA.
....focus on the expensive chunk problem. Previous work on the expensive chunk problem has investigated how to produce cheaper rules (Prieditis & Mostow 1987; Minton 1988; Shell & Carbonell 1991; Shavlik 1990; Etzioni 1990) and how to filter out expensive rules (Minton 1988; Greiner & Jurisica 1992; Gratch & DeJong 1992; Markovitch & Scott 1993). However, none of these approaches can generally guarantee that the cost of using the learned rules will always be bounded by the cost of the problem solving episode from which they are learned. That is, the cost of a learned rule can be greater than the cost of solving ....
Gratch, J., and DeJong, G. 1992. COMPOSER: A probabilistic solution to the utility problem in speed-up learning.
....well to new planning situations. Scope extends previous planning and learning research by acquiring control knowledge for the newer, more efficient partial order planners. Most work in learning control rules for planning has been in the context of linear, state based planners (Minton, 1989; Gratch & DeJong, 1992; Leckie & Zuckerman, 1993). These planners are usually classified as total order planners, since plan steps are maintained in a strictly ordered list. Recent experimental results, however, support that partial order planners are more efficient than total order planners in most domains ....
....in the final planner. Thus, we would like to incorporate a method into Scope that directly evaluates control rule utility. Researchers have introduced a variety of techniques for determining the best rules to save (Greiner & Likuski, 1989; Markovitch & Scott, 1989; Subramanian & Feldman, 1990; Gratch & DeJong, 1992). As yet, no one has applied such techniques to evaluate rules for a partial-order planner; however, we feel that such a method could be easily integrated into our learning system. Another possible improvement is to replace or modify the standard Foil information-gain heuristic currently used by ....
Gratch, J., & DeJong, G. (1992). COMPOSER: A probabilistic solution to the utility problem in speed-up learning. In Proceedings of the Tenth National Conference on Artificial Intelligence, pp. 235--240, San Jose, CA.
....rules whose main associated cost is the time it takes to match their preconditions. Most of the existing solutions for this problem involve filtering out control rules that are estimated to be of low utility (Minton 1988; Markovitch & Scott 1993; Gratch & DeJong 1992). Others try to restrict the complexity of the preconditions (Tambe, Newell, & Rosenbloom 1990). In this work we deal with a different setup where the control procedure has potentially very high complexity regardless of the specific control knowledge acquired. In this setup, the utility problem ....
Gratch, J., and DeJong, G. 1992. COMPOSER: A probabilistic solution to the utility problem in speedup learning. In Proceedings of the Tenth National Conference on Artificial Intelligence, 235--240. San Jose, California: American Association for Artificial Intelligence.
....utility. This position paper, however, advocates a more cautious approach: Only climb to a new element (e.g., modify the derivation strategy, or add a new macro) if we are confident that the resulting element is better than the current one. Such cautious adaptive systems (e.g., composer [GD92] and palo [Gre92, GJ92]) first observe a statistically significant set of queries, implicitly computing the empirical expected costs of an element with, versus without, a proposed modification. They then climb to the modified element if it is, with high probability, superior to the original one. ....
Jonathan Gratch and Gerald DeJong. COMPOSER: A probabilistic solution to the utility problem in speed-up learning. In Proceedings of AAAI-92, 1992.
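The cautious climbing rule described in the context above (observe a sample of queries, compare empirical costs with and without a modification, and climb only if the modified element is superior with high probability) might be sketched as below. The one-sided normal-approximation bound is an assumption chosen for brevity; it is not the statistical test used by composer or palo, and `should_climb` and its parameters are hypothetical names.

```python
import math
import statistics


def should_climb(cost_without, cost_with, z=1.645):
    """Accept a modification only if the lower confidence bound on the
    mean per-query saving is positive.

    Assumptions of this sketch: costs are paired per query, and a
    normal approximation with critical value `z` (1.645 is roughly a
    one-sided 95% level) stands in for whatever test a real system uses.
    """
    savings = [a - b for a, b in zip(cost_without, cost_with)]
    n = len(savings)
    if n < 2:
        return False  # too few observations to estimate variance
    mean = statistics.mean(savings)
    std_err = statistics.stdev(savings) / math.sqrt(n)
    return mean - z * std_err > 0.0


# Consistent savings across queries: the modification is accepted.
print(should_climb([10, 12, 11, 13, 9, 10], [7, 8, 9, 8, 6, 7]))
# Mixed results: the learner stays with the current element.
print(should_climb([10, 5, 12, 4], [5, 9, 6, 9]))
```

A rule of this shape trades sample cost for safety: a stricter `z` needs more queries before any climb is accepted, which matches the sample-hunger concern raised in several of the contexts on this page.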
....shown that it is possible to learn over one million rules while still allowing their efficient use [14, 15]. This research focuses on the expensive chunk problem in EBL. There have been approaches which are useful for producing cheaper rules [16, 3, 12, 17, 11, 10] or filtering out expensive rules [3, 18, 19, 20]. However, these approaches cannot generally guarantee that the cost of using the learned rules will always be bounded by the cost of the planning episode from which they are learned. That is, the cost of a learned rule can be greater than the cost of planning with the original set of rules. One ....
J. Gratch and G. DeJong. Composer: A probabilistic solution to the utility problem in speed-up learning. In Proceedings of the Tenth National Conference on Artificial Intelligence, pages 235--240, 1992.
....overall utility of the multiagent system, or no improvement greater than some threshold of significance. The constraint added is the one whose addition causes the greatest increase in average solution utility over all problems in the training set. This approach recalls that of Gratch and DeJong [GD92], whose work uses a very similar method (with a statistical measure of quality for each item to be learned) to learn search control rules for planning. The performance measure, or utility, must be provided; it might account for such things as planning time, the cost in time and resources needed ....
Jonathan Gratch and Gerald DeJong. COMPOSER: a probabilistic solution to the utility problem in speed-up learning. In Proceedings of AAAI-92, pages 235--240. AAAI, 1992.
.... learning using empirical evaluations of control rules met with limited success due to a limited understanding of the problem solver's behavior [Etzioni and Minton, 1992]. More recent work applies statistical measures to learn control rules for which there is a high certainty of utility [Gratch and DeJong, 1992; Greiner and Jurisica, 1992]. However, these approaches require a large number of training problems to estimate the distribution and ensure utile control rules. Preliminary results in [Holder, 1992a; Holder, 1992b] and more recent results reported here suggest that a simple intermediate ....
....knowledge based on the explanation's probability of correctness, expected cost to match search control rule, and expected degradation in solution quality. Several examples are needed to support an explanation with high confidence and adopt the corresponding control rules. The Composer system [Gratch and DeJong, 1992] embodies a probabilistic solution to the utility problem. Composer defines the utility of a planner as the sum of the utility of each problem in the distribution weighted by its probability of occurrence. A candidate control rule is evaluated in the context of the existing planner. If there ....
Gratch, J. and DeJong, G. (1992). COMPOSER: A probabilistic solution to the utility problem in speedup learning. In Proceedings of the Tenth National Conference on Artificial Intelligence, 235--240.
....hundred runs, with various settings and graphs, we have found that palo 0's error rate is considerably under this rate. We are now experimenting with variants of palo 0 that are less conservative in their estimates, in the hope that they will be correspondingly less sample hungry. See also [GD92]. Finally, while this paper has focused on but a single set of proposed transformations T RO , there are many other transformation sets T that can also be used to find an efficient satisficing system; e.g., [Gre92a] discusses a set of transformations that correspond to operator ....
....as it can prevent a learning algorithm from modifying, and therefore possibly degrading, an initial PE that happens to already be optimal. The correct action for the learner to take for such initial PEs is simply to leave them unmodified, i.e., not to learn. The work reported in [GD91, GD92] is perhaps the most similar to ours, in that their system also uses a statistical technique to guarantee that the learned control strategy will be an improvement, based on a utility analysis. We extend those results by formally proving specific bounds on the sample complexity, and by providing a ....
J. Gratch and G. DeJong. COMPOSER: A probabilistic solution to the utility problem in speedup learning. In Proceedings of AAAI-92, 1992.
....affected by the usefulness of the completable plan containing it. The method we chose by which to address the utility problem, i.e., learning useful contingent and completable plans, is based on gathering statistics on the plans' usefulness based on a given set of utility criteria, as in [Gratch92, Minton88], after which non-utile plans can be discarded. 4 EXPERIMENTS To test our learning approach to completable planning, we ran the system on several experiments in a simulated robot navigation domain. All the experiments involved a simulation of a robot in a rectangular room. The directed point ....
J. Gratch and G. DeJong, "COMPOSER: A Probabilistic Solution to the Utility Problem in Speed-up Learning," Tenth National Conference on Artificial Intelligence, San Jose, CA, 1992.
....associated with using macros is the increased branching factor of the search space. When the costs outweigh the benefits, we face a phenomenon called the utility problem (Minton, 1988; Gratch & DeJong, 1992; Markovitch & Scott, 1993; Mooney, 1989). Due to the very large number of macros available for acquisition, a learning program must be selective in order to obtain a macro set with high utility. The goal of this research is to demonstrate that a simple macro learning technique, combined with the ....
....factor. The peg solitaire domain cannot be handled efficiently by Micro-Hillary since it does not use a small set of fixed operators. MacLearn solves it by using parameterized operators. Unlike some speedup learners that provide us with either statistical or theoretical guarantees (Cohen, 1992; Gratch & DeJong, 1992; Greiner & Likuski, 1989; Subramanian & Hunter, 1992; Tadepalli & Natarajan, 1996), Micro-Hillary has a heuristic nature and does not provide us with any guarantee. Indeed, while it performs very well in some domains, it fails in other domains such as the N Hanoi. To handle such domains, we would ....
Gratch, J., & DeJong, G. (1992). COMPOSER: A probabilistic solution to the utility problem in speed-up learning. In Proceedings of the Tenth National Conference on Artificial Intelligence, pp. 235--240. San Jose, California. Morgan Kaufmann.
....of performance due to increasing amounts of learned knowledge. Our approach to solving the utility problem in speedup learning requires few training examples (low learning time) and does not use utility measures. This is in contrast to other approaches which apply empirical [2] and statistical [4, 5] measures to learn control rules for which there is high certainty of utility. However, these approaches require a large number of training examples to estimate the problem distribution (implying higher learning time) and ensure utile control rules. This work also tries to identify characteristics ....
....utility problem in speedup learning rely on training examples to empirically evaluate the utility of learned knowledge. Minton's Prodigy system [12] utilizes a utility function that evaluates control knowledge based on application cost, frequency of use and average savings. PALO [5] and Composer [4] use statistical measures to evaluate control knowledge. Several examples are needed to support an explanation with high confidence and adopt the corresponding control rules. 3.1 Prodigy The Prodigy system [12] evaluates the utility of problem solving control knowledge by estimating the ....
J. Gratch and G. DeJong. COMPOSER: A probabilistic solution to the utility problem in speed-up learning. In Proceedings of the Tenth National Conference on Artificial Intelligence, pages 235--240, 1992.
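The Prodigy-style evaluation mentioned in the contexts above combines application cost, frequency of use, and average savings, and deactivates rules whose estimated utility is negative. A minimal sketch, assuming the commonly stated form "average savings times application frequency minus average match cost"; the rule names, field layout, and numbers are hypothetical and are not taken from the cited papers.

```python
def rule_utility(avg_savings, application_freq, avg_match_cost):
    """Estimated utility of one control rule.

    Assumed form: savings accrue only when the rule applies, while the
    match cost is paid every time the rule is tested.
    """
    return avg_savings * application_freq - avg_match_cost


def active_rules(rules):
    """Keep rules whose estimated utility is not negative."""
    return [name for name, stats in rules.items() if rule_utility(*stats) >= 0]


# (avg_savings, application_freq, avg_match_cost) per hypothetical rule
rules = {
    "prefer-short-path": (10.0, 0.2, 1.0),  # saves a lot, cheap to match
    "reject-rare-state": (2.0, 0.1, 0.5),   # rarely applies, costly to match
}
print(active_rules(rules))
```

The statistical approaches cited on this page differ from this sketch mainly in how the three quantities are estimated and in requiring confidence in the sign of the result before a rule is adopted.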
....as it can prevent a learning algorithm from modifying, and therefore possibly degrading, an initial element that happens to already be optimal. The correct action for the learner to take for such initial PEs is simply to leave them unmodified, i.e., not to learn. The work reported in [GD91, GD92] is perhaps the most similar to ours, in that their system also uses a statistical technique to guarantee that the learned control strategy will be an improvement, based on a utility analysis. Our work differs, as we formally prove specific bounds on the sample complexity, and provide a learning ....
J. Gratch and G. DeJong. COMPOSER: A probabilistic solution to the utility problem in speed-up learning. In Proceedings of AAAI-92, 1992.
....is possible to learn over one million rules while still allowing their efficient use [10, 11]. In this article we focus on the expensive chunk problem. Previous work on the expensive chunk problem has investigated how to produce cheaper rules [12, 5, 8, 13, 7] and how to filter out expensive rules [5, 14, 15, 16]. However, none of these approaches can generally guarantee that the cost of using the learned rules will always be bounded by the cost of the problem solving episode from which they are learned. That is, the cost of a learned rule can be greater than the cost of solving the problem with the ....
J. Gratch and G. DeJong. COMPOSER: A probabilistic solution to the utility problem in speed-up learning. In Proceedings of the Tenth National Conference on Artificial Intelligence, pages 235--240, 1992.
....a new performance element which is strictly better than the current performance element to reach the one that is essentially locally optimal. The utility analysis in PALO is expected performance based on the test cases from a fixed distribution. This is similar to the approach taken in Composer [11], which adds a control rule to the system only if it shows incremental utility. The incremental utility is evaluated by expected problem solving cost in a sequence of problems. The information filtering model [19] proposes a more general framework of selective learning, and defines various methods ....
J. Gratch and G. DeJong. Composer: A probabilistic solution to the utility problem in speedup learning. In Proceedings of the Tenth National Conference on Artificial Intelligence, pages 235--240, 1992.
....portion of a complete plan that will allow the agent to take some actions. This has been done such that the planner is changed only minimally, allowing us to make use of the large body of work on classical planning systems, such as abstraction [12], machine learning to improve planning performance [14, 7, 10], and derivational analogy [16, 11]. More details about the anytime planner can be found in [3]. 2.2. Integration Hap is designed to react quickly and intelligently in a dynamic environment by using stored behaviors when possible. Prodigy is designed to plan for sets of goals that may interact, and ....
John Gratch and Gerry DeJong. Composer: A probabilistic solution to the utility problem in speed-up learning. In AAAI 92, 1992.
....of run time after learning being greater than run time before learning. This utility problem has been a particular focus of research in explanation based learning (EBL). There have been approaches which are useful for producing cheaper rules [1, 2, 3, 4, 5, 6] or filtering out expensive rules [2, 7, 8, 9]. However, these approaches cannot generally guarantee that the cost of using the learned rules will always be bounded by the cost of the problem solving from which they are learned, given the same situation. One way of finding a solution which can guarantee such cost boundness is to analyze all ....
J. Gratch and G. DeJong. Composer: A probabilistic solution to the utility problem in speed-up learning. In Proceedings of the Tenth National Conference on Artificial Intelligence, pages 235--240, 1992.
....compiled rules can have a negative as well as a positive effect on efficiency is generally referred to as the utility problem. A number of techniques have been subsequently developed to help ensure the utility of EBL, including simplifying and selectively retaining and utilizing learned rules [22, 26, 16]. As discussed in section 4, combining EBL with ILP is another important approach to addressing the utility problem. Another well recognized problem with traditional EBL is the requirement that the domain theory be correct and complete [25]. A number of researchers have combined EBL with various ....
J. Gratch and G. DeJong. COMPOSER: A probabilistic solution to the utility problem in speed-up learning. In Proceedings of the Tenth National Conference on Artificial Intelligence, pages 235--240, San Jose, CA, July 1992.
....very few examples. Thus, it might be useful to incorporate a method into Scope that directly evaluates control rule utility. Researchers have introduced a variety of techniques for determining the best rules to save (Greiner & Likuski, 1989; Markovitch & Scott, 1989; Subramanian & Feldman, 1990; Gratch & DeJong, 1992). As yet, no one has applied such techniques to evaluate rules for a partial order planner; however, such a method should be easy to integrate into Scope's learning system. 10.1.4 Induction Bias Another possible improvement is to replace or modify the standard Foil information gain heuristic ....
Gratch, J., & DeJong, G. (1992). COMPOSER: A probabilistic solution to the utility problem in speed-up learning. In Proceedings of the Tenth National Conference on Artificial Intelligence, pp. 235--240, San Jose, CA.
....is encountered, static will be significantly slower than prodigy/ebl. Second, static will fail when the analysis required to curtail problem solving falls outside its scope. In addition, when proving both failure and success is recursive as in the Tower of Hanoi, ABworld, or recursive Bin world [19] problem spaces, static will form trivial PSGs and refuse to generate much if any control knowledge. To its credit, static will do so quickly, whereas EBL systems will often generate ineffective control knowledge in such spaces, and take a long time to do so [13, 19, 23]. As EBL lore would have ....
....ABworld, or recursive Bin world [19] problem spaces, static will form trivial PSGs and refuse to generate much if any control knowledge. To its credit, static will do so quickly, whereas EBL systems will often generate ineffective control knowledge in such spaces, and take a long time to do so [13, 19, 23]. As EBL lore would have it, PE is bound to be intractable when applied to sufficiently large and complex theories. The analysis in this section shows that when PE is appropriately constrained, as in static, PE will only be intractable, relative to EBL, when potential nonrecursive subgoaling in ....
J. Gratch and G. DeJong. COMPOSER: A probabilistic solution to the utility problem in speed-up learning. In Proceedings of AAAI-92. AAAI Press, 1992.
....on the problem solver. Each transformation is only made if it significantly improves the problem solver's performance on a randomly chosen set of training problems. The program is guaranteed to converge to an approximate locally optimal problem solver with a high probability. The work of Gratch and DeJong (1992) in the COMPOSER system follows a similar strategy of applying a series of transformations which are proved useful on a training sample until the performance no longer improves. One difference between our approach and all these methods is that our cost model is much more coarse than the others. In ....
Gratch, J., & DeJong, G. (1992). Composer: A probabilistic solution to the utility problem in speed-up learning. In Proceedings of National Conference on Artificial Intelligence, pp. 235--240. San Jose, CA. AAAI Press.
....is a difficult, laborious task. Research in learning and planning attempts to address this important problem by developing methods that automatically acquire search control knowledge from experience. However, most work in learning and planning has been in the context of linear, state based planners [14, 8, 12]. More recently, the problem of learning search control for a nonlinear planner has been presented [3, 10, 18]. Nonlinear planners have been accepted as superior to linear planners for many years, and recent experimental results support that partial order planners are more efficient than ....
J. Gratch and G. DeJong. COMPOSER: A probabilistic solution to the utility problem in speed-up learning. In AAAI-92, pages 235--240, San Jose, CA, July 1992.
....current work shows that EBL provides a way of strategically applying domain axiom based consistency checks. Finally, although we did not explicitly address monitoring the utility of learned rules and filtering bad rules, we believe that utility monitoring models developed for state-based planners [4, 10] will also apply for PO planners. 5 Conclusions and Future Work In this paper, we presented snlp ebl, the first systematic implementation of explanation based search control rule learning to a PO planner. We have described the details of between snlp, snlp ebl and snlp domax remained same even ....
J. Gratch and G. DeJong. COMPOSER: A Probabilistic Solution to the Utility Problem in Speed-up Learning. In Proc. AAAI 92, pp. 235--240, 1992.
....domains. Similar observations have been made in the context of constraint satisfaction problems [Baker94, Frost94]. This inherent difficulty in recognizing the worth of a heuristic has been termed the utility problem [Minton88] and has been studied extensively in the machine learning community [Gratch92, Greiner92, Holder92, Subramanian92]. 2.1 Formulation of Adaptive problem solving Before discussing approaches to adaptive problem solving we formally state the common definition of the task (see [Gratch92, Greiner92, Laird92, Subramanian92]). Adaptive problem solving requires a flexible problem solver, meaning the problem solver ....
....problem [Minton88] and has been studied extensively in the machine learning community [Gratch92, Greiner92, Holder92, Subramanian92]. 2.1 Formulation of Adaptive problem solving Before discussing approaches to adaptive problem solving we formally state the common definition of the task (see [Gratch92, Greiner92, Laird92, Subramanian92]). Adaptive problem solving requires a flexible problem solver, meaning the problem solver possesses control decisions that may be resolved in alternative ways. Given a flexible problem solver PS with control points CP_1 ... CP_n, where each control point CP_i corresponds to a particular control ....
J. Gratch and G. DeJong, "COMPOSER: A Probabilistic Solution to the Utility Problem in Speed--up Learning," Proceedings of the National Conference on Artificial Intelligence, San Jose, CA, July 1992, pp. 235--240.
.... Doyle has persuasively argued for the merits of decision theory as a standard for evaluating artificial intelligence systems [Doyle90] and it has seen increasing acceptance, both in artificial intelligence at large [Horvitz89, Russell89, Schwuttke92, Wellman92] and machine learning in particular [Gratch92a, Greiner92a, Laird92, Subramanian92]. We will use decision theory as a common framework for characterizing the utility problem. 2.1 EXPECTED UTILITY Different learning decisions result in different transformed problem solvers. Decision theory relies on the observation that preferences over these different outcomes can, under ....
J. Gratch and G. DeJong, "COMPOSER: A Probabilistic Solution to the Utility Problem in Speed--up Learning," Proceedings of the National Conference on Artificial Intelligence, San Jose, CA, July 1992, pp. 235--240.
....same data. Correlated selection problems generally arise in machine learning whenever elements of the hypothesis space share some common structure. For example, in speed up learning a learning algorithm must repeatedly select one of a set of small variations to an existing search control strategy [Gratch92, Greiner92]. In inductive learning there are two issues which are naturally cast as correlated selection problems: the attribute selection problem consists of selecting one of a set of attributes to add to an existing concept description [Fayyad91, Musick93]; the feature selection problem consists of ....
....of the expected difference in value between hypotheses. Therefore we need some method of estimating the sign with error no more than PCE. Efficient methods for this problem include the repeated significance test (RST) [Lerche86] and the Nádas approach used in our earlier solution to this problem [Gratch92]. An undesirable property of these methods, however, is that their sample complexity tends to infinity as the expected difference approaches zero. Instead, we introduce an indifference parameter, e, that captures the intuition that if the difference is sufficiently small, we do not care if the ....
J. Gratch and G. DeJong, "COMPOSER: A Probabilistic Solution to the Utility Problem in Speed--up Learning," Proceedings of the National Conference on Artificial Intelligence, San Jose, CA, July 1992, pp. 235--240.
....193] For any finite sample the mean will only approximate the normal distribution but given a sufficiently large sample this approximation can be quite close. This provides somewhat weaker guarantees than Chernoff bounds and is thus a simplification, but typically requires far fewer examples (see [Gratch92] for a comparison). 3.3 OBSERVATION COMPLEXITY Learning techniques must provide the information necessary to generate transformations and to estimate their incremental utility. The previous section illustrated that distribution information can be approximated by combining information from ....
J. Gratch and G. DeJong, "COMPOSER: A Probabilistic Solution to the Utility Problem in Speed--up Learning," AAAI92, San Jose, CA, July 1992.
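The trade-off mentioned in the context above (a normal approximation gives weaker guarantees than Chernoff bounds but needs far fewer examples) can be illustrated with a rough sample-size calculation. This is only a sketch under stated assumptions, not the comparison from [Gratch92]: Hoeffding's inequality stands in for the Chernoff-style bound, the standard deviation is assumed known rather than estimated from data, and the function names and parameters are hypothetical.

```python
import math
from statistics import NormalDist


def hoeffding_samples(value_range, epsilon, delta):
    """Samples needed so the sample mean is within epsilon of the true mean
    with probability at least 1 - delta, for values bounded in an interval
    of width value_range (Hoeffding's inequality, used here as a stand-in
    for a Chernoff-style bound)."""
    return math.ceil(value_range ** 2 * math.log(2.0 / delta) / (2.0 * epsilon ** 2))


def normal_approx_samples(sigma, epsilon, delta):
    """Samples needed under the normal (central limit) approximation,
    assuming the standard deviation sigma is known. This is an
    approximation, so the guarantee is weaker than the bound above."""
    z = NormalDist().inv_cdf(1.0 - delta / 2.0)  # two-sided critical value
    return math.ceil((z * sigma / epsilon) ** 2)


if __name__ == "__main__":
    # Hypothetical setting: utilities in [0, 1], standard deviation 0.1,
    # tolerance 0.05, error probability 0.05.
    print(hoeffding_samples(1.0, 0.05, 0.05))
    print(normal_approx_samples(0.1, 0.05, 0.05))
```

In this hypothetical setting the distribution-free bound asks for several hundred examples while the normal approximation asks for fewer than twenty, because the latter exploits the small variance instead of only the range.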
....to our general framework for learning to plan described in [Gratch92a]. The framework provides a unifying perspective where seemingly different approaches are related through their use of common simplifying assumptions. We review the framework (Section 3) and then turn to describing the COMPOSER [Gratch92b] technique from this new perspective (Section 4). The discussion illustrates the framework's use as an analysis tool. It also provides a detailed illustration of the tradeoffs involved in some common learning simplifications. The next section motivates the view of learning to plan as a search ....
.... SOAR [Laird86], STATIC [Etzioni90b], PRODIGY/EBL, RECEBG [Letovsky90], IMEX [Braverman88], and PEBL [Eskey90]. Ma and Wilkins illustrate a similar situation for knowledge base revision systems [Wilkins89]. Systems which do not adopt this simplification include PALO [Greiner92], COMPOSER [Gratch92b], and [Leckie91]. 3.1.2 Generation Pruning Most learning systems employ powerful pruning techniques to reduce the space of alternatives. One way to reduce complexity is by restricting which transformations are actively considered. Most transformation vocabularies define a vast space of possible ....
J. Gratch and G. DeJong, "COMPOSER: A Probabilistic Solution to the Utility Problem in Speed--up Learning," Proceedings of the National Conference on Artificial Intelligence, San Jose, CA, July 1992.
....are based on a real world situation, and the scheduling approach was developed independently of our learning work [Bell92]. The implementation includes a novel approach for improving learning efficiency. The performance of the system, along with previous results in artificial planning domains [Gratch91, Gratch92], demonstrates COMPOSER's flexibility and its potential to identify beneficial knowledge in practical learning problems. 2 COMPOSER COMPOSER is a statistical approach to improving the expected utility of problem solving. The overall approach is one of generate and test hill climbing. Given an ....
....statistically over the expected distribution of problems. A transformation is adopted if it increases the expected utility of solving problems over that distribution. The generator then constructs a set of transformations to this new strategy and so on. For a complete description of the method see [Gratch92]. The algorithm is summarized in the Appendix. COMPOSER's solution is applicable in cases where the following conditions apply. 1. The control strategy space can be structured to facilitate hill climbing search. In general, the space of such strategies is so large as to make exhaustive search ....
J. Gratch and G. DeJong, "COMPOSER: A Probabilistic Solution to the Utility Problem in Speed--up Learning," Proceedings of the National Conference on Artificial Intelligence, San Jose, CA, July 1992, pp. 235--240.
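The generate and test hill climbing described in the context above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the COMPOSER algorithm itself: it uses a fixed-size sample and a simple z-score threshold where the cited work uses a sequential statistical test, and the names hill_climb, neighbors, utility, and draw_problem are hypothetical.

```python
import math
import random
from statistics import mean, stdev


def hill_climb(strategy, neighbors, utility, draw_problem,
               n_samples=200, z_threshold=2.0, rng=None):
    """Adopt a candidate transformation only when its estimated incremental
    utility over sampled problems is positive and statistically significant.
    Fixed-sample z-test: an assumption made for this sketch."""
    rng = rng or random.Random(0)
    while True:
        # Draw training problems from the (assumed) problem distribution.
        problems = [draw_problem(rng) for _ in range(n_samples)]
        best, best_gain = None, 0.0
        for candidate in neighbors(strategy):
            # Paired differences estimate the candidate's incremental utility.
            diffs = [utility(candidate, p) - utility(strategy, p) for p in problems]
            gain = mean(diffs)
            stderr = stdev(diffs) / math.sqrt(len(diffs))
            significant = gain > 0 and (stderr == 0 or gain / stderr > z_threshold)
            if significant and gain > best_gain:
                best, best_gain = candidate, gain
        if best is None:
            return strategy  # no transformation shows a significant improvement
        strategy = best  # adopt it and generate transformations of the new strategy


if __name__ == "__main__":
    # Toy problem: a strategy is an integer setting, a problem is a difficulty
    # multiplier, and utility is the negated solution cost.
    result = hill_climb(
        strategy=0,
        neighbors=lambda s: [s - 1, s + 1],
        utility=lambda s, p: -((s - 5) ** 2) * p,
        draw_problem=lambda rng: rng.uniform(0.5, 1.5),
    )
    print(result)
```

In the toy problem every step toward the setting 5 lowers the cost on every sampled problem, so the search climbs there and then stops because neither neighbor improves expected utility.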
.... discuss this in the context of parametric hypothesis selection problems, an abstract class of statistical learning problems where a system must select one of a finite set of hypothesized courses of action, where the quality of each hypothesis is described as a function of some unknown parameters (e.g. [Gratch92, Greiner92, Kaelbling93, Moore94, Musick93]). A learning system determines and refines estimates of these parameters by paying for training examples. We show how such problems can be cast as resource optimization problems, and that solutions found in this way can be significantly more efficient than solutions that do not account for the ....
....selections are at the core of many machine learning approaches. For example, the utility problem in speed up learning is a selection problem in which a problem solving heuristic is chosen from a set of proposed candidates, where expected utility is defined as the average time to solve a problem [Gratch92, Greiner92, Minton88]. The attribute selection problem in classification learning is a problem of selecting one of a set of attributes on which to split, where utility is equated with information gain [Musick93]. In reinforcement learning a system must select an action, where utility is equated with expected reward ....
J. Gratch and G. DeJong, "COMPOSER: A Probabilistic Solution to the Utility Problem in Speed--up Learning," AAAI92, San Jose, CA, July 1992, pp. 235--240.
Jonathan Gratch and Gerald DeJong. COMPOSER: A probabilistic solution to the utility problem in speed-up learning. In Proceedings of the Tenth National Conference on Artificial Intelligence, pages 235--240, 1992.
J. Gratch and G. DeJong. Composer: A probabilistic solution to the utility problem in speed-up learning. In Proc. AAAI-92, pages 235--240, 1992.
J. Gratch and G. DeJong. Composer: A probabilistic solution to the utility problem in speed-up learning. In Proceedings of the 10th National Conference on Artificial Intelligence, pages 235--240, 1992.
J. Gratch and G. DeJong. COMPOSER: A Probabilistic Solution to the Utility Problem in Speed-up Learning. In Proc. AAAI 92, pp. 235--240, 1992.
J. Gratch and G. DeJong. "Composer: A probabilistic solution to the utility problem in speed-up learning," In Proceedings of National Conf. Artificial Intelligence (AAAI), 1992.
J. Gratch and G. DeJong. COMPOSER: A probabilistic solution to the utility problem in speed-up learning. In Proceedings of the Tenth National Conference on Artificial Intelligence, pages 235--240, San Jose, California, 1992. American Association for Artificial Intelligence.