Whitehead, Steven D. and Ballard, Dana H. 1991. Learning to perceive and act by trial and error. Machine Learning 7.
....probabilities of classes of interpretation. This is possible by analysis of the problem where there are obvious scene based constraints such as traffic flow direction in certain lanes of a roundabout. It is also possible to learn these constraints and dependencies using appropriate techniques [47, 55]. This can be a time consuming process but is typically computed off line with only limited adaptive refinement on line. The requirement to turn conceptual scene based or task based knowledge into a readily accessible form for real time processing has been recognised in the past. Many hybrid ....
S.D. Whitehead and D.H. Ballard. "Learning to perceive and act by trial and error". Machine Learning, 7:45-83, 1991.
....investigated propositionalization methods in relational domains. They experimentally studied the intermediate language of deictic representations (DRs). DRs avoid enumerating the domain by using variables such as the block on the floor. Although DRs have led to impressive results [McCallum, 1995; Whitehead and Ballard, 1991], [Finney et al. 2002]'s results show that DRs may also degrade learning performance within relational domains. According to [Finney et al. 2002], relational reinforcement learning (RRL) [Dzeroski et al. 2001] is one way to achieve effective learning in domains with objects. RRL is a combination of ....
S. D. Whitehead and D. H. Ballard. Learning to perceive and act by trial and error. Machine Learning, 7(1):45-83, 1991.
....known causal dependencies with estimated statistical knowledge. They are essentially providing closed loop control using both top-down and bottom-up messages in the propagation of belief values. They also provide the possibility of learning and refining visual representations by observation [15] [16]. Bayes nets have been used in many demanding applications such as BATmobile [17] and TEA system [13]. HMMs are also widely used in visual processing, as seen in the review of recent work on behaviour analysis below. The advantage here is that the hidden purposes of regular behaviour patterns ....
S.D. Whitehead and D.H. Ballard, "Learning to perceive and act by trial and error," Machine Learning, vol. 7, pp. 45-83, 1991.
....an exploration. 4 The Experimental Domain: Blocks World. Our learning agent exists in a simulated blocks world and must learn to use its hand to remove any red or blue blocks on a green block so that block may be lifted. The choice of this problem domain was not arbitrary. Whitehead and Ballard [23] introduced it in their pioneering work on the use of deixis in relational domains. They developed the Lion algorithm to deal with the perceptual aliasing (partial observability) that resulted from using a deictic representation by avoiding the aliased states. McCallum [17] used the same domain to ....
....the problem can be reduced by keeping a short history of observations. Finally, Martin [15] used the domain to motivate and evaluate an algorithm for learning policies that generalize over initial configurations. 4.1 Whitehead and Ballard's blocks world. Whitehead and Ballard's blocks world [23] differs from ours in several ways. First, Whitehead's representation has two markers, one for perceptual information and one for taking action. In our representation, there is only one marker which the agent can move arbitrarily, and this marker is used both for gaining perceptual information and ....
Steven D. Whitehead and Dana H. Ballard. Learning to perceive and act by trial and error. Machine Learning, 7(1):45--83, 1991.
....problem of incomplete and noisy perception, since the algorithm presented in this section is kind of orthogonal to research on these issues. See [Bachrach and Mozer, 1991], [Chrisman, 1992], [Lin and Mitchell, 1992], [Mozer and Bachrach, 1989], [Rivest and Schapire, 1987], [Tan, 1991], [Whitehead and Ballard, 1991] for approaches to learning with incomplete perception. See for example [Barto et al. 1989], [Jordan, 1989], [Munro, 1987], [Thrun, 1992] for more approaches to learning action models with neural networks. agent begins at the initial state s and performs the action sequence a, a2, a. ....
Steven D. Whitehead and Dana H. Ballard. Learning to perceive and act by trial and error. Machine Learning, 7:45-83, 1991.
....But after an action is performed, the environment might be in a different state, which keeps the agent from trying other actions in the same state. For simplification, we assume that both states and rewards are completely observable, avoiding problems of incomplete and noisy perception (see [3, 9, 28, 42, 46, 52, 60] for approaches to incomplete perception). In order to apply neural networks using the backpropagation training procedure [43], we further assume that the environment behaves sufficiently deterministically, i.e. that a deterministic function is capable of modeling the environment. This is an important ....
....is selected with a distinct, large probability P_best, and with the remaining probability 1 - P_best an arbitrary action is selected with uniform probability distribution. Consequently, semi-uniform distributed exploration performs exploitation with probability P_best, and random exploration otherwise [9, 29, 30, 60]. A second way for modifying probability distributions is used in Boltzmann distributed exploration. Many learning algorithms used in adaptive control, e.g. indirect control with forward models, or reinforcement learning based on policy iteration methods [8], allow for estimating the exploitation ....
Steven D. Whitehead and Dana H. Ballard. Learning to perceive and act by trial and error. Machine Learning, 7:45-83, 1991.
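The semi-uniform distributed exploration rule described in the excerpt above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the citing paper's implementation: the function name `semi_uniform_action`, the parameter name `p_best` (standing in for the large selection probability), and the list of utility estimates are all hypothetical.

```python
import random


def semi_uniform_action(utilities, p_best, rng=random):
    """Return an action index under semi-uniform distributed exploration.

    utilities: estimated utility per action (hypothetical input format).
    p_best:    probability of exploiting the best-looking action (assumed name).
    rng:       any object offering random() and randrange(), e.g. random.Random.
    """
    if not 0.0 <= p_best <= 1.0:
        raise ValueError("p_best must lie in [0, 1]")
    if rng.random() < p_best:
        # Exploit: pick the action with the highest estimated utility.
        return max(range(len(utilities)), key=lambda a: utilities[a])
    # Explore: pick any action uniformly at random.
    # Note that this branch can also return the best action.
    return rng.randrange(len(utilities))
```

Because the random branch may also land on the greedy action, the greedy action is chosen slightly more often than `p_best` overall; with three actions and `p_best = 0.8` the rate is roughly 0.8 + 0.2/3.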
....infrared reflections measured by the agent. Outside of some mechanism such as beacon triangulation, or a state transition memory to ensure that one location cannot be mistaken for another, there is always a risk of perceptual aliasing. 2.8 Description of Perceptual Aliasing. Ballard and Whitehead [25] showed that using an agent's observational inputs to define states introduces the possibility of two different situations appearing the same to the agent. This is quite an issue when doing experiments with discrete-valued states, as the perceptual resolution suffers due to ....
....POMDP algorithms in which a probabilistic estimate is made of the state an agent occupies. The drawback of this type of technique is that it is computationally very expensive, and reliant on problem information regarding the exact number of hidden states [16]. In addition, Whitehead and Ballard [25] suggest a method of accessing additional sensory data; however, in practice their method is somewhat impractical since typically an agent is either already using all of its sensors, or is computationally unable to handle the extra data. The control policy followed by algorithms when solving POMDPs ....
S. D. Whitehead and D. H. Ballard, "Learning to Perceive and Act by Trial and Error," Machine Learning, vol. 7, pp. 45-83, 1991.
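As a rough illustration of the perceptual aliasing described in the excerpts above, the following Python sketch groups world states by their discretized sensor reading. Everything in it is a hypothetical example rather than material from the cited work: the state names, the readings, the bin widths, and the helper names `discretize` and `aliased_groups`.

```python
from collections import defaultdict


def discretize(reading, bin_width):
    """Map a real-valued sensor reading to an integer interval index."""
    return int(reading // bin_width)


def aliased_groups(readings_by_state, bin_width):
    """Return the observations shared by more than one world state."""
    groups = defaultdict(list)
    for state, reading in readings_by_state.items():
        groups[discretize(reading, bin_width)].append(state)
    # Keep only observations that fail to distinguish two or more states.
    return {obs: states for obs, states in groups.items() if len(states) > 1}


# Hypothetical sensor readings for three distinct situations.
readings = {"near_wall": 0.12, "doorway": 0.18, "open_floor": 0.55}

coarse = aliased_groups(readings, bin_width=0.25)  # two states share interval 0
fine = aliased_groups(readings, bin_width=0.05)    # every state gets its own interval
```

With the coarse bins, `near_wall` and `doorway` produce the same observation, so an agent that defines its state from that observation cannot tell them apart; the finer bins separate them at the cost of more distinct observations.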
.... By evolving a population of agents for their ability to solve this task 2, however, one can easily see that there are several other solutions that are not affected by the aliasing problem. 1 The notion of sensory aliasing is related to the notion of perceptual aliasing introduced by Whitehead and Ballard (1991). Perceptual aliasing was introduced to indicate an internal state corresponding to different external states that require different motor answers in different circumstances. 2 Evolving individuals were allowed to live for 100 epochs with each epoch consisting of 200 actions. Each 5 and ....
Whitehead S.D. and Ballard D.H. (1991) Learning to perceive and act by trial and error. Machine Learning 7:45-83.
....6.1.2 New York Driving. Here we present a variant of McCallum's New York driving task, which constitutes an appropriate test domain [35] with roughly 21000 environment states and 2500 observations. The agent's actions and perception are based on visual routines [59, 60] and deictic representation [1, 63]. An opportunity to compare the results of learning to human performance on a simulated environment is one of the attractions here.
Action | Description
gaze forward left | Look at closest car in lane to the left
gaze forward centre | Look at closest car in agent's lane
gaze forward right | Look at ....
Steven D. Whitehead and Dana H. Ballard. Learning to perceive and act by trial and error. Machine Learning, 7(1):45-83, 1991.
....of using predictability, or a lack thereof, as the driving force behind the creation and refinement of knowledge structures has been applied in a variety of contexts. Drescher (1991) and Shen (1993) used uncertainty in action outcomes to trigger refinement of action models, and McCallum (1995) and Whitehead and Ballard (1991) used uncertainty in predicted reward in a reinforcement learning setting to refine action policies. Virtually all of the work in this vein is based on two key assumptions. First, an assumption is made that the world is in principle deterministic; that given enough knowledge, outcomes can be ....
Whitehead, S. D., and Ballard, D. H. 1991. Learning to perceive and act by trial and error. Machine Learning 7:45-83.
....in the more general question of how learning to solve a particular task can speed learning of other tasks in the same domain. This kind of toy domain has been used in a huge amount of widely known planning research. It has also been used in some work on learning. Particularly relevant is work by (Whitehead & Ballard 1991) on partially observable blocks worlds and by (Baum & Durdanovic 2000) and by (Dzeroski, de Raedt, & Blockeel 1998) on learning truly general block stacking behaviors. ########## ### ###### Most work on reinforcement learning assumes that the agent has complete, perfect perception of its ....
Whitehead, S. D., and Ballard, D. H. 1991. Learning to perceive and act by trial and error. Machine Learning 7(1):45-83.
.... forms of dynamic programming than are conventionally used (e.g. [9,39,42]). Empirical (simulation) results using reinforcement learning combined with neural networks or other associative memory structures have shown robust efficient learning on a variety of nonlinear control problems (e.g. [5,13,19,20,24,25,29,32,38,43]). An overview of the role of reinforcement learning within neural network approaches is provided by [1]. For a readily accessible example of reinforcement learning using neural networks the reader is referred to Anderson's article on the inverted pendulum problem [43]. Studies of ....
Whitehead, S.D., Ballard, D.H. (in press) Learning to perceive and act by trial and error. Machine Learning.
....representation of the input [7]. When interfacing to the real world by real-valued sensors, the selection of the interval granularity becomes a relevant issue. A fine-grained classification translates into a large search space, and a coarse-grained classification tends to induce perceptual aliasing [12]. We discuss some criteria to face this trade-off in section 2, where we also discuss the relationships between the interval-based representation partitioning the sensor data space, and the traditional gridworld, where the partition is done on spatial dimensions, strictly related to the con ....
....subpopulation, and the lower is the accuracy of the LCS to learn the correct action to take from a sensorial cluster. This is another motivation to keep the cluster (and the corresponding state) as small as possible. Let us notice that cluster aliasing is slightly different from perceptual aliasing [12], defined as having the same internal representation for different states. Perceptual aliasing concerns the representation of the percepts, cluster aliasing concerns the fact that actions done in a state do not bring out from it. In perceptual aliasing the problem is that there is the possibility to ....
S. D. Whitehead and D. H. Ballard. Learning to perceive and act by trial and error. Machine Learning, 7:45-83, 1991.