C. G. Atkeson, S. A. Schaal, and Andrew W. Moore. Locally weighted learning. AI Review, 11, 1997.
....classification. In these problems, a clear cut, supervised criterion classification error is available and can be optimized for. See also [11] for a different way of supervising clustering. This literature is too wide to survey here, but some relevant examples include [10, 5, 3, 6] and [1] also gives a good overview of some of this work. While these methods often learn good metrics for classification, it is less clear whether they can be used to learn good, general metrics for other algorithms such as K means, particularly if the information available is less structured than the ....
C. Atkeson, A. Moore, and S. Schaal. Locally weighted learning. AI Review, 1996.
....and Ivo Düntsch, Günther Gediga Brock University St. Catharines, Ontario, L2S 3A1, Canada duentsch gediga cosc.brocku.ca 1 Introduction k Nearest Neighbours (kNN) is a popular method for classification. It is simple but effective in many cases [Hand et al. 2001, Han and Kamber, 2000, Atkeson et al. 1997]. For a data record to be classified, its k nearest neighbours are retrieved, which form a neighbourhood of the record. Majority voting with or without weighting among the data records in the neighbourhood is used to decide the classification for the record. To apply kNN we need to choose a value for k and a ....
Atkeson, C. G., Moore, A. W., and Schaal, S. (1997). Locally weighted learning. Artificial Intelligence Review, 11:11--73.
....presented in Figure 7.1. It may be possible to create an auxiliary classifier, which we call a bridging agent, from DBA that can predict the value of the A n 1 attribute. To be more specific, by deploying regression methods (e.g. CART [Breiman et al. 1984], locally weighted regression [Atkeson, Schaal, Moore, 1999], linear regression fit [Myers, 1986], MARS [Friedman, 1991]) for continuous attributes and machine learning classification algorithms for categorical attributes, data site A can compute one or more auxiliary classifier agents C Aj # that predict the value of attribute A n 1 based on the ....
Atkeson, C. G.; Schaal, S. A.; and Moore, A. W. 1999. Locally weighted learning. AI Review, In press.
....dimensions; and, as the number of dimensions increases, finding similarity matches in high dimensional space becomes extremely difficult because of troubles measuring distance. Weights can be applied to each dimension in cases where some dimensions are considered more or less important than others [54]. This equates to lengthening or shortening the axes in Euclidean space. Moore and Lee [55] provide strategies for eliminating the least relevant dimensions from the space. In particular, they provide efficient ways to repeatedly leave one dimension out and then examine the results in order to ....
C. Atkeson, S. Schaal and A. Moore. Locally weighted learning. AI Review. 1997.
....learning is complete avoids the theoretical problem generated by altering the state occupancy probabilities. 2.11 Locally Weighted Regression for Approximation. Locally weighted regression (LWR) is well suited for generating a smoothed line from data points having a non uniform distribution [30, 31, 32, 28]. Kernel regression is not as well suited for problems having an irregular distribution of data points. K nearest neighbor methods are also unsuited (K 1) due to the irregularity in the data density in the Q table. Neural networks, although having produced good results in some applications ....
....becoming extremely large near Q table data points, as well as not having broad shape such as is found with a Gaussian distribution. This is somewhat contrary to what is reported by Atkeson et al. [32], who found that the choice of the weighting function was not generally critical to performance, though did note the existence of some exceptions. The high degree of roughness in the value function seems to demand a narrow weighting function in order to diminish the influence of irrelevant data .... [Figure 4-4. Performance of LWR compared to standard Q learning for the arena shown in Figure 4-1.]
C. G. Atkeson, A. W. Moore, and S. Schaal, "Locally Weighted Learning," Artificial Intelligence Review, 11:(1-5), pp. 11-73, 1997.
....$D(\mathbf{x}, \mathbf{y}) = \sqrt{\sum_{i=1}^{n} w_i (x_i - y_i)^2}$ (5). These weights enable the neighborhood to elongate less important feature dimensions, and, at the same time, to constrict the most influential ones. Note that the technique is query based because weightings depend on the query [1, 3]. 5 Local Flexible Metric Classification based on SVMs. Since support vectors are located closely to the boundary surface, we can estimate how close a query point $q$ is to the boundary by computing its distance from the closest support vector: $B_q = \min_{s_i \in SV} \|q - s_i\|$ (6) where $s_i$ ....
C. Atkeson, A.W. Moore, and S. Schaal, "Locally Weighted Learning", Artificial Intelligence Review. 11:11-73. 1997.
....(x,L,R) for test where T, D, L, and R are the classification labels corresponding respectively to the top, bottom, left, and right edge of the cars. 2.2 Estimating P(L|x) using Kernel Regression. We have used kernel regression, a Memory Based Learning technique (abbreviated MBL, see e.g. [1]) to estimate the posterior probability of each of the four classifications from the training data. Kernel regression (KR), or locally weighted averaging, follows a very simple algorithm. If we again restrict our attention to the left edge L case, KR proceeds as follows: • Collect a large ....
C. G. Atkeson, S. A. Schaal, and A. W. Moore. Locally weighted learning. AI Review, 11:11--73, 1997.
....for which this is also the case in the regression setting. We compare naive Bayes to two state of the art methods for numeric prediction: locally weighted linear regression, and model trees. Locally weighted linear regression (LWR) is a combination of instance based learning and linear regression (Atkeson, Moore & Schaal, 1997). Instead of performing a linear regression on the full, unweighted dataset, it performs a weighted linear regression, weighting the training instances according to their distance to the test instance at hand. In other words, it performs a local linear regression by giving those training instances ....
....be done for each new test instance, which makes the method computationally quite expensive. However, it also makes it highly flexible, and enables it to approximate non linear target functions. There are many different ways of implementing the weighting scheme for locally weighted linear regression (Atkeson, Moore & Schaal, 1997). A common approach, which we also employ here, is to weight the training instances according to a triangular kernel centered at the test instance, setting the kernel width to the distance of the test instance's kth nearest neighbour. The best value for k is determined using ten fold ....
Atkeson, C. G., Moore, A. W. & Schaal, S. (1997). Locally weighted learning. Artificial Intelligence Review, 11, 11-73.
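The weighting scheme described in the context above (a triangular kernel centred at the test instance, with its width set to the distance of the kth nearest neighbour) can be illustrated with a minimal sketch. This is not the citing authors' implementation: the function name `lwr_predict`, the restriction to one-dimensional inputs, and the fallback to a weighted mean when only one point carries weight are all assumptions made for illustration.

```python
def lwr_predict(xs, ys, query, k):
    """Locally weighted linear regression for 1-D inputs (illustrative sketch).

    xs, ys : training inputs and targets (sequences of floats)
    query  : test instance at which to predict
    k      : the kernel width is the distance to the k-th nearest neighbour
    """
    # Kernel width: distance from the query to its k-th nearest training point.
    dists = sorted(abs(x - query) for x in xs)
    width = dists[k - 1]
    if width == 0.0:
        raise ValueError("kernel width is zero; choose a larger k")

    # Triangular kernel: weight falls linearly from 1 at the query
    # to 0 at distance `width` (so the k-th neighbour itself gets weight 0).
    ws = [max(0.0, 1.0 - abs(x - query) / width) for x in xs]
    sw = sum(ws)
    if sw == 0.0:
        raise ValueError("no training point has positive weight; choose a larger k")

    # Weighted least squares for a line, written in closed form.
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    # Assumption: fall back to the weighted mean if the slope is undefined.
    slope = sxy / sxx if sxx > 0.0 else 0.0
    return my + slope * (query - mx)


xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 for x in xs]
print(lwr_predict(xs, ys, 4.3, 5))
```

Because the fit is recomputed for every query, the cost per prediction grows with the size of the training set, which is the computational expense the context mentions.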
....distance measure. Thus, the good classification results of a NN classifier in Table 1 are an indication for an appropriate choice and sufficiently correct computation of the distance measure. The best results for the mutagenesis dataset are obtained by using the Variable Kernel Method described in [23, 7] and its extension from [17] where the number of instances used for classification is reduced. 4.2. Learning of Prototypes Sometimes we want not only to distinguish between mutagenic and nonmutagenic compounds but also to find common structural properties of mutagenic compounds or substructures ....
A. W. Moore C. G. Atkeson and S.Schaal. Locally weighted learning. AI review, 11, 1996.
....approach based on local fit, on the other hand, aims at predicting the response at a point by utilizing statistical properties of the nearby data. Examples of such approach are local regression in statistics (Cleveland & Devlin, 1988) and locally weighted learning in artificial intelligence (Atkeson, Moore, Schaal, 1997). Local models with small number of parameters are usually sufficient to describe function variations in small neighborhoods. However, they generally fail to provide good estimates in high dimensional situations, since as the dimensions increase the bandwidth in each coordinate should be selected ....
Atkeson, C. G., Moore, A. W., & Schaal, S. (1997). Locally weighted learning. Artificial Intelligence Review, 11, 11-73.
....the network weights will converge to a particular value in the steady state. If that input distribution shifts, the parameters will once again shift. It can be shown that after this shift, the error on the previous examples will increase. An alternative approach is instance based learning ([Atkeson et al. 1997], also known as memory based or lazy learning) that does not have this problem of forgetting because all examples are kept in memory. For every experience, the example is recorded and predictions and generalizations are generated in real time in response to query. Unlike parametric models such as ....
....$y_i, i \in [1, n]$ of the neighbors: $\sum w_i y_i / \sum w_i$. Locally weighted regression (LWR) is similar to kernel regression. If the data is distributed on a regular grid from any boundary, LWR and kernel regression are equivalent. For irregular data distributions, LWR tends to be more accurate [Atkeson et al. 1997]. LWR fits a local model to nearby weighted data. The models are linear around the query point with respect to some unknown parameters. This unknown parameter vector, $b \in \mathbb{R}^k$, is found by minimizing the locally weighted sum of the squared residuals: $E = \frac{1}{2} \sum_{i=1}^{n} w_i (x_i^T b - y_i)$ ....
[Article contains additional citation context not shown here]
C. G. Atkeson, S. A. Schaal, and Andrew W. Moore. Locally weighted learning. AI Review, 11:11-73, 1997.
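The kernel regression estimate quoted in the context above, a weighted average $\sum w_i y_i / \sum w_i$ of the neighbours' outputs, can be sketched as follows. The Gaussian weighting function, the `bandwidth` parameter, and the name `kernel_regression` are assumptions chosen for illustration; the quoted passage does not say which kernel its authors used.

```python
import math


def kernel_regression(xs, ys, query, bandwidth):
    """Kernel regression at `query`: the weighted average sum(w*y) / sum(w).

    Assumption: Gaussian weights w_i = exp(-0.5 * ((x_i - query) / bandwidth)^2).
    """
    if bandwidth <= 0.0:
        raise ValueError("bandwidth must be positive")
    ws = [math.exp(-0.5 * ((x - query) / bandwidth) ** 2) for x in xs]
    return sum(w * y for w, y in zip(ws, ys)) / sum(ws)


# Regular grid, query at an interior grid point.
xs = [float(i) for i in range(-3, 4)]
ys = [2.0 * x + 1.0 for x in xs]
print(kernel_regression(xs, ys, 0.0, 1.0))
```

On a regular grid with the query at an interior grid point, the weights are symmetric about the query, so the weighted average of a linear target coincides with a local linear fit there. This is one way to read the context's remark that the two methods agree on regular grids and differ for irregular data.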
....$D(\mathbf{x}, \mathbf{y}) = \sqrt{\sum_{i=1}^{q} w_i (x_i - y_i)^2}$ (8). These weights enable the neighborhood to elongate less important feature dimensions, and, at the same time, to constrict the most influential ones. Note that the technique is query based because weightings depend on the query [1, 2]. A justification for (4) and, hence, (5) may go as follows. Suppose that the value of $r_i(z)$ is small, which implies a large weight along dimension $i$. Consequently, the neighborhood gets shrunk along that direction. This, in turn, penalizes points along dimension $i$ that are moving away from ....
C. Atkeson, A.W. Moore, and S. Schaal, "Locally Weighted Learning," Artificial Intelligence Review. 11:11-73. 1997.
.... regression approach, has to take a set of decisions related to the model structure (e.g. the number of neighbors, the kernel function, the parametric family, the distance metric). In local learning literature several methods have been proposed to automatically select the adequate configuration [2]. In previous work [4, 3] we studied the PRESS statistic which is a simple, well founded and economical result of linear statistical analysis to perform leave one out cross validation and to assess the performance in generalization of local linear models. Here, we propose a model selection ....
C. G. Atkeson, A. W. Moore, and S. Schaal. Locally weighted learning. Artificial Intelligence Review, 11(1--5):11--73, 1997.
....distance computation $D(\mathbf{x}, \mathbf{y}) = \sqrt{\sum_{i=1}^{q} w_i (x_i - y_i)^2}$. These weights enable the neighborhood to elongate less important feature dimensions, and, at the same time, to constrict the most influential ones. Note that the technique is query based because weightings depend on the query [1]. 3 Estimation. Since both $\Pr(j|\mathbf{x})$ and $\Pr(j|x_i = z_i)$ in (3) are unknown, we must estimate them using the training data $\{\mathbf{x}_n, y_n\}_{n=1}^{N}$ in order for the relevance measure (3) to be useful in practice. Here $y_n \in \{1, \ldots, J\}$. The quantity $\Pr(j|\mathbf{x})$ is estimated by considering a ....
Atkeson, C., Moore, A.W., and Schaal, S. (1997). "Locally Weighted Learning," AI Review. 11:11-73.
....distance computation $D(\mathbf{x}, \mathbf{y}) = \sqrt{\sum_{i=1}^{q} w_i (x_i - y_i)^2}$ (6). These weights enable the neighborhood to elongate less important feature dimensions, and to constrict the most influential ones. Note that the technique is query based because weightings depend on the query [1]. 3 Estimation. Since both $\Pr(j|\mathbf{z})$ and $\Pr(j|x_i = z_i)$ in (3) are unknown, we must estimate them using the training data $\{\mathbf{x}_n, y_n\}_{n=1}^{N}$ in order for the relevance measure (4) to be useful in practice. Here $y_n \in \{1, \ldots, J\}$. The quantity $\Pr(j|\mathbf{z})$ is estimated by considering a ....
C. Atkeson, A.W. Moore, and S. Schaal, "Locally Weighted Learning," AI Review. 11:11-73. 1997.
No context found.
Atkeson, C. G., Moore, A. W., & Schaal, S. (1997). Locally weighted learning. Artificial Intelligence Review, 11, 11-- 73.
No context found.
Atkeson, C., Moore, A. & Schaal, S. (1997). Locally weighted learning. Artif. Intel. Rev., 11, 76--113.
No context found.
Atkeson, C. G., Moore, A. W., & Schaal, S. (1997a). "Locally weighted learning." Artificial Intelligence Review, 11, 1-5, pp. 11-73.
No context found.
Atkeson, C., Moore, A., & Schaal, S. (1997). Locally weighted learning. Artificial Intelligence Review, 11(4), 76--113.
No context found.
Atkeson, C.G., Moore, A.W. & Schaal, S. (in press), "Locally weighted learning", Artificial Intelligence Review.
No context found.
Atkeson, C., Moore, A. & Schaal, S. (1997). Locally weighted learning. Artificial Intelligence Review, 11, 76-113.
No context found.
Atkeson, C., Moore, A., & Schaal, S. (1997). Locally weighted learning. Artificial Intelligence Review, 11 (4), 76--113.
....to this query point. In the simplest form, this approach becomes a nearest neighbor method where the predicted output equals the output of the nearest data point seen so far [6]. More sophisticated methods use several neighbors to fit a simple parametric model to smoothly interpolate predictions [3]. Function approximation with these nonparametric methods is in the spirit of a Taylor series expansion at the query point. An advantage is that no commitment needs to be made as to how large the learning system has to be; the local parametric model is calculated for every query point from ....
....from such low dimensional distributions, they, as mentioned above already, quickly become computationally infeasible and also tend to be numerically less robust. This effect is due to only exploiting the low dimensional distributions implicitly by a regularization technique called Ridge Regression [3]. However, if we can exploit the low dimensional distributions explicitly by performing a local dimensionality reduction of the data before we apply our nonparametric learning techniques, we should be able to extend nonparametric learning and its favorable incremental learning properties to high ....
[Article contains additional citation context not shown here]
Atkeson, C.G., Moore, A.W. & Schaal, S. (in press), "Locally weighted learning", Artificial Intelligence Review.
....30 Degree of freedom humanoid robot in our laboratory. Over the past years, we have been developing statistical learning methods that are particularly well suited for on line learning in robotics and human augmentation. These techniques are generally classified as locally weighted learning (LWL) [7], [8] as they emphasize statistical inference solely based on a carefully selected set of spatial neighbors around a point of interest. In the most recent advancements, we succeeded in combining locally weighted learning ([9], [10], [11]) with probabilistic learning to achieve a novel form of ....
....$w_k = \exp\left(-\frac{1}{2} (x - c_k)^T D_k (x - c_k)\right)$ (1) where $c_k$ is the center of the $k$th linear model, and $D_k$ corresponds to a positive semi definite distance metric that determines the size and shape of region of validity of the linear model. Other kernel functions are possible ([7]) but add only minor differences to the quality of function fitting. The most straightforward development of our probabilistic LWL algorithm is by reviewing memory based Locally Weighted Regression (LWR) [14] as summarized in the following pseudo code: The LWR Algorithm: x x : W xxDxx ....
C. G. Atkeson, A. W. Moore, and S. Schaal, "Locally weighted learning," Artificial Intelligence Review, vol. 11, pp. 11-73, 1997.
....the algorithmic details underlying the software. Alternatively, the Locally Weighted Learning student may choose to read this without using the software at all (although we believe that using Vizier is an excellent way to learn about LWL). Additional information and examples of LWL can be found in [1]. 1.1 The Vizier 1.0 User Interface. Throughout the tutorial you will be requested to perform some operations using Vizier. These will be typeset in a different font as follows: File Open j1.mbl This is shorthand notation for: Go to the File menu. Select the Open option. Choose the file ....
C. Atkeson, S. Schaal, and A. Moore. Locally weighted learning. AI Review, 1995.
....advanced in form of Support Vector Machines, excels in classification and finite batch learning problems, but has yet to show compelling performance in regression and incremental learning. In contrast, techniques from nonparametric regression, in particular the methods of locally weighted learning [2], have recently advanced to meet all the requirements of real time learning in high dimensional spaces. In this paper, we will describe how one of the most highly developed algorithms amongst them, Locally Weighted Projection Regression (LWPR), accomplishes learning of a highly nonlinear model for ....
....[8] of our robot instead of the global 21 input dimensions. This property will be a key element in our approach to learning such models. 3 Locally Weighted Projection Regression. The core concept of our learning approach is to approximate nonlinear functions by means of piecewise linear models [2]. The learning system automatically determines the appropriate number of local models, the parameters of the hyperplane in each model, and also the region of validity, called receptive field (RF), of each of the model, usually formalized as a Gaussian kernel: $w_k = \exp(-\frac{1}{2} (x - c_k)^T D_k$ ....
Atkeson, C., Moore, A. & Schaal, S. (1997). Locally weighted learning. Artif. Intel. Rev., 11, 76--113.
....advanced in form of Support Vector Machines, excels in classification and finite batch learning problems, but has yet to show compelling performance in regression and incremental learning. In contrast, techniques from nonparametric regression, in particular the methods of locally weighted learning [2], have recently advanced to meet all the requirements of real time learning in high dimensional spaces. In this paper, we will describe how one of the most highly developed algorithms amongst them, Locally Weighted Projection Regression (LWPR), accomplishes learning of a highly nonlinear model for ....
....[7] of our robot instead of the global 21 input dimensions. This property will be a key element in our approach to learning such models. 3 Locally Weighted Projection Regression. The core concept of our learning approach is to approximate nonlinear functions by means of piecewise linear models [2]. The learning system automatically determines the appropriate number of local models, the parameters of the hyperplane in each model, and also the region of validity, called receptive field (RF), of each of the model, usually formalized as a Gaussian kernel: $w_k = \exp(-\frac{1}{2} (x - c_k$ ....
Atkeson, C., Moore, A. & Schaal, S. (1997). Locally weighted learning. Artif. Intel. Rev., 11, 76--113.
....order polynomials, along the projection, which greatly simplifies the function approximation problem. Local projection regression can thus borrow most of its statistical properties from the well established methods of locally weighted learning and nonparametric regression (Hastie & Loader, 1993; Atkeson, Moore, and Schaal, 1997). Counterintuitive to the curse of dimensionality (Scott, 1992), local regression methods can work successfully in high dimensional spaces (Vijayakumar & Schaal, 1998) as we will empirically demonstrate below. The justification for this statement comes from empirical investigations of movement ....
Atkeson, C.G., Moore, A.W., & Schaal, S. (1997). Locally weighted learning. Artificial Intelligence Review 11:11-73.
No context found.
C. G. Atkeson, S. A. Schaal, and Andrew W. Moore. Locally weighted learning. AI Review, 11, 1997.
No context found.
C. Atkeson S. Schaal and A. Moore. Locally weighted learning. AI Review, 11, 1997.
No context found.
Atkeson, C. G.; Moore, A. W.; and Schaal, S. 1997. Locally weighted learning. Artificial Intelligence Review 11(1-5):11--73.
No context found.
ATKESON, C. G., MOORE, A. W., AND SCHAAL, S. 1997a. Locally weighted learning. In Artificial Intelligence Review. 11(15):11-73.
No context found.
C. Atkeson, A. Moore, and S. Schaal. Locally weighted learning. AI Review, 1996.
No context found.
Atkeson, C. G.; Moore, A. W.; and Schaal, S. 1997. Locally weighted learning. Artificial Intelligence Review 11(1-5):11--73.
No context found.
Atkeson, C. G.; Moore, A. W.; and Schaal, S. 1997. Locally weighted learning. Artificial Intelligence Review 11(1-5):11--73.
No context found.
Atkeson, C. G., Moore, A. W., and Schaal, S., "Locally Weighted Learning," Artificial Intelligence Review, Vol. 11, No. 1-5, 1997, pp. 11--73.
No context found.
Atkeson, C. G.; Moore, A. W.; and Schaal, S. 1997. Locally weighted learning. Artificial Intelligence Review 11(1-5):11--73.
No context found.
C. G. Atkeson, S. A. Schaal, and A. W. Moore, "Locally weighted learning," AI Rev., vol. 11, pp. 11--73, 1997.
No context found.
C. Atkeson, A. Moore, and S. Schaal. Locally weighted learning. AI Review, 1996.
No context found.
C. Atkeson, A. Moore, and S. Schaal. Locally weighted learning. AI Review, 1996.
No context found.
C.G. Atkeson, A.W. Moore, and S. Schaal, `Locally weighted learning', Artificial Intelligence Review, 11, 11--73, (1997).
No context found.
C. Atkeson, A.W. Moore, and S. Schaal, "Locally Weighted Learning," AI Review. 11:11-73. 1997.
No context found.
C. Atkeson, A. Moore, and S. Schaal, "Locally weighted learning," AI Review, vol. 11, pp. 11--73, April 1997.
No context found.
C. Atkeson, A. Moore and S. Schaal. Locally weighted learning. AI Review, 11: 11--73, 1997.
No context found.
C. Atkeson, A. Moore, and S. Schaal, "Locally weighted learning," AI Review, vol. 11, pp. 11--73, April 1997.
No context found.
C. Atkeson, A. Moore, and S. Schaal. Locally weighted learning. AI Review, 1996.
No context found.
Atkeson C., Moore A., Schaal S. (1996). Locally Weighted Learning. Artificial Intelligence Review.
No context found.
C. Atkeson S. Schaal and A. Moore. Locally weighted learning. AI Review, 11, 1997.
No context found.
Christopher G. Atkeson, Andrew W. Moore, and Stefan Schaal, "Locally weighted learning," Artificial Intelligence Review, vol. 11, no. 1--5, pp. 11--73, Feb. 1997.
No context found.
Atkeson, C. G., Moore, A. W., & Schaal, S. "Locally Weighted Learning." Artificial Intelligence Review, 11:11-73, 1997.