Results 1-10 of 1,087
Learning to predict by the methods of temporal differences
Machine Learning, 1988
Cited by 1501 (56 self)
This article introduces a class of incremental learning procedures specialized for prediction – that is, for using past experience with an incompletely known system to predict its future behavior. Whereas conventional prediction-learning methods assign credit by means of the difference between predicted and actual outcomes, the new methods assign credit by means of the difference between temporally successive predictions. Although such temporal-difference methods have been used in Samuel's checker player, Holland's bucket brigade, and the author's Adaptive Heuristic Critic, they have remained poorly understood. Here we prove their convergence and optimality for special cases and relate them to supervised-learning methods. For most real-world prediction problems, temporal-difference methods require less memory and less peak computation than conventional methods and they produce more accurate predictions. We argue that most problems to which supervised learning is currently applied are really prediction problems of the sort to which temporal-difference methods can be applied to advantage.
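The credit-assignment idea in this abstract, updating each prediction toward the next prediction rather than toward the final outcome, can be illustrated with a minimal tabular TD(0) sketch. The five-state random walk, the step size `alpha`, the episode count, and the function name `td0_random_walk` are all assumptions chosen for illustration; this is not presented as the article's own procedure or experiment.

```python
import random


def td0_random_walk(episodes=20000, alpha=0.01, seed=0):
    """Tabular TD(0) on an assumed 5-state random walk.

    Episodes start in the middle state and move left or right with equal
    probability. Leaving on the right pays reward 1, leaving on the left
    pays 0, so the true value of state i is (i + 1) / 6.
    """
    rng = random.Random(seed)
    n_states = 5
    values = [0.5] * n_states  # initial predictions
    for _ in range(episodes):
        state = n_states // 2
        while True:
            next_state = state + rng.choice((-1, 1))
            if next_state < 0:            # left terminal
                reward, next_value, done = 0.0, 0.0, True
            elif next_state >= n_states:  # right terminal
                reward, next_value, done = 1.0, 0.0, True
            else:
                reward, next_value, done = 0.0, values[next_state], False
            # Move the prediction toward the temporally successive prediction.
            values[state] += alpha * (reward + next_value - values[state])
            if done:
                break
            state = next_state
    return values


estimates = td0_random_walk()
true_values = [(i + 1) / 6 for i in range(5)]
```

Each update needs only the current and the next prediction, so no episode history has to be stored. That is one way to read the abstract's claim about lower memory and peak computation.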
Markov chains for exploring posterior distributions
Annals of Statistics, 1994
Cited by 1122 (6 self)
Synchronization and linearity: an algebra for discrete event systems
2001
Cited by 369 (11 self)
The first edition of this book was published in 1992 by Wiley (ISBN 0 471 93609 X). Since this book is now out of print, and to answer the request of several colleagues, the authors have decided to make it available freely on the Web, while retaining the copyright, for the benefit of the scientific community. Copyright Statement: This electronic document is in PDF format. One needs Acrobat Reader (available freely for most platforms from the Adobe web site) to benefit from the full interactive machinery: using the package hyperref by Sebastian Rahtz, the table of contents and all LaTeX cross-references are automatically converted into clickable hyperlinks, bookmarks are generated automatically, etc. So, do not hesitate to click on references to equation or section numbers, on items of the table of contents and of the index, etc. One may freely use and print this document for one’s own purpose or even distribute it freely, but not commercially, provided it is distributed in its entirety and without modifications, including this preface and copyright statement. Any use of the contents should be acknowledged according to the standard scientific practice. The
A Logic for Reasoning about Time and Reliability
Formal Aspects of Computing, 1994
Cited by 360 (1 self)
We present a logic for stating properties such as, "after a request for service there is at least a 98% probability that the service will be carried out within 2 seconds". The logic extends the temporal logic CTL by Emerson, Clarke and Sistla with time and probabilities. Formulas are interpreted over discrete-time Markov chains. We give algorithms for checking that a given Markov chain satisfies a formula in the logic. The algorithms require a polynomial number of arithmetic operations, in size of both the formula and the Markov chain. A simple example is inc...
Footnote: This research report is a revised and extended version of a paper that has appeared under the title "A Framework for Reasoning about Time and Reliability" in the Proceedings of the 10th IEEE Real-Time Systems Symposium, Santa Monica CA, December 1989. This work was partially supported by the Swedish Board for Technical Development (STU) as part of Esprit BRA Project SPEC, and by the Swedish Telecommunication Administration.
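A property of the kind quoted in this abstract ("at least a 98% probability ... within 2 seconds") can be checked on a small discrete-time Markov chain by iterating a step-bounded reachability recurrence. The sketch below is a standard textbook-style computation under stated assumptions, not necessarily the algorithm given in the paper: the two-state chain, the reading of one step as one second, and the name `bounded_reach` are hypothetical.

```python
def bounded_reach(P, goal, steps):
    """Probability, per start state, of reaching a goal state within `steps`.

    P is a row-stochastic transition matrix given as a list of lists.
    Goal states count as reached immediately; every other state averages
    the previous iterate over its successors.
    """
    n = len(P)
    prob = [1.0 if s in goal else 0.0 for s in range(n)]
    for _ in range(steps):
        prob = [
            1.0 if s in goal else sum(P[s][t] * prob[t] for t in range(n))
            for s in range(n)
        ]
    return prob


# Assumed toy chain: state 0 = request pending, state 1 = service done.
# Each step the request is served with probability 0.9.
P = [
    [0.1, 0.9],
    [0.0, 1.0],
]
p_within_2 = bounded_reach(P, goal={1}, steps=2)[0]
property_holds = p_within_2 >= 0.98
```

For this chain the probability of service within two steps is 0.9 + 0.1 × 0.9 = 0.99, so the 98% bound is met. Each iteration costs one matrix-vector product.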
Linear least-squares algorithms for temporal difference learning
Machine Learning, 1996
Cited by 257 (1 self)
We introduce two new temporal difference (TD) algorithms based on the theory of linear least-squares function approximation. We define an algorithm we call Least-Squares TD (LS TD) for which we prove probability-one convergence when it is used with a function approximator linear in the adjustable parameters. We then define a recursive version of this algorithm, Recursive Least-Squares TD (RLS TD). Although these new TD algorithms require more computation per time-step than do Sutton's TD(λ) algorithms, they are more efficient in a statistical sense because they extract more information from training experiences. We describe a simulation experiment showing the substantial improvement in learning rate achieved by RLS TD in an example Markov prediction problem. To quantify this improvement, we introduce the TD error variance of a Markov chain, σ_TD, and experimentally conclude that the convergence rate of a TD algorithm depends linearly on σ_TD. In addition to converging more rapidly, LS TD and RLS TD do not have control parameters, such as a learning rate parameter, thus eliminating the possibility of achieving poor performance by an unlucky choice of parameters.
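The least-squares idea can be sketched in batch form: accumulate a matrix A and a vector b from observed transitions and solve A θ = b, with no learning rate involved. The code below is a minimal illustration under assumptions, not the paper's implementation: one-hot features, a three-state deterministic chain, a discount of 0.9, and the helper names `solve` and `lstd` are all hypothetical. It is also not the recursive RLS TD variant.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def lstd(transitions, features, n_features, gamma):
    """Batch least-squares TD: A = sum phi (phi - gamma phi')^T, b = sum phi r."""
    A = [[0.0] * n_features for _ in range(n_features)]
    b = [0.0] * n_features
    for state, reward, next_state in transitions:
        phi = features(state)
        # A terminal successor (None) contributes a zero feature vector.
        phi_next = features(next_state) if next_state is not None else [0.0] * n_features
        for i in range(n_features):
            b[i] += phi[i] * reward
            for j in range(n_features):
                A[i][j] += phi[i] * (phi[j] - gamma * phi_next[j])
    return solve(A, b)  # assumes A is non-singular (every feature observed)


def one_hot(state, n=3):
    return [1.0 if i == state else 0.0 for i in range(n)]


# Assumed chain 0 -> 1 -> 2 -> terminal, with reward 1 on the last step.
transitions = [(0, 0.0, 1), (1, 0.0, 2), (2, 1.0, None)]
theta = lstd(transitions, one_hot, n_features=3, gamma=0.9)
```

A single pass over the three transitions recovers the discounted values 0.81, 0.9 and 1.0 exactly. This is the sense in which a least-squares method extracts more from each experience, at the cost of a linear solve.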
Model-checking algorithms for continuous-time Markov chains
IEEE Transactions on Software Engineering, 2003
Cited by 231 (45 self)
Continuous-time Markov chains (CTMCs) have been widely used to determine system performance and dependability characteristics. Their analysis most often concerns the computation of steady-state and transient-state probabilities. This paper introduces a branching temporal logic for expressing real-time probabilistic properties on CTMCs and presents approximate model checking algorithms for this logic. The logic, an extension of the continuous stochastic logic CSL of Aziz et al., contains a time-bounded until operator to express probabilistic timing properties over paths as well as an operator to express steady-state probabilities. We show that the model checking problem for this logic reduces to a system of linear equations (for unbounded until and the steady-state operator) and a Volterra integral equation system (for time-bounded until). We then show that the problem of model-checking time-bounded until properties can be reduced to the problem of computing transient state probabilities for CTMCs. This allows the verification of probabilistic timing properties by efficient techniques for transient analysis for CTMCs such as uniformization. Finally, we show that a variant of lumping equivalence (bisimulation), a well-known notion for aggregating CTMCs, preserves the validity of all formulas in the logic.
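Uniformization, the transient-analysis technique named at the end of this abstract, can be sketched for a small CTMC as a Poisson-weighted sum over powers of a discrete-time matrix. This is a minimal illustration under assumptions: the two-state generator, the fixed truncation at 200 terms (a real tool would choose the truncation point from an error bound), and the function name `transient_probabilities` are hypothetical and not taken from the paper.

```python
import math


def transient_probabilities(Q, initial, t, terms=200):
    """Approximate the state distribution of a CTMC at time t.

    Q is the generator matrix (rows sum to zero). With uniformization rate
    q = max exit rate and P = I + Q / q, the distribution at time t is
    sum_k Poisson(q t; k) * initial * P^k, truncated after `terms` terms.
    Assumes at least one state has a positive exit rate.
    """
    n = len(Q)
    q = max(-Q[i][i] for i in range(n))
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / q for j in range(n)] for i in range(n)]
    weight = math.exp(-q * t)   # Poisson probability of k = 0 jumps
    vec = list(initial)         # initial * P^k, starting at k = 0
    result = [weight * x for x in vec]
    for k in range(1, terms):
        vec = [sum(vec[i] * P[i][j] for i in range(n)) for j in range(n)]
        weight *= q * t / k     # Poisson probability of k jumps
        result = [r + weight * x for r, x in zip(result, vec)]
    return result


# Assumed two-state CTMC: 0 -> 1 at rate 2, 1 -> 0 at rate 1.
Q = [
    [-2.0, 2.0],
    [1.0, -1.0],
]
dist = transient_probabilities(Q, initial=[1.0, 0.0], t=0.5)
# Closed form for this chain: p0(t) = 1/3 + (2/3) * exp(-3 t).
closed_form_p0 = 1.0 / 3.0 + (2.0 / 3.0) * math.exp(-1.5)
```

Comparing `dist[0]` with `closed_form_p0` gives a quick sanity check on the truncated series for this toy chain.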
A Random Walks View of Spectral Segmentation
2001
Cited by 216 (7 self)
We present a new view of clustering and segmentation by pairwise similarities. We interpret the similarities as edge flows in a Markov random walk and study the eigenvalues and eigenvectors of the walk's transition matrix. This view shows that spectral methods for clustering and segmentation have a probabilistic foundation. We prove that the Normalized Cut method arises naturally from our framework and we provide a complete characterization of the cases when the Normalized Cut algorithm is exact. Then we discuss other spectral segmentation and clustering methods showing that they are essentially the same as NCut.
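The random-walk reading of pairwise similarities can be made concrete by row-normalizing a similarity matrix into a transition matrix. The sketch below is only an illustration under assumptions: the 3×3 symmetric matrix `S` and the function name `random_walk_matrix` are hypothetical, and the eigenvector analysis the abstract refers to is not attempted here.

```python
def random_walk_matrix(S):
    """Row-normalize a symmetric similarity matrix into a transition matrix.

    Entry P[i][j] = S[i][j] / d_i, where d_i is the total similarity
    (degree) of node i. Assumes every node has positive degree.
    """
    degrees = [sum(row) for row in S]
    n = len(S)
    P = [[S[i][j] / degrees[i] for j in range(n)] for i in range(n)]
    return P, degrees


# Assumed pairwise similarities between three items.
S = [
    [0.0, 3.0, 1.0],
    [3.0, 0.0, 2.0],
    [1.0, 2.0, 0.0],
]
P, degrees = random_walk_matrix(S)

# For symmetric S, the walk's stationary distribution is proportional to degree.
volume = sum(degrees)
stationary = [d / volume for d in degrees]
after_one_step = [
    sum(stationary[i] * P[i][j] for i in range(len(P))) for j in range(len(P))
]
```

Here `after_one_step` equals `stationary`, which is the sense in which the similarities act as edge flows of the walk.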
Deeper Inside PageRank
Internet Mathematics, 2004
Cited by 207 (4 self)
This paper serves as a companion or extension to the “Inside PageRank” paper by Bianchini et al. [Bianchini et al. 03]. It is a comprehensive survey of all issues associated with PageRank, covering the basic PageRank model, available and recommended solution methods, storage issues, existence, uniqueness, and convergence properties, possible alterations to the basic model, suggested alternatives to the traditional solution methods, sensitivity and conditioning, and finally the updating problem. We introduce a few new results, provide an extensive reference list, and speculate about exciting areas of future research.
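The basic PageRank model surveyed here is commonly solved by power iteration. The following is a minimal sketch of that general approach under assumptions, not a method taken from the paper: the damping factor 0.85, the uniform handling of dangling pages, the three-page link graph, and the function name `pagerank` are all illustrative.

```python
def pagerank(links, damping=0.85, tol=1e-12, max_iter=1000):
    """Power iteration for PageRank on an adjacency-list graph.

    links[i] lists the pages that page i points to. Rank held by pages
    without out-links is spread uniformly over all pages (one common
    convention, assumed here).
    """
    n = len(links)
    ranks = [1.0 / n] * n
    for _ in range(max_iter):
        dangling = sum(ranks[i] for i in range(n) if not links[i])
        base = (1.0 - damping) / n + damping * dangling / n
        new_ranks = [base] * n
        for page, outs in enumerate(links):
            for target in outs:
                new_ranks[target] += damping * ranks[page] / len(outs)
        change = sum(abs(a - b) for a, b in zip(new_ranks, ranks))
        ranks = new_ranks
        if change < tol:
            break
    return ranks


# Assumed toy web: page 0 links to 1 and 2, page 1 links to 2, page 2 links to 0.
links = [[1, 2], [2], [0]]
ranks = pagerank(links)
```

In this toy graph page 2 collects links from both other pages and ends up with the largest score, followed by page 0 and then page 1.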
Random-walk computation of similarities between nodes of a graph, with application to collaborative recommendation
IEEE Transactions on Knowledge and Data Engineering, 2006
Cited by 188 (19 self)
This work presents a new perspective on characterizing the similarity between elements of a database or, more generally, nodes of a weighted and undirected graph. It is based on a Markov-chain model of random walk through the database. More precisely, we compute quantities (the average commute time, the pseudoinverse of the Laplacian matrix of the graph, etc.) that provide similarities between any pair of nodes, having the nice property of increasing when the number of paths connecting those elements increases and when the “length” of paths decreases. It turns out that the square root of the average commute time is a Euclidean distance and that the pseudoinverse of the Laplacian matrix is a kernel matrix (its elements are inner products closely related to commute times). A principal component analysis (PCA) of the graph is introduced for computing the subspace projection of the node vectors in a manner that preserves as much variance as possible in terms of the Euclidean commute-time distance. This graph PCA provides a nice interpretation to the “Fiedler vector,” widely used for graph partitioning. The model is evaluated on a collaborative-recommendation task where suggestions are made about which movies people should watch based upon what they watched in the past. Experimental results on the MovieLens database show that the Laplacian-based similarities perform well in comparison with other methods. The model, which nicely fits into the so-called “statistical relational learning” framework, could also be used to compute document or word similarities, and, more generally, it could be applied to machine-learning and pattern-recognition tasks involving a relational database. Index Terms: Graph analysis, graph and database mining, collaborative recommendation, graph kernels, spectral clustering, Fiedler vector, proximity measures, statistical relational learning.
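The link between the Laplacian pseudoinverse and the average commute time can be sketched for a tiny graph using the standard identity n(i, j) = vol(G) · (l⁺_ii + l⁺_jj − 2 l⁺_ij). The code is a minimal illustration under assumptions, not the authors' implementation. It assumes a connected, undirected graph without self-loops, computes the pseudoinverse as (L + eeᵀ/n)⁻¹ − eeᵀ/n, and uses the hypothetical helper names `invert` and `commute_times`.

```python
def invert(M):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(M)
    A = [row[:] + [1.0 if i == j else 0.0 for j in range(n)] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [x / p for x in A[col]]
        for r in range(n):
            if r != col:
                factor = A[r][col]
                A[r] = [x - factor * y for x, y in zip(A[r], A[col])]
    return [row[n:] for row in A]


def commute_times(W):
    """Average commute times from a symmetric weight matrix W.

    Assumes the graph is connected, so that L + ee^T/n is invertible and
    the Laplacian pseudoinverse equals (L + ee^T/n)^-1 - ee^T/n.
    """
    n = len(W)
    degrees = [sum(row) for row in W]
    volume = sum(degrees)
    L = [[(degrees[i] if i == j else 0.0) - W[i][j] for j in range(n)] for i in range(n)]
    shifted = [[L[i][j] + 1.0 / n for j in range(n)] for i in range(n)]
    inv = invert(shifted)
    L_plus = [[inv[i][j] - 1.0 / n for j in range(n)] for i in range(n)]
    return [
        [volume * (L_plus[i][i] + L_plus[j][j] - 2.0 * L_plus[i][j]) for j in range(n)]
        for i in range(n)
    ]


# Assumed toy graph: an unweighted path 0 - 1 - 2.
W = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
]
C = commute_times(W)
```

On this path graph the commute time between the two end nodes is 8 and between neighbours is 4, so the quantity grows with path length. That matches the behaviour the abstract describes for a distance-like measure.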