Policy gradient methods for reinforcement learning with function approximation.
In NIPS, 1999
"... Abstract: Function approximation is essential to reinforcement learning, but the standard approach of approximating a value function and determining a policy from it has so far proven theoretically intractable. In this paper we explore an alternative approach in which the policy is explicitly repres ..."
, but has several limitations. First, it is oriented toward finding deterministic policies, whereas the optimal policy is often stochastic, selecting different actions with specific probabilities (e.g., see In this paper we explore an alternative approach to function approximation in RL. Rather than
The GENITOR Algorithm and Selection Pressure: Why Rank-Based Allocation of Reproductive Trials is Best
Proceedings of the Third International Conference on Genetic Algorithms, 1989
"... This paper reports work done over the past three years using rank-based allocation of reproductive trials. New evidence and arguments are presented which suggest that allocating reproductive trials according to rank is superior to fitness proportionate reproduction. Ranking can not only be used to s ..."
of 1) Holland's schema theorem, 2) DeJong's standard test suite, and 3) a set of neural net optimization problems that are larger than the problems in the standard test suite. The GENITOR algorithm is also discussed; this algorithm is specifically designed to allocate reproductive trials
Simple fast algorithms for the editing distance between trees and related problems
SIAM J. COMPUT, 1989
"... Ordered labeled trees are trees in which the left-to-right order among siblings is significant. The distance between two ordered trees is considered to be the weighted number of edit operations (insert, delete, and modify) to transform one tree to another. The problem of approximate tree matching i ..."
Finite-State Transducers in Language and Speech Processing
Computational Linguistics, 1997
"... Finite-state machines have been used in various domains of natural language processing. We consider here the use of a type of transducers that supports very efficient programs: sequential transducers. We recall classical theorems and give new ones characterizing sequential string-to-string transducer ..."
to-string transducers. Transducers that output weights also play an important role in language and speech processing. We give a specific study of string-to-weight transducers, including algorithms for determinizing and minimizing these transducers very efficiently, and characterizations of the transducers admitting
Architectural Mismatch or Why it's hard to build systems out of existing parts
1995
"... Many would argue that future breakthroughs in software productivity will depend on our ability to combine existing pieces of software to produce new applications. An important step towards this goal is the development of new techniques to detect and cope with mismatches in the assembled parts. Some ..."
problems of composition are due to low-level issues of interoperability, such as mismatches in programming languages or database schemas. However, in this paper we highlight a different, and in many ways more pervasive, class of problem: architectural mismatch. Specifically, we use our experience in building
Estimation and Inference in Large Heterogeneous Panels with a Multifactor Error Structure
2004
"... This paper presents a new approach to estimation and inference in panel data models with a multifactor error structure where the unobserved common factors are (possibly) correlated with exogenously given individual-specific regressors, and the factor loadings differ over the cross section units. The ..."
The basic idea behind the proposed estimation procedure is to filter the individual-specific regressors by means of (weighted) cross-section aggregates such that asymptotically as the cross-section dimension (N) tends to infinity the differential effects of unobserved common factors are eliminated
Universal Plans for Reactive Robots in Unpredictable Environments
1987
"... In: Proc 10th IJCAI, 1987, 1039ff. To date, reactive robot behavior has been achieved only through manual programming. This paper describes a new kind of plan, called a "universal plan", which can be synthesized automatically, yet generates appropriate behavior in unpredictable environmen ..."
environments. In classical planning work, problems were posed with unique initial and final world states; in my approach a problem specifies only a goal condition. The planner is thus unable to commit to any specific future course of events but must specify appropriate reactions for anticipated situations
The Node Distribution of the Random Waypoint Mobility Model for Wireless Ad Hoc Networks
2003
"... The random waypoint model is a commonly used mobility model in the simulation of ad hoc networks. It is known that the spatial distribution of network nodes moving according to this model is, in general, nonuniform. However, a closed-form expression of this distribution and an in-depth investigation ..."
node distribution generated by random waypoint mobility. More specifically, we consider a generalization of the model in which the pause time of the mobile nodes is chosen arbitrarily in each waypoint and a fraction of nodes may remain static for the entire simulation time. We show that the structure
SCHEMA
"... Abstract: Schema matching is a basic operation of data integration, and several tools for automating it have been proposed and evaluated in the database community. Research in this area reveals that there is no single schema matcher that is guaranteed to succeed in finding a good mapping for all poss ..."
to novel techniques specific to the problem of schema matching, and to combinations of both. We provide a formal analysis of the applicability and relative performance of these algorithms and evaluate them empirically on a set of real-world schemata. Index Terms: Database integration, schema matching, rank
SCHEMA
"... When we want to predict the future, we compute it from what we know about the present. Specifically, we take a mathematical representation of observed reality, plug it into some dynamical equations, and then map the time-evolved result back to real world predictions. But while this computational pro ..."
