Results 1 - 10 of 3,019
Finite-time analysis of the multiarmed bandit problem
Machine Learning, 2002
"... Abstract. Reinforcement learning policies face the exploration versus exploitation dilemma, i.e. the search for a balance between exploring the environment to find profitable actions while taking the empirically best action as often as possible. A popular measure of a policy’s success in addressing ..."
Cited by 817 (15 self)
... and for all reward distributions with bounded support. Keywords: bandit problems, adaptive allocation rules, finite horizon regret.
Constrained model predictive control: Stability and optimality
AUTOMATICA, 2000
"... Model predictive control is a form of control in which the current control action is obtained by solving, at each sampling instant, a finite horizon open-loop optimal control problem, using the current state of the plant as the initial state; the optimization yields an optimal control sequence and t ..."
Cited by 738 (16 self)
The Complexity of Decentralized Control of Markov Decision Processes
Mathematics of Operations Research, 2000
"... We consider decentralized control of Markov decision processes and give complexity bounds on the worst-case running time for algorithms that find optimal solutions. Generalizations of both the fully-observable case and the partially-observable case that allow for decentralized control are described. ..."
Cited by 411 (46 self)
... For even two agents, the finite-horizon problems corresponding to both of these models are hard for nondeterministic exponential time. These complexity results illustrate a fundamental difference between centralized and decentralized control of Markov decision processes. In contrast to the problems ...
Weighted finite-state transducers in speech recognition
COMPUTER SPEECH & LANGUAGE, 2002
"... We survey the use of weighted finite-state transducers (WFSTs) in speech recognition. We show that WFSTs provide a common and natural representation for hidden Markov models (HMMs), context-dependency, pronunciation dictionaries, grammars, and alternative recognition outputs. Furthermore, general tr ..."
Cited by 211 (5 self)
Dynamic Real-Time Deformations using Space Time Adaptive Sampling
2001
"... This paper presents a robust, adaptive method for animating dynamic viscoelastic deformable objects that provides a guaranteed frame rate. Our approach uses a novel automatic space and time adaptive level of detail technique, in combination with a large-displacement (Green) strain tensor formulation ..."
Cited by 226 (14 self)
... mass-spring and other adaptive approaches. In particular, damped elastic vibration modes are shown to be nearly unchanged for several levels of refinement. Results are presented in the context of a virtual reality system. The user interacts in real-time with the dynamic object through the control of a ...
Stability of switched systems with average dwell-time
In Proc. 38th IEEE Conf. on Decision and Control, 1999
"... It is shown that switching among stable linear systems results in a stable system provided that switching is “slow-on-the-average.” In particular, it is proved that exponential stability is achieved when the number of switches in any finite interval grows linearly with the length of the interval, a ..."
Cited by 219 (26 self)
Finite State Automata and Simple Recurrent Networks
"... Figure 1: The simple recurrent network (Elman 1988). In the SRN, the pattern of activation on the hidden units at time step t-1, together with the new input pattern, is allowed to influence the pattern of activation at time step t. This is achieved by copying the pattern of activation on the hidden ..."
Cited by 166 (10 self)
... layer at time step t-1 to a set of input units called the "context units" at time step t. All the forward connections in the network are subject to training via backpropagation. ... "moving window" paradigms or algorithms such as backpropagation in time (Rumelhart et al. 1986; Williams ...
Learning the structure of event sequences
JOURNAL OF EXPERIMENTAL PSYCHOLOGY: GENERAL, 1991
"... How is complex sequential material acquired, processed, and represented when there is no intention to learn? Two experiments exploring a choice reaction time task are reported. Unknown to Ss, successive stimuli followed a sequence derived from a "noisy" finite-state grammar. After ..."
Cited by 218 (28 self)
Corporate Investment and Asset Price Dynamics: Implications for the Cross-Section of Returns
Journal of Finance, 2004
"... We show that corporate investment decisions can explain conditional dynamics in expected asset returns. Our approach is similar in spirit to Berk, Green, and Naik (1999), but we introduce to the investment problem operating leverage, reversible real options, fixed adjustment costs, and finite growth ..."
Cited by 207 (8 self)
... the model using simulation methods and reproduce portfolio excess returns comparable to the data. Corporate investment decisions are often evaluated in a real options context, and option exercise can change the riskiness of a firm in various ways. For example, if growth opportunities are finite ...
Artificial Agents and Speculative Bubbles
"... Pertaining to Agent-based Computational Economics (ACE), this work presents two models for the rise and downfall of speculative bubbles through an exchange price fixing based on a double auction mechanism. The first model is based in a finite time horizon context where total expected dividends decre ..."