Applications of Markov Decision Processes in Communication Networks: a Survey
 in Markov Decision Processes, Models, Methods, Directions, and Open Problems, E. Feinberg and A. Shwartz (Editors) Kluwer
, 2001
Cited by 27 (2 self)
We present in this chapter a survey on applications of MDPs to communication networks. We survey both the different application areas in communication networks and the theoretical tools that have been developed to model and solve the resulting control problems.
Burst-level Congestion Control Using Hindsight Optimization
, 2000
Cited by 19 (6 self)
We consider the burst-level congestion-control problem in a communication network with multiple traffic sources, each modeled as a fully-controllable stream of fluid traffic. The controlled traffic shares a common bottleneck node with high-priority cross traffic described by a Markov-modulated fluid (MMF). Each controlled source is assumed to have a unique round-trip delay. The goal is to maximize a linear combination of the throughput, delay, traffic loss rate, and a fairness metric at the bottleneck node. We introduce a simulation-based congestion-control scheme capable of performing effectively under rapidly-varying cross traffic by making use of the provided MMF model of that variation. In our scheme, the control problem is posed as a finite-horizon Markov decision process and is solved heuristically using a technique called Hindsight Optimization. We provide a detailed derivation of our congestion-control algorithm based on this technique. Our empirical study shows that the control scheme performs sign...
Optimality of Monotonic Policies for Two-Action Markovian Decision Processes, with Applications to Control of Queues with Delayed Information
, 1995
Cited by 13 (4 self)
We consider a discrete-time Markov decision process with a partially ordered state space and two feasible control actions in each state. Our goal is to find general conditions, which are satisfied in a broad class of applications to control of queues, under which an optimal control policy is monotonic. An advantage of our approach is that it easily extends to problems with both information and action delays, which are common in applications to high-speed communication networks, among others. The transition probabilities are stochastically monotone and the one-stage reward submodular. We further assume that transitions from different states are coupled, in the sense that the state after a transition is distributed as a deterministic function of the current state and two random variables, one of which is controllable and the other uncontrollable. Finally, we make a monotonicity assumption about the sample-path effect of a pairwise switch of the actions in consecutive stages. Using induct...
Optimal batch service of a polling system under partial information
 Methods and Models in OR
, 1996
Cited by 11 (0 self)
We consider the optimal scheduling of an infinite-capacity batch server in an N-node ring queueing network, where the controller observes only the length of the queue at which the server is located. For a cost criterion that includes linear holding costs, fixed dispatching costs, and linear service rewards, we prove optimality and monotonicity of threshold scheduling policies.
On-Line Sampling-Based Control For Network Queueing Problems
, 2001
Cited by 9 (5 self)
Control of a Random Walk with Noisy Delayed Information
, 1995
Cited by 8 (6 self)
We consider the control of a random walk on the nonnegative integers. The controller has two actions. It makes decisions based on noisy information on the current state but on full information on previous states and actions. We establish the optimality of a threshold policy, where the threshold depends on the last action and the noisy information. We apply the result to flow and service control problems.
Congestion Control via Online Sampling
, 2001
Cited by 8 (2 self)
We consider the congestion-control problem in a communication network with multiple traffic sources, each modeled as a fully-controllable stream of fluid traffic. The controlled traffic shares a common bottleneck node with high-priority cross traffic described by a Markov-modulated fluid (MMF). Each controlled source is assumed to have a unique round-trip delay. We wish to maximize a linear combination of the throughput, delay, traffic loss rate, and a fairness metric at the bottleneck node. We introduce an online sampling-based burst-level congestion-control scheme capable of performing effectively under rapidly-varying cross traffic by making explicit use of the provided MMF model of that variation. The control problem is posed as a finite-horizon Markov decision process and is solved heuristically using a technique called Hindsight Optimization. We provide a detailed derivation of our congestion-control algorithm based on this technique. The distinguishing feature of our scheme relative to conventional congestion-control schemes is that we exploit a stochastic model of the cross traffic. Our empirical study shows that our control scheme significantly outperforms the conventional proportional-derivative (PD) controller, achieving higher utilization, lower delay, and lower loss under reasonable fairness. The performance advantage of our scheme over the PD scheme grows as the rate variance of cross traffic increases, underscoring the effectiveness of our control scheme under variable cross traffic.
Planning and Learning in Environments with Delayed Feedback
Cited by 3 (2 self)
This work considers the problems of planning and learning in environments with constant observation and reward delays. We provide a hardness result for the general planning problem and positive results for several special cases with deterministic or otherwise constrained dynamics. We present an algorithm, Model Based Simulation, for planning in such environments and use model-based reinforcement learning to extend this approach to the learning setting in both finite and continuous environments. Empirical comparisons show this algorithm holds significant advantages over others for decision making in delayed environments.