Results 1 - 10 of 23
Applications of Markov Decision Processes in Communication Networks: a Survey
- in Markov Decision Processes, Models, Methods, Directions, and Open Problems, E. Feinberg and A. Shwartz (Editors), Kluwer, 2001
Cited by 27 (2 self)
In this chapter we present a survey of applications of MDPs to communication networks. We survey both the different application areas in communication networks and the theoretical tools that have been developed to model and solve the resulting control problems.
Burst-level Congestion Control Using Hindsight Optimization
, 2000
Cited by 19 (6 self)
We consider the burst-level congestion-control problem in a communication network with multiple traffic sources, each modeled as a fully-controllable stream of fluid traffic. The controlled traffic shares a common bottleneck node with high-priority cross traffic described by a Markov-modulated fluid (MMF). Each controlled source is assumed to have a unique round-trip delay. The goal is to maximize a linear combination of the throughput, delay, traffic loss rate, and a fairness metric at the bottleneck node. We introduce a simulation-based congestion-control scheme capable of performing effectively under rapidly-varying cross traffic by making use of the provided MMF model of that variation. In our scheme, the control problem is posed as a finite-horizon Markov decision process and is solved heuristically using a technique called Hindsight Optimization. We provide a detailed derivation of our congestion-control algorithm based on this technique. Our empirical study shows that the control scheme performs sign...
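The hindsight-optimization step described in this abstract can be sketched as follows. Everything in this sketch is an illustrative assumption, not the paper's actual model: a single integer-valued bottleneck queue, a small discrete action set of send rates, a linear reward trading off throughput against delay and loss, and i.i.d. cross traffic standing in for the Markov-modulated fluid (MMF) model the paper samples from. The idea is to score each candidate action by the average, over sampled cross-traffic futures, of its immediate reward plus the hindsight-optimal value of the remaining horizon (an upper bound on the true Q-value).

```python
import random

HORIZON = 10             # look-ahead horizon (steps)
SAMPLES = 50             # sampled cross-traffic traces per decision
CAPACITY = 20            # assumed bottleneck buffer size
SERVICE = 5              # fluid served per step
ACTIONS = [0, 2, 4, 6]   # assumed candidate send rates

def step(queue, rate, cross):
    """One-step bottleneck dynamics; returns (next_queue, reward).
    The reward is a linear combination of throughput, delay, and loss,
    echoing the paper's objective (weights here are arbitrary)."""
    arrivals = rate + cross
    loss = max(0, queue + arrivals - SERVICE - CAPACITY)
    queue = min(CAPACITY, max(0, queue + arrivals - SERVICE))
    return queue, rate - 0.5 * queue - 2.0 * loss

def hindsight_value(queue, trace):
    """Optimal total reward over a *known* cross-traffic trace,
    computed by backward dynamic programming over queue lengths."""
    value = [0.0] * (CAPACITY + 1)
    for cross in reversed(trace):
        new = []
        for q in range(CAPACITY + 1):
            best = float("-inf")
            for a in ACTIONS:
                nq, r = step(q, a, cross)
                best = max(best, r + value[nq])
            new.append(best)
        value = new
    return value[queue]

def ho_action(queue):
    """Hindsight-optimization action choice."""
    scores = {}
    for a in ACTIONS:
        total = 0.0
        for _ in range(SAMPLES):
            # i.i.d. cross traffic for brevity; the paper samples
            # trajectories from the MMF cross-traffic model instead.
            trace = [random.randint(0, 6) for _ in range(HORIZON)]
            nq, r = step(queue, a, trace[0])
            total += r + hindsight_value(nq, trace[1:])
        scores[a] = total / SAMPLES
    return max(scores, key=scores.get)
```

Because each sampled future is solved with full knowledge of the trace, the averaged scores upper-bound the true Q-values; the heuristic simply acts greedily with respect to those scores.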
Optimality of Monotonic Policies for Two-Action Markovian Decision Processes, with Applications to Control of Queues with Delayed Information
, 1995
Cited by 13 (4 self)
We consider a discrete-time Markov decision process with a partially ordered state space and two feasible control actions in each state. Our goal is to find general conditions, satisfied in a broad class of applications to control of queues, under which an optimal control policy is monotonic. An advantage of our approach is that it easily extends to problems with both information and action delays, which are common in applications to high-speed communication networks, among others. The transition probabilities are stochastically monotone and the one-stage reward is submodular. We further assume that transitions from different states are coupled, in the sense that the state after a transition is distributed as a deterministic function of the current state and two random variables, one of which is controllable and the other uncontrollable. Finally, we make a monotonicity assumption about the sample-path effect of a pairwise switch of the actions in consecutive stages. Using induct...
Optimal batch service of a polling system under partial information
- Methods and Models in OR, 1996
Cited by 11 (0 self)
We consider the optimal scheduling of an infinite-capacity batch server in an N-node ring queueing network, where the controller observes only the length of the queue at which the server is located. For a cost criterion that includes linear holding costs, fixed dispatching costs, and linear service rewards, we prove optimality and monotonicity of threshold scheduling policies.
On-Line Sampling-Based Control For Network Queueing Problems
, 2001
Cited by 9 (5 self)
Control of a Random Walk with Noisy Delayed Information
, 1995
Cited by 8 (6 self)
We consider the control of a random walk on the nonnegative integers. The controller has two actions. It makes decisions based on noisy information about the current state but full information about previous states and actions. We establish the optimality of a threshold policy, where the threshold depends on the last action and the noisy information. We apply the result to flow and service control problems.
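The threshold structure established in this abstract can be illustrated with a small simulation. The dynamics, noise model, threshold values, and up-step probabilities below are assumptions for demonstration only; the one structural feature taken from the abstract is that the threshold is indexed by the last action (here producing hysteresis) and compared against the noisy observation.

```python
import random

THRESHOLDS = {0: 5, 1: 3}   # assumed thresholds, indexed by last action

def noisy_obs(state, noise=1):
    """Noisy reading of the current state (bounded integer noise)."""
    return max(0, state + random.randint(-noise, noise))

def threshold_policy(obs, last_action):
    """Two-action rule: take the 'slowing' action 1 when the noisy
    observation is at or above a threshold that depends on the last
    action; the lower threshold after action 1 gives hysteresis."""
    return 1 if obs >= THRESHOLDS[last_action] else 0

def simulate(steps=1000, seed=0):
    """Controlled random walk on the nonnegative integers: action 1
    lowers the up-step probability. Returns the average state."""
    random.seed(seed)
    state, action, total = 0, 0, 0
    up_prob = {0: 0.6, 1: 0.3}
    for _ in range(steps):
        action = threshold_policy(noisy_obs(state), action)
        move = 1 if random.random() < up_prob[action] else -1
        state = max(0, state + move)
        total += state
    return total / steps
```

Under these assumed parameters the walk drifts up under action 0 and down under action 1, so the policy keeps the state hovering near the thresholds.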
Congestion Control via Online Sampling
, 2001
Cited by 8 (2 self)
We consider the congestion-control problem in a communication network with multiple traffic sources, each modeled as a fully-controllable stream of fluid traffic. The controlled traffic shares a common bottleneck node with high-priority cross traffic described by a Markov-modulated fluid (MMF). Each controlled source is assumed to have a unique round-trip delay. We wish to maximize a linear combination of the throughput, delay, traffic loss rate, and a fairness metric at the bottleneck node. We introduce an online sampling-based burst-level congestion-control scheme capable of performing effectively under rapidly-varying cross traffic by making explicit use of the provided MMF model of that variation. The control problem is posed as a finite-horizon Markov decision process and is solved heuristically using a technique called Hindsight Optimization. We provide a detailed derivation of our congestion-control algorithm based on this technique. The distinguishing feature of our scheme relative to conventional congestion-control schemes is that we exploit a stochastic model of the cross traffic. Our empirical study shows that our control scheme significantly outperforms the conventional proportional-derivative (PD) controller, achieving higher utilization, lower delay, and lower loss under reasonable fairness. The performance advantage of our scheme over the PD scheme grows as the rate variance of the cross traffic increases, underscoring the effectiveness of our control scheme under variable cross traffic.
Planning and Learning in Environments with Delayed Feedback
Cited by 3 (2 self)
This work considers the problems of planning and learning in environments with constant observation and reward delays. We provide a hardness result for the general planning problem and positive results for several special cases with deterministic or otherwise constrained dynamics. We present an algorithm, Model Based Simulation, for planning in such environments and use model-based reinforcement learning to extend this approach to the learning setting in both finite and continuous environments. Empirical comparisons show this algorithm holds significant advantages over others for decision making in delayed environments.
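The core idea behind planning under a constant observation delay can be sketched in a few lines: the latest observation is d steps old, so the agent rolls a model forward through the d actions it has already taken but not yet seen reflected, and then plans as if there were no delay. The 1-D corridor dynamics, greedy policy, and delay length below are illustrative assumptions, not the paper's benchmarks; with a deterministic model the reconstructed state is exact.

```python
def model(state, action):
    """Assumed known deterministic dynamics: a 1-D corridor 0..10."""
    return min(10, max(0, state + {"L": -1, "R": 1, "S": 0}[action]))

def estimate_current_state(delayed_obs, pending_actions):
    """Roll the model forward through the actions already taken but
    not yet reflected in any observation."""
    state = delayed_obs
    for a in pending_actions:
        state = model(state, a)
    return state

def greedy_policy(state, goal=10):
    """Move right toward the goal; stay once there."""
    return "R" if state < goal else "S"

def run(delay=3, steps=15):
    """The agent sees observations `delay` steps old, yet acts on an
    exact estimate of the current state and still reaches the goal."""
    states = [0]      # true state trajectory
    actions = []      # all actions taken so far
    for t in range(steps):
        delayed_obs = states[max(0, t - delay)]
        pending = actions[max(0, t - delay):]
        est = estimate_current_state(delayed_obs, pending)
        a = greedy_policy(est)
        actions.append(a)
        states.append(model(states[-1], a))
    return states[-1]
```

With stochastic dynamics the forward simulation yields a distribution (or point estimate) over current states rather than the exact state, which is where the hardness results and the constrained-dynamics special cases come in.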