Dynamic routing in large-scale service systems with heterogeneous servers
, 2005
Cited by 52 (12 self)
Motivated by modern call centers, we consider large-scale service systems with multiple server pools and a single customer class. For such systems, we propose a simple routing rule which asymptotically minimizes the steady-state queue length and virtual waiting time. The proposed routing scheme is FSF, which assigns customers to the Fastest Servers First. The asymptotic regime considered is the Halfin-Whitt many-server heavy-traffic regime, which we refer to as the Quality and Efficiency Driven (QED) regime; it achieves high levels of both service quality and system efficiency by carefully balancing between the two. Additionally, expressions are provided for system limiting performance measures based on diffusion approximations. Our analysis shows that in the QED regime this heterogeneous server system outperforms its homogeneous server counterpart.
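As a rough illustration of the Fastest Servers First idea described in this abstract, the sketch below picks, for an arriving customer, the pool with the highest service rate among those that currently have an idle server. The function name `route_fsf` and the list-based representation of pools are assumptions made here for illustration; they are not taken from the paper.

```python
from typing import Optional, Sequence


def route_fsf(idle: Sequence[int], rates: Sequence[float]) -> Optional[int]:
    """Return the index of the fastest pool that has an idle server.

    idle[k]  -- number of idle servers in pool k (hypothetical representation)
    rates[k] -- service rate of each server in pool k
    Returns None when every server is busy, meaning the customer must queue.
    """
    candidates = [k for k, free in enumerate(idle) if free > 0]
    if not candidates:
        return None
    # Among pools with a free server, choose the one with the largest rate.
    return max(candidates, key=lambda k: rates[k])


if __name__ == "__main__":
    rates = [1.0, 2.5, 1.8]   # pool 1 is fastest, but it has no idle server
    idle = [3, 0, 2]
    print(route_fsf(idle, rates))
```

In this usage example the fastest pool is fully busy, so the rule falls back to the next-fastest pool that has capacity.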
Scheduling control for queueing systems with many servers: Asymptotic optimality in heavy traffic
, 2005
Cited by 43 (6 self)
A multiclass queueing system is considered, with heterogeneous service stations, each consisting of many servers with identical capabilities. An optimal control problem is formulated, where the control corresponds to scheduling and routing, and the cost is a cumulative discounted functional of the system’s state. We examine two versions of the problem: “nonpreemptive,” where service is uninterruptible, and “preemptive,” where service to a customer can be interrupted and then resumed, possibly at a different station. We study the problem in the asymptotic heavy traffic regime proposed by Halfin and Whitt, in which the arrival rates and the number of servers at each station grow without bound. The two versions of the problem are not, in general, asymptotically equivalent in this regime, with the preemptive version showing an asymptotic behavior that is, in a sense, much simpler. Under appropriate assumptions on the structure of the system we show: (i) The value function for the preemptive problem converges to V, the value of a related diffusion control problem. (ii) The two versions of the problem are asymptotically equivalent, and in particular nonpreemptive policies can be constructed that asymptotically achieve the value V. The construction of these policies is based on a Hamilton–Jacobi–Bellman equation associated with V.
Queues with Many Servers: The Virtual Waiting-Time Process in the QED Regime
, 2007
Cited by 24 (1 self)
We consider a multiserver queue (G/GI/N) in the Quality and Efficiency-Driven (QED) regime. In this regime, which was first formalized by Halfin and Whitt, the number of servers N is not small, servers’ utilization is 1 − O(1/√N) (Efficiency-Driven) while waiting time is O(1/√N) (Quality-Driven). This is equivalent to having the number of servers N being approximately equal to R + β√R, where R is the offered load and β is a positive constant. For the G/GI/N queue in the QED regime, we analyze the virtual waiting time V_N(t), as N increases indefinitely. Assuming that the service time distribution has a finite support, it is shown that, in the limit, the scaled virtual waiting time V̂_N(t) = √N V_N(t)/ES is representable as a supremum over a random weighted tree (S denotes a service time). Informally, it is then argued that, for large N,
Optimal Control of Parallel Server Systems with Many Servers in Heavy Traffic
, 2008
Cited by 16 (7 self)
We consider a parallel server system that consists of several customer classes and server pools in parallel. We propose a simple robust control policy to minimize the total linear holding and reneging costs. We show that this policy is asymptotically optimal under the many-server heavy-traffic regime for parallel server systems when the service times are only server-pool dependent and exponentially distributed.
Control of systems with flexible multi-server pools: a shadow routing approach
Queueing Syst (2010) 66: 1–51
, 2010
Cited by 15 (5 self)
A general model with multiple input flows (classes) and several flexible multi-server pools is considered. We propose a robust, generic scheme for routing new arrivals, which optimally balances server pools’ loads, without the knowledge of the flow input rates and without solving any optimization problem. The scheme is based on Shadow routing in a virtual queueing system. We study the behavior of our scheme in the Halfin–Whitt (or, QED) asymptotic regime, when server pool sizes and the input rates are scaled up simultaneously by a factor r growing to infinity, while keeping the system load within O(√r) of its capacity. The main results are as follows. (i) We show that, in general, a system in a stationary regime has at least O(√r) average queue lengths, even if the so-called null-controllability (Atar et al., Ann. Appl. Probab. 16, 1764–1804, 2006) on a finite time interval is possible; strategies achieving this O(√r) growth rate we call order-optimal. (ii) We show that some natural algorithms, such as MaxWeight, that guarantee stability, are not order-optimal. (iii) Under the complete resource pooling condition, we prove the diffusion limit of the arrival processes into server pools, under the Shadow routing. (We conjecture that result (iii) leads to order-optimality of the Shadow routing algorithm; a formal proof of this fact is an important subject of future work.) Simulation results demonstrate good performance and robustness of our scheme.
Steady-state analysis of a multi-server queue in the Halfin-Whitt regime
, 2008
Cited by 14 (0 self)
We examine a multi-server queue in the Halfin-Whitt (Quality and Efficiency-Driven) regime: as the number of servers n increases, the utilization approaches 1 from below at the rate Θ(1/√n). The arrival process is renewal and service times have a lattice-valued distribution with a finite support. We consider the steady-state distribution of the queue length and waiting time in the limit as the number of servers n increases indefinitely. The queue length distribution, in the limit as n → ∞, is characterized in terms of the stationary distribution of an explicitly constructed Markov chain. As a consequence, the steady-state queue length and waiting time scale as Θ(√n) and Θ(1/√n) as n → ∞, respectively. Moreover, an explicit expression for the critical exponent is derived for the moment generating function of a limiting (scaled) steady-state queue length. This exponent depends on three parameters: the amount of spare capacity and the coefficients of variation of interarrival and service times. Interestingly, it matches an analogous exponent corresponding to a single-server queue in the conventional heavy-traffic regime. The results are derived by analyzing Lyapunov functions.
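To make the scaling in this regime concrete, the following sketch uses the standard square-root staffing rule (n is roughly the offered load R plus β√R) and checks numerically that the spare capacity 1 − ρ shrinks like 1/√n. The helper names `sqrt_staffing` and `utilization` and the choice β = 1 are assumptions for illustration only; nothing here reproduces the analysis of the paper.

```python
import math


def sqrt_staffing(offered_load: float, beta: float) -> int:
    """Smallest integer n with n >= R + beta * sqrt(R) (square-root staffing)."""
    return math.ceil(offered_load + beta * math.sqrt(offered_load))


def utilization(offered_load: float, n: int) -> float:
    """Long-run fraction of busy servers, rho = R / n, valid when R < n."""
    return offered_load / n


if __name__ == "__main__":
    beta = 1.0  # illustrative spare-capacity parameter, an assumption
    for R in (100, 10_000, 1_000_000):
        n = sqrt_staffing(R, beta)
        rho = utilization(R, n)
        # (1 - rho) * sqrt(n) should stay near beta as the system grows,
        # which is the Theta(1/sqrt(n)) rate stated in the abstract.
        print(R, n, rho, (1 - rho) * math.sqrt(n))
```

As R grows, the utilization climbs toward 1 while the rescaled gap (1 − ρ)√n stays close to β, which is the sense in which the regime balances efficiency and quality.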
Fluid Models for Overloaded Multi-Class Many-Server Queueing Systems with FCFS Routing
 SUBMITTED TO MANAGEMENT SCIENCE MANUSCRIPT MS002512007
, 2007
Cited by 13 (2 self)
Motivated by models of tenant assignment in public housing, we study approximating deterministic fluid models for overloaded queueing systems having multiple customer classes (classes of tenants) and multiple service pools (housing authorities), each with many servers (housing units). Customer abandonment acts to keep the system stable, yielding a proper steady-state description. Motivated by fairness considerations, we assume that customers are selected for service by newly available servers on a first-come first-served (FCFS) basis from all classes the corresponding service pools are allowed to serve. In this context, it is challenging to determine stationary routing flow rates between customer classes and service pools. Given those routing flow rates, each single fluid queue can be analyzed separately using previously established methods. Our ability to determine the routing flow rates depends on the structure of the network routing graph. We obtain the desired routing flow rates in three cases: when the routing graph is (i) a tree (sparsely connected), (ii) complete bipartite (fully connected), and (iii) an appropriate combination of the previous two cases. Other cases remain unsolved. In the last two solved cases, the routing flow rates are actually not uniquely determined by the fluid model, but become so once we make stochastic assumptions about the queueing models that the fluid model approximates.
Queueing systems with many servers: null controllability in heavy traffic
, 2005
Cited by 11 (5 self)
A queueing model has J ≥ 2 heterogeneous service stations, each consisting of many independent servers with identical capabilities. Customers of I ≥ 2 classes can be served at these stations at different rates that depend on both the class and the station. A system administrator dynamically controls scheduling and routing. We study this model in the Central Limit Theorem (or heavy traffic) regime proposed by Halfin and Whitt. We derive a diffusion model on ℝ^I with a singular control term, that describes the scaling limit of the queueing model. The singular term may be used to constrain the diffusion to lie in certain subsets of ℝ^I at all times t > 0. We say that the diffusion is null-controllable if it can be constrained to X−, the minimal closed subset of ℝ^I containing all states of the prelimit queueing model for which all queues are empty. We give sufficient conditions for null controllability of the diffusion. Under these conditions we also show that an analogous, asymptotic result holds for the queueing model, by constructing control policies under which, for any given 0 < ε < T < ∞, all queues in the system are kept empty on the time interval [ε, T], with probability approaching one. This introduces a new, unusual heavy traffic ‘behavior’: On one hand the system is critically loaded, in the sense that an increase in any of the external arrival rates at the ‘fluid level’ results in an overloaded system. On the other hand, as far as queue lengths are concerned, the system behaves as if it is underloaded.
State Space Collapse in Many-Server Diffusion Limits of Parallel Server Systems
, 2006
Cited by 6 (3 self)
We consider a class of queueing systems that consist of server pools in parallel and multiple customer classes. Customer service times are assumed to be exponentially distributed. We study the asymptotic behavior of these queueing systems in a heavy traffic regime that is known as the Halfin and Whitt many-server asymptotic regime. Our main contribution is a general framework for establishing state space collapse results in this regime for parallel server systems. In our work, state space collapse refers to a decrease in the dimension of the processes tracking the number of customers in each class waiting for service and the number of customers in each class being served by various server pools. We define and introduce a “state space collapse” function, which governs the exact details of the state space collapse. We show that a state space collapse result holds in many-server heavy traffic if a corresponding deterministic hydrodynamic model satisfies a similar state space collapse condition. Our methodology is similar in spirit to that in Bramson [10], which focuses on the conventional heavy traffic regime. We illustrate the applications of our results by establishing state space collapse results in many-server diffusion limits of static-buffer-priority V-parallel server systems, N-model parallel server systems, and minimum-expected-delay–faster-server-first distributed server pools systems. We show for these systems that the condition on the hydrodynamic model can easily be checked using the standard tools for fluid models.