Results 11-20 of 137
Indexability of Restless Bandit Problems and Optimality of Whittle's Index for Dynamic . . .
Cited by 59 (13 self)
We consider a class of restless multi-armed bandit problems (RMBP) that arises in dynamic multichannel access, user/server scheduling, and optimal activation in multi-agent systems. For this class of RMBP, we establish the indexability and obtain Whittle’s index in closed form for both discounted and average reward criteria. These results lead to a direct implementation of Whittle’s index policy with remarkably low complexity. When arms are stochastically identical, we show that Whittle’s index policy is optimal under certain conditions. Furthermore, it has a semi-universal structure that obviates the need to know the Markov transition probabilities. The optimality and the semi-universal structure result from the equivalency between Whittle’s index policy and the myopic policy established in this work. For nonidentical arms, we develop efficient algorithms for computing a performance upper bound given by Lagrangian relaxation. The tightness of the upper bound and the near-optimal performance of Whittle’s index policy are illustrated with simulation examples.
Optimal Sensing-Transmission Structure for Dynamic Spectrum Access. Available at http://www.cs.ucdavis.edu/ liu/preprint/huangsensing.pdf
Cited by 45 (6 self)
In cognitive wireless networks where secondary users (SUs) opportunistically access spectral white spaces of primary users (PUs), there exists an inherent tradeoff between sensing and transmission due to the competing goals of PU protection and SU access maximization. This paper studies means of sensing-transmission for SUs to better manage the competing goals by defining a utility function that rewards the SU for successful packet transmissions and penalizes it for colliding with the PU. To maximize the SU utility, we present a threshold-based sensing-transmission structure that is optimal under a technical constraint. Both perfect sensing and imperfect sensing are considered, with or without SU acknowledgement of reception. This SU access scheme optimizes SU access efficiency while protecting PU performance. It sets a benchmark and provides insight for the design of sensing-transmission control in cognitive networks such as IEEE 802.22.
Power Control in Cognitive Radio Networks: How to Cross a Multi-Lane Highway
Cited by 39 (5 self)
We consider power control in cognitive radio networks where secondary users identify and exploit instantaneous and local spectrum opportunities without causing unacceptable interference to primary users. We qualitatively characterize the impact of the transmission power of secondary users on the occurrence of spectrum opportunities and the reliability of opportunity detection. Based on a Poisson model of the primary network, we quantify these impacts by showing that (i) the probability of spectrum opportunity decreases exponentially with respect to the transmission power of secondary users, where the exponential decay constant is given by the traffic load of primary users; (ii) reliable opportunity detection is achieved in the two extreme regimes in terms of the ratio between the transmission power of secondary users and that of primary users. Such analytical characterizations allow us to study power control for optimal transport throughput under constraints on the interference to primary users. Furthermore, we reveal the difference between detecting primary signals and detecting spectrum opportunities, and demonstrate the complex relationship between physical layer spectrum sensing and MAC layer throughput. The dependency of this PHY-MAC interaction on the application type and the use of handshake signaling such as RTS/CTS is illustrated.
Learning Multiuser Channel Allocations in Cognitive Radio Networks: A Combinatorial Multi-Armed Bandit Formulation
Cited by 39 (6 self)
We consider the following fundamental problem in the context of channelized dynamic spectrum access. There are M secondary users and N ≥ M orthogonal channels. Each secondary user requires a single channel for operation that does not conflict with the channels assigned to the other users. Due to geographic dispersion, each secondary user can potentially see different primary user occupancy behavior on each channel. Time is divided into discrete decision rounds. The throughput obtainable from spectrum opportunities on each user-channel combination over a decision period is modeled as an arbitrarily distributed random variable with bounded support but unknown mean, i.i.d. over time. The objective is to search for an allocation of channels for all users that maximizes the expected sum throughput. We formulate this problem as a combinatorial multi-armed bandit (MAB), in which each arm corresponds to a matching of the users to channels. Unlike most prior work on multi-armed bandits, this combinatorial formulation results in dependent arms. Moreover, the number of arms grows super-exponentially as the permutation P(N, M). We present a novel matching-learning algorithm with polynomial storage and polynomial computation per decision period for this problem, and prove that it results in a regret (the gap between the expected sum throughput obtained by a genie-aided perfect allocation and that obtained by this algorithm) that is uniformly upper-bounded for all time n by a function that grows as O(M^4 N log n), i.e., polynomial in the number of unknown parameters and logarithmic in time. We also discuss how our results provide a nontrivial generalization of known theoretical results on multi-armed bandits.
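The formulation in this abstract can be illustrated with a minimal sketch. It is not the paper's matching-learning algorithm: the UCB1-style exploration bonus and the function names `num_matchings`, `ucb_index`, and `best_matching` are assumptions made for illustration, and the brute-force search over assignments stands in for a polynomial-time maximum-weight matching solver.

```python
import math
from itertools import permutations


def num_matchings(n_channels: int, n_users: int) -> int:
    """Number of arms in the combinatorial view: P(N, M) ordered assignments."""
    return math.perm(n_channels, n_users)


def ucb_index(mean: float, count: int, t: int) -> float:
    """Assumed UCB1-style index for one user-channel pair (not the paper's index)."""
    if count == 0:
        return math.inf  # force at least one observation of every pair
    return mean + math.sqrt(2.0 * math.log(t) / count)


def best_matching(index):
    """Return the channel assigned to each user that maximizes the summed index.

    index[i][j] is the index of user i on channel j. Brute force is used only
    to keep the sketch dependency-free; it is exponential in the number of users.
    """
    n_users, n_channels = len(index), len(index[0])
    best, best_val = None, -math.inf
    for chans in permutations(range(n_channels), n_users):
        val = sum(index[i][chans[i]] for i in range(n_users))
        if best is None or val > best_val:
            best, best_val = chans, val
    return best
```

Storing one mean and one count per user-channel pair needs only M × N entries, even though the number of matchings is P(N, M); for example, `num_matchings(4, 2)` is 12 while only 8 pairs are tracked.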
Throughput-efficient Sequential Channel Sensing and Probing in Cognitive Radio Networks Under Sensing Errors
In Proc. ACM MobiCom, 2009
Cited by 38 (2 self)
In this paper, we exploit channel diversity for opportunistic spectrum access (OSA). Our approach uses channel quality as a second criterion (along with the idle/busy status of the channel) in selecting channels to use for opportunistic transmission. The difficulty of the problem comes from the fact that it is practically infeasible for a CR to first scan all channels and then pick the best among them, due to the potentially large number of channels open to OSA and the limited power/hardware capability of a CR. As a result, the CR can only sense and probe channels sequentially. To avoid collisions with other CRs, after sensing and probing a channel, the CR needs to make a decision on whether to terminate the scan and use the underlying channel or to skip it and scan the next one. The optimal use-or-skip decision strategy that maximizes the CR’s average throughput is one of our primary concerns in this study. This problem is further complicated by practical considerations, such as sensing/probing overhead and sensing errors. An optimal decision strategy that addresses all the above considerations is derived by formulating the sequential sensing/probing process as a rate-of-return problem, which we solve using optimal stopping theory. We further explore the special structure of this strategy to conduct a “second-round” optimization over the operational parameters, such as the sensing and probing times. We show through simulations that significant throughput gains (e.g., about 100%) are achieved using our joint sensing/probing scheme over the conventional one that uses sensing alone.
Algorithms for dynamic spectrum access with learning for cognitive radio
IEEE Transactions on Signal Processing, 2010
Cited by 36 (2 self)
We study the problem of dynamic spectrum sensing and access in cognitive radio systems as a partially observed Markov decision process (POMDP). A group of cognitive users cooperatively tries to exploit vacancies in some primary (licensed) channels whose occupancies follow a Markovian evolution. We first consider the scenario where the cognitive users have perfect knowledge of the distribution of the signals they receive from the primary users. For this problem, we obtain a greedy channel selection and access policy that maximizes the instantaneous reward, while satisfying a constraint on the probability of interfering with licensed transmissions. We also derive an analytical universal upper bound on the performance of the optimal policy. Through simulation, we show that our scheme achieves good performance relative to the upper bound and substantial improvement relative to an existing scheme. We then consider the more practical scenario where the exact distribution of the signal from the primary is unknown. We assume a parametric model for the distribution and develop an algorithm that can learn the true distribution, still guaranteeing the constraint on the interference probability. We show
Decentralized Dynamic Spectrum Access for Cognitive Radios: Cooperative Design of a Noncooperative Game
IEEE TRANSACTIONS ON MOBILE COMPUTING, 2009
Cited by 28 (2 self)
We consider dynamic spectrum access among cognitive radios from an adaptive, game-theoretic learning perspective. Spectrum-agile cognitive radios compete for channels temporarily vacated by licensed primary users in order to satisfy their own demands while minimizing interference. For both slowly varying primary user activity and slowly varying statistics of “fast” primary user activity, we apply an adaptive regret-based learning procedure which tracks the set of correlated equilibria of the game, treated as a distributed stochastic approximation. This procedure is shown to perform very well compared with other similar adaptive algorithms. We also estimate channel contention for a simple CSMA channel sharing scheme.
A DECISION-THEORETIC FRAMEWORK FOR OPPORTUNISTIC SPECTRUM ACCESS
Cited by 26 (3 self)
The authors identify basic components, fundamental tradeoffs, and practical constraints in opportunistic spectrum access. A decision-theoretic framework based on the theory of partially observable Markov decision processes is introduced.
Optimal Cognitive Access of Markovian Channels under Tight Collision Constraints
Cited by 23 (7 self)
The problem of cognitive access of channels of primary users by a secondary user is considered. The transmissions of primary users are modeled as independent continuous-time Markovian on-off processes. A secondary cognitive user employs a slotted transmission format, and it senses one of the possible channels before transmission. The objective of the cognitive user is to maximize its throughput subject to collision constraints imposed by the primary users. The optimal access strategy is in general a solution of a constrained partially observable Markov decision process, which involves a constrained optimization in an infinite-dimensional functional space. It is shown in this paper that, when the collision constraints are tight, the optimal access strategy can be implemented by a simple memoryless access policy with periodic channel sensing. Analytical expressions are given for the thresholds on collision probabilities for which memoryless access performs optimally. Extensions to multiple secondary users are also presented. Numerical and theoretical results are presented to validate and extend the analysis for different practical scenarios. Index Terms: Cognitive radio, Dynamic spectrum allocation, Cognitive medium access, Markov decision processes.
Structure and optimality of myopic sensing for opportunistic spectrum access
in Proc. of IEEE Workshop on Towards Cognition in Wireless Networks (CogNet), 2007
Cited by 21 (15 self)
We consider opportunistic spectrum access for secondary users over multiple channels whose occupancy by primary users is modeled as discrete-time Markov processes. Due to hardware limitations and energy constraints, a secondary user can choose, in each slot, one channel to sense and decide whether to access based on the sensing outcome. The design of sensing strategies that govern channel selections in each slot for optimal throughput performance of the secondary user can be formulated as a partially observable Markov decision process (POMDP). We exploit the structure of this problem when channels are independently and identically distributed. We reveal that the myopic sensing policy has a simple structure: channel selection is reduced to a counting process with little complexity. Further, for the two-channel case, we prove that the myopic sensing policy is in fact the optimal policy. Numerical results have also demonstrated the optimality of the myopic sensing policy when there are more than two channels.
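A generic belief-based myopic sensing policy for this setting can be sketched as follows. The sketch assumes two-state (idle/busy) channels with identical transition probabilities and error-free sensing; the names `p01` (probability of busy to idle), `p11` (probability of idle to idle), `predict_belief`, `myopic_choice`, and `update_beliefs` are hypothetical. It does not reproduce the counting-process structure derived in the paper.

```python
from typing import List


def predict_belief(omega: float, p01: float, p11: float) -> float:
    """Propagate P(channel idle) one slot ahead when the channel is not sensed."""
    return omega * p11 + (1.0 - omega) * p01


def myopic_choice(beliefs: List[float]) -> int:
    """Myopic policy: sense the channel most likely to be idle in this slot."""
    return max(range(len(beliefs)), key=lambda i: beliefs[i])


def update_beliefs(beliefs: List[float], chosen: int, observed_idle: bool,
                   p01: float, p11: float) -> List[float]:
    """Beliefs for the next slot after sensing `chosen` (perfect sensing assumed)."""
    updated = []
    for i, omega in enumerate(beliefs):
        if i == chosen:
            # The sensed state is known exactly, so the next belief is a
            # single row of the transition matrix.
            updated.append(p11 if observed_idle else p01)
        else:
            updated.append(predict_belief(omega, p01, p11))
    return updated
```

Under these assumptions the policy needs only the current belief vector: with beliefs `[0.3, 0.6, 0.5]` it senses channel 1, then resets that channel's belief to `p11` or `p01` depending on the outcome while the other beliefs drift toward the stationary distribution.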