Results 11–20 of 103
1 Opportunistic Spectrum Access in Cognitive Radio Networks: When to Turn off the Spectrum Sensors
"... Abstract — In cognitive radio networks, spectrum sensing is a critical to both protecting the primary users and creating spectrum access opportunities of secondary users. Channel sensing itself, including active probing and passive listening, often incurs cost, in terms of time overhead, energy cons ..."
Abstract

Cited by 14 (2 self)
 Add to MetaCart
In cognitive radio networks, spectrum sensing is critical both to protecting primary users and to creating spectrum access opportunities for secondary users. Channel sensing itself, whether active probing or passive listening, often incurs a cost in terms of time overhead, energy consumption, or intrusion on primary users. It is thus not desirable to sense the channel arbitrarily. In this paper, we are motivated to consider the following problem. A secondary user, equipped with spectrum sensors, dynamically accesses a channel. If it transmits without colliding with primary users, it obtains a reward; if it collides, it incurs a penalty. If it senses the channel, it obtains accurate channel information but pays a given sensing cost. The third option is to turn off the sensor and transmitter and go into sleep mode, where neither cost nor gain is incurred. So when should the secondary user transmit, sense, or sleep to maximize the total gain? We derive the optimal transmitting, sensing, and sleeping structure, which is a threshold-based policy. Our work sheds light on designing sensing and transmission scheduling protocols for cognitive radio networks, especially the in-band sensing mechanism in 802.22 networks.
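The threshold structure the abstract describes can be illustrated with a minimal sketch. The belief variable `p` and the threshold values `rho_tx` and `rho_sense` are illustrative assumptions, not the paper's derived quantities:

```python
# Hypothetical sketch of a threshold-based transmit/sense/sleep rule.
# p: the secondary user's belief that the primary user is idle.
# Thresholds rho_tx > rho_sense are illustrative placeholders.

def choose_action(p, rho_tx=0.8, rho_sense=0.4):
    """Return 'transmit', 'sense', or 'sleep' for a belief p in [0, 1]."""
    if p >= rho_tx:        # confident the channel is free: transmit
        return "transmit"
    if p >= rho_sense:     # uncertain: pay the sensing cost to learn the state
        return "sense"
    return "sleep"         # likely busy: save energy, take no action
```

The point of the paper is that the *optimal* policy has exactly this single-crossing structure; the actual thresholds depend on the reward, penalty, and sensing-cost parameters.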
A new perspective on multiuser power control games in interference channels
 Transactions on Wireless Communications
"... ..."
(Show Context)
Dynamic potential-based reward shaping
 in Proceedings of The Eleventh Annual International Conference on Autonomous Agents and Multiagent Systems
, 2012
"... Potentialbased reward shaping can significantly improve the time needed to learn an optimal policy and, in multiagent systems, the performance of the final jointpolicy. It has been proven to not alter the optimal policy of an agent learning alone or the Nash equilibria of multiple agents learning ..."
Abstract

Cited by 13 (7 self)
 Add to MetaCart
(Show Context)
Potential-based reward shaping can significantly improve the time needed to learn an optimal policy and, in multi-agent systems, the performance of the final joint policy. It has been proven not to alter the optimal policy of an agent learning alone, nor the Nash equilibria of multiple agents learning together. However, a limitation of existing proofs is the assumption that the potential of a state does not change dynamically during learning. This assumption is often broken, especially if the reward-shaping function is generated automatically. In this paper we prove and demonstrate a method of extending potential-based reward shaping to allow dynamic shaping while maintaining the guarantees of policy invariance in the single-agent case and of consistent Nash equilibria in the multi-agent case.
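The dynamic extension rests on evaluating the potential at the time each state was visited, F = γΦ(s′, t′) − Φ(s, t). A minimal sketch, where the time-varying potential `phi` is a hypothetical example and not from the paper:

```python
# Dynamic potential-based shaping: the shaping reward is the discounted
# difference of potentials evaluated at the visit times,
# F = gamma * Phi(s', t') - Phi(s, t), which preserves policy invariance.
def shaping_reward(phi, s, t, s_next, t_next, gamma=0.9):
    return gamma * phi(s_next, t_next) - phi(s, t)

# Hypothetical time-varying potential, purely for illustration.
def phi(state, time):
    return state * (1.0 / (1 + time))
```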
Switching Dynamics of Multi-Agent Learning
"... This paper presents the dynamics of multiagent reinforcement learning in multiple state problems. We extend previous work that formally modelled the relation between reinforcement learning agents and replicator dynamics in stateless multiagent games. More precisely, in this work we use a combinati ..."
Abstract

Cited by 9 (5 self)
 Add to MetaCart
(Show Context)
This paper presents the dynamics of multi-agent reinforcement learning in multiple-state problems. We extend previous work that formally modelled the relation between reinforcement learning agents and replicator dynamics in stateless multi-agent games. More precisely, in this work we use a combination of replicator dynamics and switching dynamics to model multi-agent learning automata in multi-state games. This is the first time that the dynamics of problems with more than one state are considered with replicator equations; previously, it was unclear how the replicator dynamics of stateless games had to be extended to account for multiple states. We use our model to visualize the basins of attraction of the learning agents and the boundaries of the switching dynamics at which an agent possibly arrives in a new dynamical system. Our model allows us to analyze and predict the behavior of the different learning agents in a wide variety of multi-state problems. In our experiments we illustrate this method in two games with two agents and two states.
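A one-step Euler discretization of the replicator dynamics underlying such models might look as follows. This is a generic single-state sketch, not the paper's switching construction:

```python
# One Euler step of the replicator dynamics for the row player's mixed
# strategy x against the column player's strategy y, under payoff matrix A:
# each action's share grows in proportion to its fitness above the average.
def replicator_step(x, A, y, dt=0.01):
    fitness = [sum(A[i][j] * y[j] for j in range(len(y))) for i in range(len(x))]
    avg = sum(x[i] * fitness[i] for i in range(len(x)))
    return [x[i] + dt * x[i] * (fitness[i] - avg) for i in range(len(x))]
```

In the paper's multi-state setting, a separate system of this kind governs each state, and switching dynamics determine which system an agent currently follows.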
Multi-Agent Learning Experiments on Repeated Matrix Games
"... This paper experimentally evaluates multiagent learning algorithms playing repeated matrix games to maximize their cumulative return. Previous works assessed that Qlearning surpassed Nashbased multiagent learning algorithms. Based on allagainstall repeated matrix game tournaments, this paper upd ..."
Abstract

Cited by 8 (1 self)
 Add to MetaCart
(Show Context)
This paper experimentally evaluates multi-agent learning algorithms playing repeated matrix games to maximize their cumulative return. Previous work found that Q-learning surpassed Nash-based multi-agent learning algorithms. Based on all-against-all repeated matrix game tournaments, this paper updates the state of the art of multi-agent learning experiments. In a first stage, it shows that M-Qubed, S, and bandit-based algorithms such as UCB are the best algorithms on general-sum games, with Exp3 being the best on cooperative games and zero-sum games. In a second stage, our experiments show that two features, forgetting the far past and using recent history as state, improve the learning algorithms. Finally, the best algorithms are two new algorithms, Q-learning and UCB enhanced with the two features, and M-Qubed.
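The bandit-based play evaluated here can be sketched with the standard UCB1 action rule; this is a generic sketch, and the tournament's exact variants (with forgetting and history-as-state) differ:

```python
import math

# UCB1 action selection for repeated play: after trying every action once,
# pick the action maximizing empirical mean reward plus an exploration bonus.
def ucb1_choose(counts, sums, t):
    """counts[a]: times action a was played; sums[a]: its total reward; t: round."""
    for a in range(len(counts)):          # play each untried action first
        if counts[a] == 0:
            return a
    return max(range(len(counts)),
               key=lambda a: sums[a] / counts[a]
                             + math.sqrt(2 * math.log(t) / counts[a]))
```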
Robust and Scalable Coordination of Potential-Field Driven Agents
 In Proceedings of IAWTIC/CIMCA 2006
, 2006
"... In this paper, we introduce a natureinspired multiagent system for the task domain of resource distribution in large storage facilities. The system is based on potential fields and swarm intelligence, in which straightforward path planning is integrated. We show both experimentally and theoretica ..."
Abstract

Cited by 6 (5 self)
 Add to MetaCart
(Show Context)
In this paper, we introduce a nature-inspired multi-agent system for the task domain of resource distribution in large storage facilities. The system is based on potential fields and swarm intelligence, into which straightforward path planning is integrated. We show both experimentally and theoretically that the system is adaptive, robust, and scalable. Moreover, we show that the planning component helps to overcome common pitfalls of nature-inspired systems in the task assignment domain. We end this paper with a discussion of an additional requirement for multi-agent systems interacting with humans: functionality. More precisely, we argue that such systems must behave in a fair way to be functional. We show how fairness can be measured and illustrate that our system behaves in a moderately fair manner.
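A minimal potential-field move rule of the kind such systems build on might look like this; the grid world, Manhattan distance, and the repulsion term are illustrative assumptions, not the paper's exact formulation:

```python
# Sketch of a potential-field agent: the field is attractive toward the
# goal and repulsive near obstacles, and the agent steps to the
# neighboring cell with the lowest potential.
def potential(pos, goal, obstacles):
    d = lambda a, b: abs(a[0] - b[0]) + abs(a[1] - b[1])   # Manhattan distance
    attract = d(pos, goal)
    repulse = sum(1.0 / (1 + d(pos, o)) for o in obstacles)
    return attract + repulse

def next_step(pos, goal, obstacles):
    moves = [(pos[0] + dx, pos[1] + dy) for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))]
    return min(moves, key=lambda m: potential(m, goal, obstacles))
```

The path-planning component the abstract mentions exists precisely because such greedy descent can get trapped in local minima of the field.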
Fairness in Multi-Agent Systems
"... multiagent systems is becoming more and more important [1, 2]. Multiagent systems are generally accepted as valuable tools for designing and building distributed dynamical systems, by using several interacting agents, possibly including humans. In practice, multiagent systems are often performing ..."
Abstract

Cited by 6 (1 self)
 Add to MetaCart
(Show Context)
Fairness in multi-agent systems is becoming more and more important [1, 2]. Multi-agent systems are generally accepted as valuable tools for designing and building distributed dynamical systems using several interacting agents, possibly including humans. In practice, multi-agent systems often perform tasks in cooperation with, or instead of, humans. Examples include software agents participating in online auctions or bargaining [3, 4], electronic institutions [5], developing schedules for air traffic [6], and decentralized resource distribution in large storage facilities [7, 8]. Although multi-agent systems have many potential advantages, designing them raises many difficulties. One of the key problems lies in controlling the behavior of individual agents in such a way that the system as a whole reaches a certain goal. This problem becomes even more prominent in multi-agent systems that interact with humans. Usually, multi-agent systems are designed assuming
Robust solutions to Stackelberg games: addressing bounded rationality and limited observations in human cognition
 Artificial Intelligence Journal
, 2010
"... Stackelberg games represent an important class of games in which one player, the leader, commits to a strategy and the remaining players, the followers, make their decision with knowledge of the leader’s commitment. Existing algorithms for Bayesian Stackelberg games find optimal solutions while mode ..."
Abstract

Cited by 5 (5 self)
 Add to MetaCart
Stackelberg games represent an important class of games in which one player, the leader, commits to a strategy and the remaining players, the followers, make their decisions with knowledge of the leader's commitment. Existing algorithms for Bayesian Stackelberg games find optimal solutions while modeling uncertainty over follower types with an a priori probability distribution. Unfortunately, in real-world applications, the leader may also face uncertainty over the follower's response, which makes the optimality guarantees of these algorithms fail. Such uncertainty arises because the follower's specific preferences, or the follower's observations of the leader's strategy, may not align with the rational strategy, and it is not amenable to a priori probability distributions. These conditions especially hold when dealing with human subjects. To address these uncertainties while providing quality guarantees, we propose three new robust algorithms based on mixed-integer linear programs (MILPs) for Bayesian Stackelberg games. A key result of this paper is a detailed experimental analysis demonstrating that these new MILPs deal better with human responses: a conclusion based on 800 games with 57 human subjects as followers. We also provide runtime results on these MILPs.
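For intuition, the commit-then-respond structure can be sketched by enumerating the leader's pure commitments against a perfectly rational follower. This toy omits everything the paper's MILPs add: mixed commitments, Bayesian follower types, and robustness to non-rational responses:

```python
# Toy Stackelberg solver over pure leader commitments: for each leader
# action i, the follower best-responds with j maximizing its own payoff,
# and the leader keeps the commitment yielding its highest resulting payoff.
def stackelberg_pure(leader_payoff, follower_payoff):
    best = None                                          # (i, j, leader value)
    for i in range(len(leader_payoff)):
        j = max(range(len(follower_payoff[i])),
                key=lambda j: follower_payoff[i][j])     # rational follower
        if best is None or leader_payoff[i][j] > best[2]:
            best = (i, j, leader_payoff[i][j])
    return best
```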
Approximation guarantees for fictitious play
 In Proceedings of the 47th Annual Allerton Conference on Communication, Control, and Computing
, 2009
"... Abstract—Fictitious play is a simple, wellknown, and oftenused algorithm for playing (and, especially, learning to play) games. However, in general it does not converge to equilibrium; even when it does, we may not be able to run it to convergence. Still, we may obtain an approximate equilibrium. I ..."
Abstract

Cited by 5 (0 self)
 Add to MetaCart
(Show Context)
Fictitious play is a simple, well-known, and often-used algorithm for playing (and, especially, learning to play) games. However, in general it does not converge to equilibrium; even when it does, we may not be able to run it to convergence. Still, we may obtain an approximate equilibrium. In this paper, we study the approximation properties that fictitious play obtains when it is run for a limited number of rounds. We show that if both players randomize uniformly over their actions in the first r rounds of fictitious play, then the result is an ε-equilibrium, where ε = (r + 1)/(2r). (Since we are examining only a constant number of pure strategies, we know that ε < 1/2 is impossible, due to a result of Feder et al.) We show that this bound is tight in the worst case; however, with an experiment on random games, we illustrate that fictitious play usually obtains a much better approximation. We then consider the possibility that the players fail to choose the same r. We show how to obtain the optimal approximation guarantee when both the opponent's r and the game are adversarially chosen (but there is an upper bound R on the opponent's r), using a linear program formulation. We show that if the action played in the i-th round of fictitious play is chosen with probability proportional to 1 for i = 1 and 1/(i − 1) for all 2 ≤ i ≤ R + 1, this gives an approximation guarantee of 1 − 1/(2 + ln R). We also obtain a lower bound of 1 − 4/ln R. This provides an actionable prescription for how long to run fictitious play.
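The r-round procedure can be sketched as plain fictitious play that returns the empirical (time-averaged) strategies, together with the paper's ε = (r + 1)/(2r) bound as a function. Deterministic best responses with first-index tie-breaking are an assumption of this sketch; the paper's result concerns players who randomize uniformly in the first r rounds:

```python
# r rounds of fictitious play on a bimatrix game (A for player 1, B for
# player 2); each round, each player best-responds to the opponent's
# empirical action counts. Returns the empirical mixed strategies.
def fictitious_play(A, B, r):
    n, m = len(A), len(A[0])
    count1, count2 = [0] * n, [0] * m
    a, b = 0, 0                            # arbitrary opening actions
    for _ in range(r):
        count1[a] += 1; count2[b] += 1
        a = max(range(n), key=lambda i: sum(A[i][j] * count2[j] for j in range(m)))
        b = max(range(m), key=lambda j: sum(B[i][j] * count1[i] for i in range(n)))
    return [c / r for c in count1], [c / r for c in count2]

# The paper's guarantee after r rounds (approaches 1/2 as r grows).
def eps_bound(r):
    return (r + 1) / (2 * r)
```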
Learning Initial Trust among Interacting Agents
"... Abstract. Trust learning is a crucial aspect of information exchange, negotiation, and any other kind of social interaction among autonomous agents in open systems. But most current probabilistic models for computational trust learning lack the ability to take context into account when trying to pre ..."
Abstract

Cited by 4 (0 self)
 Add to MetaCart
(Show Context)
Trust learning is a crucial aspect of information exchange, negotiation, and any other kind of social interaction among autonomous agents in open systems. But most current probabilistic models for computational trust learning lack the ability to take context into account when trying to predict the future behavior of interacting agents. Moreover, they are not able to transfer knowledge gained in a specific context to a related context. Humans, by contrast, have proven to be especially skilled at perceiving traits like trustworthiness in such so-called initial trust situations. The same restriction applies to most multi-agent learning problems: in complex scenarios most algorithms do not scale well to large state spaces and need numerous interactions to learn. We argue that trust-related scenarios are best represented in a system of relations to capture semantic knowledge. Following recent work on nonparametric Bayesian models, we propose a flexible and context-sensitive way to model and learn multidimensional trust values which is particularly well suited to establishing trust among strangers without a prior relationship. To evaluate our approach we extend a multi-agent framework by allowing agents to break an agreed interaction outcome retrospectively. The results suggest that the inherent ability to discover clusters, and relationships between clusters, that are best supported by the data allows predictions about the future behavior of agents, especially when initial trust is involved.
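The paper's relational nonparametric model is beyond a short sketch, but the basic probabilistic trust update such models generalize is a Beta-Bernoulli posterior over whether a partner honors agreements. This simplification is offered only for orientation and is not the paper's method:

```python
# Beta-Bernoulli trust update: trust is the posterior mean probability
# that a partner honors an agreement, starting from a Beta(alpha, beta) prior.
def update_trust(alpha, beta, honored):
    """Record one interaction: honored=True if the agreement was kept."""
    return (alpha + 1, beta) if honored else (alpha, beta + 1)

def trust_estimate(alpha, beta):
    return alpha / (alpha + beta)   # posterior mean of Beta(alpha, beta)
```

What the paper adds on top of this kind of update is context: clustering agents and situations so that evidence from related contexts informs the "initial trust" placed in a stranger.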