## Switch Packet Arbitration via Queue-Learning (2001)

Venue: In Proc. NIPS-14

Citations: 5 - 1 self

### Citations

5599 | Reinforcement Learning: An Introduction
- Sutton, Barto
- 1998
Citation Context ...es an exact match between RL and the problem task. As shown already, every aspect of this problem scales badly. The solution to this problem is threefold. First, we use online learning and afterstates [12] to eliminate the need to average over the $(N+1)^N$ possible next states. Second, we show how the value function can yield a set of inputs into a polynomial algorithm for choosing actions. Third, we ...

698 | An n^(5/2) Algorithm for Maximum Matchings in Bipartite Graphs
- Hopcroft, Karp
- 1973
Citation Context ...lynomial in N that considers only the current state q. For instance, the problem can be formulated as a so-called matching problem, and polynomial algorithms exist that will send the largest a possible [2, 6, 8]. While maximizing the packets sent in every time slot may seem like a solution, the problem is more interesting than this. In general, many possible a will maximize the number of packets that are sen...

667 | Data Structures and Network Algorithms
- Tarjan
- 1983
Citation Context ...$\in \{0, 1\}\ \forall i, j$; $\sum_i a_{ij} \le 1\ \forall j$; $\sum_j a_{ij} \le 1\ \forall i$. This problem can be solved as a linear program and is also known as the weighted matching or the assignment problem, which has a polynomial time solution [13]. In this way, we reduce the search over the O(N!) possible actions to a polynomial time solution. 3.3 Decomposing the Value Function. The interaction between queues in the same row or the same column...

526 | Achieving 100% Throughput in an Input-Queued Switch
- Mekkittikul, McKeown
- 1999
Citation Context ...itches. Many have fixed policies for sending packets that do not depend on the actual patterns of traffic in the network [10]. Under worst-case traffic, these arbitrators can perform quite poorly [8]. Theoretical work has shown that consideration of future packet arrivals can have a significant impact on switch performance but is computationally intractable (NP-hard) to use [4]. As we will show, a d...

232 | Packet routing in dynamically changing networks: A reinforcement learning approach
- Boyan, Littman
- 1994
Citation Context ...earning (RL) has been applied to resource allocation problems in telecommunications, e.g., channel allocation in wireless systems, network routing, and admission control in telecommunication networks [1, 3, 7, 11]. These have demonstrated that reinforcement learning can find good policies that significantly increase the application reward within the dynamics of the telecommunications problems. However, a key issue ...

137 | Reinforcement learning for dynamic channel allocation in cellular telephone systems
- Singh, Bertsekas
- 1997
Citation Context ...earning (RL) has been applied to resource allocation problems in telecommunications, e.g., channel allocation in wireless systems, network routing, and admission control in telecommunication networks [1, 3, 7, 11]. These have demonstrated that reinforcement learning can find good policies that significantly increase the application reward within the dynamics of the telecommunications problems. However, a key issue ...

125 | Algorithms for Reinforcement Learning
- Szepesvári
- 2010
Citation Context ...the total average cost. We use the Tauberian approximation, that is, we assume the discount factor is close enough to 1 that the discounted reward policy is equivalent to the average reward policy [5]. Since minimizing the expected value of this cost is equivalent to minimizing the expected wait time, this formulation provides an exact match between RL and the problem task. As shown already, every ...

54 | Call admission control and routing in integrated services networks using neuro-dynamic programming
- Marbach, Mihatsch, et al.
- 2000
Citation Context ...earning (RL) has been applied to resource allocation problems in telecommunications, e.g., channel allocation in wireless systems, network routing, and admission control in telecommunication networks [1, 3, 7, 11]. These have demonstrated that reinforcement learning can find good policies that significantly increase the application reward within the dynamics of the telecommunications problems. However, a key issue ...

32 | Switching Theory: Architectures and Performance in Broadband ATM Networks
- Pattavina
- 1998
Citation Context ...he switch. A number of packet arbitration strategies have been developed for switches. Many have fixed policies for sending packets that do not depend on the actual patterns of traffic in the network [10]. Under worst-case traffic, these arbitrators can perform quite poorly [8]. Theoretical work has shown that consideration of future packet arrivals can have a significant impact on the switch performance...

22 | Neural Network Design of a Banyan Network Controller
- Brown, Liu
- 1990
Citation Context ...lynomial in N that considers only the current state q. For instance, the problem can be formulated as a so-called matching problem, and polynomial algorithms exist that will send the largest a possible [2, 6, 8]. While maximizing the packets sent in every time slot may seem like a solution, the problem is more interesting than this. In general, many possible a will maximize the number of packets that are sen...

16 | Optimizing admission control while ensuring quality of service in multimedia networks via reinforcement learning
- Brown, Tong, et al.
- 1999

5 | Admission Control and Routing in Integrated Service Networks Using Neuro-Dynamic Programming
- Marbach, Tsitsiklis
- 2000

3 | NN Based ATM Scheduling with Queue Length Based Priority Scheme
- Park, Lee
- 1997
Citation Context ...that are sent. Which one can we send now so that we will be in the best possible state for future time slots? Some heuristics can guide this choice, but these are insensitive to the traffic pattern [9]. Further, it can be shown that to minimize the total wait it may be necessary to send fewer than the maximum number of packets in the current time slot [4]. So, we look to a solution that efficiently ...

1 | Future Information in Input Queueing (submitted to Computer Networks)
- Brown, Gabow
- 2001
Citation Context ... perform quite poorly [8]. Theoretical work has shown that consideration of future packet arrivals can have a significant impact on the switch performance but is computationally intractable (NP-hard) to use [4]. As we will show, a dynamic arbitration policy is difficult since the state space, possible transitions, and set of actions all grow exponentially with the size of the switch. In this paper, we consi...

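
The citation context for Tarjan (1983) describes choosing a switch action a as an assignment problem: each a_ij is 0 or 1, and every row and column of a sums to at most 1. The sketch below only illustrates that formulation on a tiny switch. It is not the paper's method: it enumerates all N! permutations by brute force, which is exactly the search the polynomial-time algorithms avoid. The function name `best_assignment`, the weight matrix, and the assumption that weights are non-negative (so a full permutation is always among the optimal actions) are all hypothetical choices made for this example.

```python
from itertools import permutations


def best_assignment(weights):
    """Brute-force the assignment problem for a small N x N weight matrix.

    weights[i][j] is a hypothetical benefit of sending a packet from
    input i to output j. Weights are assumed non-negative, so restricting
    the search to full permutations loses nothing.
    Returns (total_weight, action) where action[i][j] is 0 or 1.
    """
    n = len(weights)
    best_total, best_perm = None, None
    # Each permutation pairs every input with a distinct output,
    # so row and column sums of the resulting action are at most 1.
    for perm in permutations(range(n)):
        total = sum(weights[i][perm[i]] for i in range(n))
        if best_total is None or total > best_total:
            best_total, best_perm = total, perm
    action = [[1 if best_perm[i] == j else 0 for j in range(n)]
              for i in range(n)]
    return best_total, action


if __name__ == "__main__":
    example = [[4, 1, 0],
               [2, 3, 0],
               [0, 5, 1]]
    total, action = best_assignment(example)
    print(total)
    for row in action:
        print(row)
```

For a realistic switch size, the enumeration would be replaced by a polynomial-time assignment solver; the brute-force version is kept here only because it makes the constraints easy to check by hand.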