Results 1  10
of
4,472,572
Finitetime analysis of the multiarmed bandit problem
 Machine Learning
, 2002
"... Abstract. Reinforcement learning policies face the exploration versus exploitation dilemma, i.e. the search for a balance between exploring the environment to find profitable actions while taking the empirically best action as often as possible. A popular measure of a policyâ€™s success in addressing ..."
Abstract

Cited by 804 (15 self)
 Add to MetaCart
this dilemma is the regret, that is the loss due to the fact that the globally optimal policy is not followed all the times. One of the simplest examples of the exploration/exploitation dilemma is the multiarmed bandit problem. Lai and Robbins were the first ones to show that the regret for this problem has
Virtual time
 ACM Transactions on Programming Languages and Systems
, 1985
"... Virtual time is a new paradigm for organizing and synchronizing distributed systems which can be applied to such problems as distributed discrete event simulation and distributed database concurrency control. Virtual time provides a flexible abstraction of real time in much the same way that virtua ..."
Abstract

Cited by 974 (7 self)
 Add to MetaCart
Virtual time is a new paradigm for organizing and synchronizing distributed systems which can be applied to such problems as distributed discrete event simulation and distributed database concurrency control. Virtual time provides a flexible abstraction of real time in much the same way
Consensus Problems in Networks of Agents with Switching Topology and TimeDelays
, 2003
"... In this paper, we discuss consensus problems for a network of dynamic agents with fixed and switching topologies. We analyze three cases: i) networks with switching topology and no timedelays, ii) networks with fixed topology and communication timedelays, and iii) maxconsensus problems (or leader ..."
Abstract

Cited by 1052 (17 self)
 Add to MetaCart
In this paper, we discuss consensus problems for a network of dynamic agents with fixed and switching topologies. We analyze three cases: i) networks with switching topology and no timedelays, ii) networks with fixed topology and communication timedelays, and iii) maxconsensus problems (or
A Note on the Confinement Problem
, 1973
"... This not explores the problem of confining a program during its execution so that it cannot transmit information to any other program except its caller. A set of examples attempts to stake out the boundaries of the problem. Necessary conditions for a solution are stated and informally justified. ..."
Abstract

Cited by 532 (0 self)
 Add to MetaCart
This not explores the problem of confining a program during its execution so that it cannot transmit information to any other program except its caller. A set of examples attempts to stake out the boundaries of the problem. Necessary conditions for a solution are stated and informally justified.
Where the REALLY Hard Problems Are
 IN J. MYLOPOULOS AND R. REITER (EDS.), PROCEEDINGS OF 12TH INTERNATIONAL JOINT CONFERENCE ON AI (IJCAI91),VOLUME 1
, 1991
"... It is well known that for many NPcomplete problems, such as KSat, etc., typical cases are easy to solve; so that computationally hard cases must be rare (assuming P != NP). This paper shows that NPcomplete problems can be summarized by at least one "order parameter", and that the hard p ..."
Abstract

Cited by 681 (1 self)
 Add to MetaCart
It is well known that for many NPcomplete problems, such as KSat, etc., typical cases are easy to solve; so that computationally hard cases must be rare (assuming P != NP). This paper shows that NPcomplete problems can be summarized by at least one "order parameter", and that the hard
The Hungarian method for the assignment problem
 Naval Res. Logist. Quart
, 1955
"... Assuming that numerical scores are available for the performance of each of n persons on each of n jobs, the "assignment problem" is the quest for an assignment of persons to jobs so that the sum of the n scores so obtained is as large as possible. It is shown that ideas latent in the work ..."
Abstract

Cited by 1238 (0 self)
 Add to MetaCart
Assuming that numerical scores are available for the performance of each of n persons on each of n jobs, the "assignment problem" is the quest for an assignment of persons to jobs so that the sum of the n scores so obtained is as large as possible. It is shown that ideas latent
The Symbol Grounding Problem
, 1990
"... There has been much discussion recently about the scope and limits of purely symbolic models of the mind and about the proper role of connectionism in cognitive modeling. This paper describes the "symbol grounding problem": How can the semantic interpretation of a formal symbol system be m ..."
Abstract

Cited by 1072 (18 self)
 Add to MetaCart
There has been much discussion recently about the scope and limits of purely symbolic models of the mind and about the proper role of connectionism in cognitive modeling. This paper describes the "symbol grounding problem": How can the semantic interpretation of a formal symbol system
Alternatingtime Temporal Logic
 Journal of the ACM
, 1997
"... Temporal logic comes in two varieties: lineartime temporal logic assumes implicit universal quantification over all paths that are generated by system moves; branchingtime temporal logic allows explicit existential and universal quantification over all paths. We introduce a third, more general var ..."
Abstract

Cited by 615 (55 self)
 Add to MetaCart
a certain state. Also the problems of receptiveness, realizability, and controllability can be formulated as modelchecking problems for alternatingtime formulas.
Results 1  10
of
4,472,572