Results 1–10 of 46
Dual averaging methods for regularized stochastic learning and online optimization
In Advances in Neural Information Processing Systems 23, 2009
Cited by 131 (7 self)
We consider regularized stochastic learning and online optimization problems, where the objective function is the sum of two convex terms: one is the loss function of the learning task, and the other is a simple regularization term such as the ℓ1-norm for promoting sparsity. We develop extensions of Nesterov’s dual averaging method that can exploit the regularization structure in an online setting. At each iteration of these methods, the learning variables are adjusted by solving a simple minimization problem that involves the running average of all past subgradients of the loss function and the whole regularization term, not just its subgradient. In the case of ℓ1-regularization, our method is particularly effective in obtaining sparse solutions. We show that these methods achieve the optimal convergence rates or regret bounds that are standard in the literature on stochastic and online convex optimization. For stochastic learning problems in which the loss functions have Lipschitz continuous gradients, we also present an accelerated version of the dual averaging method.
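The ℓ1-regularized minimization solved at each iteration has a closed-form soft-threshold solution. The sketch below is a minimal illustration, not the paper's implementation; the quadratic weight γ/√t and all names are assumptions made for this example:

```python
import numpy as np

def rda_l1_step(g_bar, t, lam, gamma):
    """Minimize <g_bar, x> + lam*||x||_1 + (gamma / sqrt(t)) * ||x||^2 / 2
    componentwise over x; any coordinate whose averaged subgradient has
    magnitude at most lam is set exactly to zero, which is what produces
    sparse iterates."""
    shrink = np.maximum(np.abs(g_bar) - lam, 0.0)  # soft threshold
    return -(np.sqrt(t) / gamma) * np.sign(g_bar) * shrink
```

Unlike a plain subgradient step, which rarely lands exactly on zero, this update truncates every coordinate whose averaged subgradient stays below the threshold λ.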
Constrained consensus and optimization in multi-agent networks
IEEE Transactions on Automatic Control, 2008
Cited by 114 (6 self)
We present distributed algorithms that can be used by multiple agents to align their estimates with a particular value over a network with time-varying connectivity. Our framework is general in that this value can represent a consensus value among multiple agents or an optimal solution of an optimization problem, where the global objective function is a combination of local agent objective functions. Our main focus is on constrained problems where the estimate of each agent is restricted to lie in a different constraint set. To highlight the effects of constraints, we first consider a constrained consensus problem and present a distributed “projected consensus algorithm” in which agents combine their local averaging operation with projection onto their individual constraint sets. This algorithm can be viewed as a version of an alternating projection method with weights that vary over time and across agents. We establish convergence and convergence rate results for the projected consensus algorithm. We next study a constrained optimization problem for optimizing the …
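One iteration of the projected consensus idea — average the neighbors' estimates, then project onto one's own constraint set — can be sketched as follows (the doubly stochastic weight matrix `W`, the box constraint sets, and all names are illustrative assumptions):

```python
import numpy as np

def projected_consensus_step(x, W, projections):
    """x: (n, d) array of agent estimates; W: (n, n) mixing weights;
    projections: one Euclidean projection operator per agent's constraint
    set. Each agent averages the estimates it receives, then projects the
    average onto its own set, staying feasible for that set at all times."""
    mixed = W @ x
    return np.vstack([projections[i](mixed[i]) for i in range(len(projections))])
```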
Distributed stochastic subgradient projection algorithms for convex optimization
Journal of Optimization Theory and Applications, 2010
Cited by 82 (0 self)
Abstract. We consider a distributed multi-agent network system where the goal is to minimize a sum of convex objective functions of the agents subject to a common convex constraint set. Each agent maintains an iterate sequence and communicates the iterates to its neighbors. Then, each agent combines weighted averages of the received iterates with its own iterate, and adjusts the iterate by using subgradient information (known with stochastic errors) of its own function and by projecting onto the constraint set. The goal of this paper is to explore the effects of stochastic subgradient errors on the convergence of the algorithm. We first consider the behavior of the algorithm in mean, and then the convergence with probability 1 and in mean square. We consider general stochastic errors that have uniformly bounded second moments and obtain bounds on the limiting performance of the algorithm in mean for diminishing and non-diminishing stepsizes. When the means of the errors diminish, we prove that there is mean consensus between the agents and mean convergence to the optimum function value for diminishing stepsizes. When the mean errors diminish sufficiently fast, we strengthen the results to consensus and convergence of the iterates to an optimal solution with probability 1 and in mean square.
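A minimal sketch of one such iteration follows; the Gaussian noise model, box projection, and constant stepsize are assumptions made for illustration (the paper analyzes both diminishing and non-diminishing stepsizes):

```python
import numpy as np

def dssp_step(x, W, subgrads, alpha, project, rng, noise_std=0.1):
    """One distributed stochastic subgradient projection iteration:
    each agent averages its neighbors' iterates, steps along a noisy
    subgradient of its own function, and projects onto the common set."""
    mixed = W @ x
    out = np.empty_like(x)
    for i in range(x.shape[0]):
        g = subgrads[i](mixed[i]) + noise_std * rng.standard_normal(x.shape[1])
        out[i] = project(mixed[i] - alpha * g)
    return out
```

With a constant stepsize the agents hover near, but not exactly at, consensus on the optimum; diminishing stepsizes are what give exact convergence in the paper's analysis.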
Asynchronous gossip algorithms for stochastic optimization
In Proceedings of the 48th IEEE Conference on Decision and Control, 2009
Cited by 23 (1 self)
Abstract — We consider a distributed multi-agent network system where the goal is to minimize an objective function that can be written as the sum of component functions, each of which is known partially (with stochastic errors) to a specific network agent. We propose an asynchronous algorithm that is motivated by random gossip schemes in which each agent has a local Poisson clock. At each tick of its local clock, the agent averages its estimate with a randomly chosen neighbor and adjusts the average using the gradient of its local function, which is computed with stochastic errors. We investigate the convergence properties of the algorithm for two different classes of functions. First, we consider differentiable, but not necessarily convex, functions and prove that the gradients converge to zero with probability 1. Then, we consider convex, but not necessarily differentiable, functions and show that the iterates converge to an optimal solution almost surely.
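A single clock tick of such a scheme can be sketched as below. The exact split of roles between the two agents, the stepsize, and the noise model are assumptions made for illustration, not the authors' precise protocol:

```python
import numpy as np

def gossip_tick(x, i, neighbors, grad_i, alpha, rng, noise_std=0.0):
    """Agent i's local Poisson clock ticks: it averages its estimate with a
    randomly chosen neighbor j, then adjusts the average using a stochastic
    gradient of its local objective function."""
    j = rng.choice(neighbors[i])
    avg = 0.5 * (x[i] + x[j])
    noise = noise_std * rng.standard_normal()
    x[i] = avg - alpha * (grad_i(avg) + noise)
    x[j] = avg
    return x
```

Because clocks tick asynchronously, only the pair (i, j) updates at any instant; no global synchronization or common clock is assumed.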
Distributed Multi-Agent Optimization with State-Dependent Communication
2010
Cited by 23 (2 self)
We study distributed algorithms for solving global optimization problems in which the objective function is the sum of local objective functions of agents and the constraint set is given by the intersection of local constraint sets of agents. We assume that each agent knows only his own local objective function and constraint set, and exchanges information with the other agents over a randomly varying network topology to update his information state. We assume a state-dependent communication model over this topology: communication is Markovian with respect to the states of the agents, and the probability with which the links are available depends on the states of the agents. In this paper, we study a projected multi-agent subgradient algorithm under state-dependent communication. The algorithm involves each agent performing a local averaging to combine his estimate with the other agents’ estimates, taking a subgradient step along his local objective function, and projecting the estimates …
Newton-Raphson consensus for distributed convex optimization
In CDC and European Control Conference, 2011
Cited by 21 (9 self)
Abstract — We study the problem of unconstrained distributed optimization in the context of multi-agent systems subject to limited communication connectivity. In particular, we focus on the minimization of a sum of convex cost functions, where each component of the global function is available only to a specific agent and can thus be seen as a private local cost. The agents need to cooperate to compute the minimizer of the sum of all costs. We propose a consensus-like strategy to estimate a Newton-Raphson descent update for the local estimates of the global minimizer at each agent. In particular, the algorithm is based on the separation-of-time-scales principle, and it is proved to converge to the global minimizer if a specific parameter that tunes the rate of convergence is chosen sufficiently small. We also provide numerical simulations and compare them with alternative distributed optimization strategies such as the Alternating Direction Method of Multipliers and the Distributed Subgradient Method. Index Terms — distributed optimization, convex optimization, consensus algorithms, multi-agent systems, Newton-Raphson methods
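For a scalar decision variable, the consensus-like strategy can be sketched as two dynamic consensus recursions that track the network averages of f_i''(x_i)·x_i − f_i'(x_i) and f_i''(x_i), with each agent moving a small step ε toward the resulting Newton-Raphson estimate. The recursion layout, initialization, and parameter values below are assumptions based on the abstract, not the authors' exact algorithm:

```python
import numpy as np

def nrc_step(x, y, z, num_prev, h_prev, W, d1, d2, eps=0.1):
    """One sketched Newton-Raphson consensus iteration. y and z run
    dynamic consensus on h_i*x_i - g_i and h_i, where g_i = f_i'(x_i) and
    h_i = f_i''(x_i); each x_i is nudged toward y_i / z_i, its local
    consensus-based Newton-Raphson estimate of the global minimizer."""
    g = np.array([d1[i](x[i]) for i in range(len(x))])
    h = np.array([d2[i](x[i]) for i in range(len(x))])
    num = h * x - g
    y = W @ (y + num - num_prev)   # dynamic consensus on the numerator
    z = W @ (z + h - h_prev)       # dynamic consensus on the denominator
    x = (1.0 - eps) * x + eps * (y / z)
    return x, y, z, num, h
```

Keeping ε small realizes the separation of time scales the abstract mentions: the consensus recursions settle quickly relative to the slow drift of the local estimates.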
Distributed and non-autonomous power control through distributed convex optimization
 IEEE INFOCOM
Cited by 16 (5 self)
Abstract — We consider the uplink power control problem where mobile users in different cells are communicating with their base stations. We formulate the power control problem as the minimization of a sum of convex functions. Each component function depends on the channel coefficients from all the mobile users to a specific base station and is assumed to be known only to that base station (CSIR only). We then view the power control problem as a distributed optimization problem that is to be solved by the base stations, and we propose convergent, distributed, and iterative power control algorithms. These algorithms require each base station to communicate with the base stations in its neighboring cells in each iteration and are hence non-autonomous. Since the base stations are connected through a wired backbone, the communication overhead is not an issue. The convergence of the algorithms is shown theoretically and also verified through numerical simulations.
On stochastic gradient and subgradient methods with adaptive steplength sequences
Automatica, 2012
Cited by 15 (4 self)
Traditionally, stochastic approximation (SA) schemes have been popular choices for solving stochastic optimization problems. However, the performance of standard SA implementations can vary significantly based on the choice of the steplength sequence, and in general, little guidance is provided about good choices. Motivated by this gap, in the first part of the paper we present two adaptive steplength schemes for strongly convex differentiable stochastic optimization problems, equipped with convergence theory, that aim to overcome some of the reliance on user-specific parameters. Of these, the first scheme, referred to as a recursive steplength stochastic approximation (RSA) scheme, optimizes the error bounds to derive a rule that expresses the steplength at a given iteration as a simple function of the steplength at the previous iteration and certain problem parameters. The second scheme, termed a cascading steplength stochastic approximation (CSA) scheme, maintains the steplength sequence as a piecewise-constant decreasing function, with the reduction in the steplength occurring when a suitable error threshold is met. In the second part of the paper, we allow for nondifferentiable objectives but with bounded subgradients over a certain domain. In such a regime, we propose a local smoothing technique, based on random local perturbations of the objective function, that leads to a differentiable approximation of the function. Assuming a uniform distribution on the local randomness, we establish a Lipschitzian property for the gradient of the approximation and prove that the obtained Lipschitz bound grows at a modest rate with problem size. This facilitates the development of an adaptive steplength stochastic approximation framework, which now requires sampling in the product space of the original measure and the artificially introduced distribution. The resulting adaptive steplength schemes are applied to three stochastic optimization problems. In particular, we observe that both schemes perform well in practice and display markedly less reliance on user-defined parameters.
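As a hedged illustration of the recursive idea, the sketch below uses the update a_{k+1} = a_k(1 − η·a_k), a simple rule of the RSA type in which the next steplength depends only on the current one and a problem constant η (here tied to the strong convexity constant). The paper's actual rule and constants differ, and all names here are assumptions:

```python
import numpy as np

def rsa_sgd(grad, x0, a0, eta, n_steps, rng, noise_std=0.0):
    """Stochastic gradient descent with a recursively updated steplength:
    a_{k+1} = a_k * (1 - eta * a_k), so the steplength shrinks
    monotonically (roughly like 1/(eta*k)) without a hand-tuned schedule.
    Requires 0 < a0 < 1/eta so the sequence stays positive."""
    x, a = np.asarray(x0, dtype=float), a0
    for _ in range(n_steps):
        g = grad(x) + noise_std * rng.standard_normal(x.shape)
        x = x - a * g
        a = a * (1.0 - eta * a)
    return x, a
```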
Greedy gossip with eavesdropping
In Proc. IEEE Int. Symp. on Wireless Pervasive Computing, 2008
Cited by 15 (6 self)
This paper presents greedy gossip with eavesdropping (GGE), a new average consensus algorithm for wireless sensor network applications. Consensus algorithms have recently received much attention in the sensor network community because of their simplicity and completely decentralized nature, which makes them robust to changes in the network topology and to unreliable wireless networking environments. In a sensor network, each node has a measurement value, and the aim of average consensus is to compute the average of these node values in the absence of a central authority. We prove that GGE converges to the average consensus with probability one. We also illustrate the performance of the algorithm via simulations and conclude that GGE provides a significant performance improvement compared to existing average consensus algorithms such as randomized gossip and geographic gossip.
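One GGE round can be sketched as follows (the adjacency-list representation and tie-breaking are assumptions; in the actual protocol each node learns its neighbors' current values by overhearing their wireless broadcasts rather than by querying them):

```python
import numpy as np

def gge_round(x, neighbors, rng):
    """A uniformly random node greedily gossips with the neighbor whose
    current value differs most from its own; the pair replaces both values
    with their average. Pairwise averaging preserves the network-wide mean,
    and the greedy choice removes the largest local disagreement first."""
    i = int(rng.integers(len(x)))
    j = max(neighbors[i], key=lambda k: abs(x[k] - x[i]))
    avg = 0.5 * (x[i] + x[j])
    x[i] = x[j] = avg
    return x
```

The greedy neighbor choice is the difference from randomized gossip, where the partner is picked uniformly at random.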
Single timescale regularized stochastic approximation schemes for monotone Nash games under uncertainty
In Proceedings of the IEEE Conference on Decision and Control (CDC), 2010
Cited by 10 (9 self)
Abstract — In this paper, we consider the distributed computation of equilibria arising in monotone stochastic Nash games over continuous strategy sets. Such games arise in settings where the gradient map of the player objectives is a monotone mapping over the Cartesian product of strategy sets, leading to a monotone stochastic variational inequality. We consider the application of projection-based stochastic approximation schemes. However, such techniques are characterized by a key shortcoming: they can accommodate strongly monotone mappings only. In fact, standard extensions of stochastic approximation schemes for merely monotone mappings require the solution of a sequence of related strongly monotone problems, natively a two-timescale scheme. Accordingly, we consider the development of single-timescale techniques for computing equilibria when the associated gradient map does not admit strong monotonicity. We first show that, under suitable assumptions, standard projection schemes can indeed be extended to allow for strict, rather than strong, monotonicity. Furthermore, we introduce a class of regularized stochastic approximation schemes in which the regularization parameter is updated at every step, leading to a single-timescale method. The scheme is a stochastic extension of an iterative Tikhonov regularization method, and its global convergence is established. To aid in networked implementations, we consider an extension to this result where players are allowed to choose their steplengths independently, and show that if the deviation across their choices is suitably constrained, then the convergence of the scheme may be claimed.
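The single-timescale idea can be sketched as below. The specific decay rates γ_k = k^{-0.6} and ε_k = k^{-0.3}, the box projection, and the noise model are illustrative assumptions; the key feature is that the Tikhonov regularization parameter ε_k is driven to zero together with the steplength, updated at every iteration rather than held fixed over inner loops:

```python
import numpy as np

def regularized_sa(F, project, x0, n_steps, rng, noise_std=0.0):
    """Stochastic iterative Tikhonov sketch: each projection step uses the
    regularized map F(x) + eps_k * x, with eps_k decaying more slowly than
    the steplength gamma_k, so a merely monotone map F can be handled on a
    single timescale."""
    x = np.asarray(x0, dtype=float)
    for k in range(1, n_steps + 1):
        gamma = k ** -0.6              # steplength; the sum diverges
        eps = k ** -0.3                # regularization; decays more slowly
        w = noise_std * rng.standard_normal(x.shape)
        x = project(x - gamma * (F(x) + eps * x + w))
    return x
```

The test below uses a skew-symmetric map, a standard example that is monotone but not strongly monotone, for which an unregularized projection scheme would merely circle the solution.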