
## Stochastic Approximations and Perturbations in Forward-Backward Splitting for Monotone Operators

### Citations

952 | A stochastic approximation method
- Robbins, Monro
- 1951
Citation Context: …the approximate implementation of the resolvent operator J_{γ_n A}. Let (Ω, F, P) be the underlying probability space. An H-valued random variable is a measurable map x : (Ω, F) → (H, B) and, for every p ∈ [1, +∞[, L^p(Ω, F, P; H) denotes the space of equivalence classes of H-valued random variables x such that ∫_Ω ‖x‖^p dP < +∞.

Algorithm 1.3: Consider the setting of Problem 1.1. Let x_0, (u_n)_{n∈N}, and (a_n)_{n∈N} be random variables in L²(Ω, F, P; H), let (λ_n)_{n∈N} be a sequence in ]0, 1], and let (γ_n)_{n∈N} be a sequence in ]0, 2ϑ[. Set

(∀n ∈ N)  x_{n+1} = x_n + λ_n ( J_{γ_n A}(x_n − γ_n u_n) + a_n − x_n ).  (1.5)

The first instances of the stochastic iteration (1.5) can be traced back to [44] in the context of the gradient method, i.e., when A = 0 and B is the gradient of a convex function. Stochastic approximations in the gradient method were then investigated in the Russian literature of the late 1960s and early 1970s [27, 28, 29, 33, 42, 49]. Stochastic gradient methods have also been used extensively in adaptive signal processing, in control, and in machine learning, e.g., [3, 36, 54]. More generally, proximal stochastic gradient methods have been applied to various problems; see for instance [1, 26, 45, 48, 55]. The objective of the present paper is to provide an analysis of the stochastic forward-backward method in the context of Algorithm 1.3. Almost sure convergence of the iterates (x_n)_{n∈N} to a solution to Problem 1.1 will be established under general conditions on the sequences (u_n)_{n∈N}, (a_n)_{n∈N}, (γ_n)_{n∈N}, and (λ_n)_{n∈N}. In particular, a feature of the analysis is that it allows for relaxation parameters and it does not require that the proximal parameter sequence (γ_n)_{n∈N} be vanishing. The proofs are based on properties of stochastic quasi-Fejér iterations [18]…
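To make the iteration concrete, here is a minimal numerical sketch (my own construction, not the authors' implementation): an instance of Problem 1.2 with f = μ‖·‖₁, so that A = ∂f and J_{γA} is soft-thresholding, and a noisy gradient playing the role of u_n. The function names, the toy data, and the noise decay rate are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy instance of Problem 1.2: minimize f(x) + g(x) with
#   f = mu * ||.||_1     (A = subdifferential of f, J_{gamma A} = soft-thresholding)
#   g = 0.5*||Mx - b||^2 (B = grad g, cocoercive with theta = 1/||M^T M||).
M = rng.standard_normal((40, 10))
b = rng.standard_normal(40)
mu = 0.5
theta = 1.0 / np.linalg.norm(M.T @ M, 2)   # cocoercivity constant of B

def soft_threshold(x, t):
    """Resolvent J_{tA} of A = subdifferential of mu*||.||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - t * mu, 0.0)

def grad_g(x):
    return M.T @ (M @ x - b)

x = np.zeros(10)
gamma = theta            # any fixed step in ]0, 2*theta[
lam = 1.0                # relaxation parameter lambda_n in ]0, 1]
for n in range(2000):
    # u_n: stochastic approximation of B x_n; the noise decays summably,
    # as the almost-sure convergence theory requires.
    u = grad_g(x) + rng.standard_normal(10) / (n + 1) ** 2
    # iteration (1.5), with the resolvent error a_n taken to be 0
    x = x + lam * (soft_threshold(x - gamma * u, gamma) - x)

# Reference: a long noise-free forward-backward run on the same problem.
x_ref = np.zeros(10)
for _ in range(20000):
    x_ref = soft_threshold(x_ref - gamma * grad_g(x_ref), gamma)
print(np.linalg.norm(x - x_ref))  # small: both runs approach Argmin(f+g)
```

With summable gradient noise the stochastic run tracks the deterministic one closely, which is the behavior the paper's conditions on (u_n) and (a_n) are designed to guarantee.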

500 | Signal recovery by proximal forward-backward splitting. Multiscale Modeling and Simulation
- Combettes, Wajs
- 2005
Citation Context: Throughout the paper, H is a separable real Hilbert space with scalar product ⟨· | ·⟩, associated norm ‖·‖, and Borel σ-algebra B. A large array of problems arising in Hilbertian nonlinear analysis are captured by the following simple formulation.

Problem 1.1: Let A : H → 2^H be a set-valued maximally monotone operator, let ϑ ∈ ]0, +∞[, and let B : H → H be a ϑ-cocoercive operator, i.e.,

(∀x ∈ H)(∀y ∈ H)  ⟨x − y | Bx − By⟩ ≥ ϑ‖Bx − By‖²,  (1.1)

such that F = { z ∈ H | 0 ∈ Az + Bz } ≠ ∅.  (1.2)

The problem is to find a point in F. Instances of Problem 1.1 are found in areas such as evolution inclusions [2], optimization [4, 38, 51], Nash equilibria [7], image recovery [8, 10, 15], inverse problems [9, 13], signal processing [21], statistics [25], machine learning [26], variational inequalities [31, 52], mechanics [40, 41], and structure design [50]. For instance, an important specialization of Problem 1.1 in the context of convex optimization is the following [4, Section 27.3].

Problem 1.2: Let f : H → ]−∞, +∞] be a proper lower semicontinuous convex function, let ϑ ∈ ]0, +∞[, and let g : H → R be a differentiable convex function such that ∇g is ϑ⁻¹-Lipschitz continuous on H. The problem is to minimize f(x) + g(x) over x ∈ H,  (1.3), under the assumption that F = Argmin(f + g) ≠ ∅.

A standard method to solve Problem 1.1 is the forward-backward algorithm [14, 38, 52], which constructs a sequence (x_n)_{n∈N} in H by iterating

(∀n ∈ N)  x_{n+1} = J_{γ_n A}(x_n − γ_n B x_n),  where 0 < γ_n < 2ϑ.  (1.4)

Recent theoretical advances on deterministic versions of this algorithm can be found in [6, 11, 20, 22]. A major motivation for studying the forward-backward algorithm is that it can be applied not only to Problem 1.1 per se, but also to systems of coupled monotone inclusions via product space reformulations [2], to strongly monotone composite inclusion problems via duality arguments [15, 20], and to primal-dual composite problems via renorming in the primal-dual space [20, 53].
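As an illustration of the operators in Problem 1.1 (my own toy construction, not from the paper), take A to be the normal cone of a box, so that J_{γA} is the projection onto the box, and B the affine monotone map x ↦ Qx − c with Q symmetric positive definite, which is cocoercive with ϑ = 1/‖Q‖. The standard forward-backward iteration x_{n+1} = J_{γA}(x_n − γBx_n) then becomes projected gradient descent for a box-constrained quadratic:

```python
import numpy as np

# Forward-backward splitting for 0 ∈ Ax + Bx with
#   A = normal cone of the box [0,1]^d  (J_{gamma A} = projection onto the box)
#   B = x -> Qx - c, Q symmetric PSD    (cocoercive with theta = 1/||Q||).
rng = np.random.default_rng(1)
d = 5
R = rng.standard_normal((d, d))
Q = R @ R.T + np.eye(d)          # symmetric positive definite
c = rng.standard_normal(d)
theta = 1.0 / np.linalg.norm(Q, 2)

proj_box = lambda x: np.clip(x, 0.0, 1.0)   # resolvent of the normal cone

x = np.zeros(d)
gamma = 1.5 * theta              # any fixed step in ]0, 2*theta[
for _ in range(500):
    x = proj_box(x - gamma * (Q @ x - c))   # x_{n+1} = J_{gamma A}(x_n - gamma B x_n)

# A zero of A + B solves the variational inequality
#   <Qx - c, y - x> >= 0 for all y in the box,
# i.e. x is a fixed point of the projected-gradient map; check the residual.
print(np.linalg.norm(x - proj_box(x - gamma * (Q @ x - c))))
```

The near-zero residual shows the iterates converging to a point of F, matching the behavior guaranteed for step sizes in ]0, 2ϑ[.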

433 | A first-order primal-dual algorithm for convex problems with applications to imaging
- Chambolle, Pock
- 2011
Citation Context: …(∀n ∈ N) … ≤ α̃_n ‖(x, v)‖_V + β̃_n,  (5.25) where α̃_n = √2 ‖V‖^{1/2} (1 + 2‖U‖^{1/2}‖L‖) max{1, ‖WL*‖} ‖V^{−1}‖^{1/2} α_n and β̃_n = ‖V‖^{1/2} (1 + 2‖U‖^{1/2}‖L‖) β_n.  (5.26) Thus, ∑_{n∈N} √λ_n α̃_n < +∞ and ∑_{n∈N} λ_n β̃_n < +∞. Finally, (5.16) and (g) guarantee that sup_{n∈N} (1 + τ_n)γ_n < 2ϑ. All the assumptions of Proposition 4.4 are therefore satisfied for algorithm (5.17), which concludes the proof.

Remark 5.4: (i) Algorithm 5.10 can be viewed as a stochastic version of the primal-dual algorithm investigated in [20, Example 6.4] when the metric is fixed in the latter. Particular cases of such fixed-metric primal-dual algorithms can be found in [12, 16, 30, 34, 35]. (ii) The same type of primal-dual algorithm is investigated in [5, 43] in a different context, since in those papers the stochastic nature of the algorithms stems from the random activation of blocks of variables.

5.2 Example: We illustrate an implementation of Algorithm 5.2 in a simple scenario with H = R^N by constructing an example in which the gradient approximation conditions are fulfilled. For every k ∈ {1, …, q} and every n ∈ N, set s_{k,n} = ∇j*_k(v_{k,n}) and suppose that (y_n)_{n∈N} is almost surely bounded. This assumption is satisfied, in particular, if dom f and (b_n)_{n∈N} are bounded…

396 | Minimization Methods for Non-Differentiable Functions
- Shor
- 1985

342 | Stochastic Limit Theory
- Davidson
- 1994

332 | Probability in Banach spaces: Isoperimetry and processes
- Ledoux, Talagrand
- 1991
Citation Context: …though this will not always be expressly mentioned. Let E be a sub-sigma-algebra of F, let x ∈ L¹(Ω, F, P; H), and let y ∈ L¹(Ω, E, P; H). Then y is the conditional expectation of x with respect to E if (∀E ∈ E) ∫_E x dP = ∫_E y dP; in this case we write y = E(x | E). We have

(∀x ∈ L¹(Ω, F, P; H))  ‖E(x | E)‖ ≤ E(‖x‖ | E).  (2.4)

In addition, L²(Ω, F, P; H) is a Hilbert space and, for every x ∈ L²(Ω, F, P; H),

‖E(x | E)‖² ≤ E(‖x‖² | E)  and  (∀u ∈ H) E(⟨x | u⟩ | E) = ⟨E(x | E) | u⟩.  (2.5)

Geometrically, if x ∈ L²(Ω, F, P; H), then E(x | E) is the projection of x onto L²(Ω, E, P; H). For background on probability in Hilbert spaces, see [32, 37].

3 An asymptotic principle: In this section, we establish an asymptotic principle which will lay the foundation for the convergence analysis of our stochastic forward-backward algorithm. First, we need the following result.

Proposition 3.1: Let F be a nonempty closed subset of H, let φ : [0, +∞[ → [0, +∞[ be a strictly increasing function such that lim_{t→+∞} φ(t) = +∞, let (x_n)_{n∈N} be a sequence of H-valued random variables, and let (X_n)_{n∈N} be a sequence of sub-sigma-algebras of F such that (∀n ∈ N) σ(x_0, …, x_n) ⊂ X_n ⊂ X_{n+1}. (3.1) Suppose that, for every z ∈ F, there exist (ϑ_n(z))_{n∈N} ∈ ℓ₊(X)…
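The projection interpretation of E(x | E) is easy to verify on a finite probability space, where a sub-sigma-algebra is generated by a partition and conditional expectation is cellwise averaging. The following check (my own sketch; the partition and data are arbitrary) confirms the orthogonality behind (2.5) and the pointwise inequality (2.4):

```python
import numpy as np

# Finite sanity check of (2.4)-(2.5): on Omega = {0,...,7} with uniform P,
# a sub-sigma-algebra E generated by a partition makes E(x|E) the cellwise
# average of x, i.e. the L^2 projection onto E-measurable functions.
rng = np.random.default_rng(2)
x = rng.standard_normal(8)                     # a random variable on 8 atoms
cells = [np.array([0, 1, 2]), np.array([3, 4]), np.array([5, 6, 7])]

cond = np.empty_like(x)
for cell in cells:                             # E(x|E): average over each cell
    cond[cell] = x[cell].mean()

# Projection property: x - E(x|E) is orthogonal to every E-measurable y.
y = np.empty_like(x)
for i, cell in enumerate(cells):
    y[cell] = float(i)                         # an arbitrary E-measurable y
print(abs(np.mean((x - cond) * y)))            # ~0: orthogonality

# Conditional Jensen, as in (2.4): |E(x|E)| <= E(|x| | E) pointwise,
# and taking expectations of (2.5): E(E(x|E)^2) <= E(x^2).
cond_abs = np.empty_like(x)
for cell in cells:
    cond_abs[cell] = np.abs(x[cell]).mean()
assert np.all(np.abs(cond) <= cond_abs + 1e-12)
print(np.mean(cond ** 2) <= np.mean(x ** 2))   # True: projections shrink norms
```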

322 | Adaptive Signal Processing
- Widrow, Stearns
- 1985

278 | Convex Analysis and Monotone Operator Theory in Hilbert Spaces
- Bauschke, Combettes
- 2010

257 | Finite-dimensional variational inequalities and complementarity problems
- Facchinei, Pang

133 | Solving monotone inclusions via compositions of nonexpansive averaged operators
- Combettes
- 2004

129 | Efficient online and batch learning using forward backward splitting
- Duchi, Singer
- 2009

126 | Applications of a splitting algorithm to decomposition in convex programming and variational inequalities,
- Tseng
- 1991

102 | Stochastic dual coordinate ascent methods for regularized loss
- Shalev-Shwartz, Zhang

95 | A general framework for a class of first order primal-dual algorithms for convex optimization in imaging science
- Esser, Zhang, et al.
- 2010

71 | Primal-dual splitting algorithm for solving inclusions with mixtures of composite, Lipschitzian, and parallel-sum type monotone operators,
- Combettes, Pesquet
- 2012
Citation Context: …assume the general form of Problem 1.1 and the primal solution can trivially be recovered from any dual solution. In (5.1), z ∈ H, ρ ∈ ]0, +∞[ and, for every k ∈ {1, …, q}, r_k lies in a real Hilbert space G_k, B_k : G_k → 2^{G_k} is maximally monotone, D_k : G_k → 2^{G_k} is maximally monotone and strongly monotone, B_k □ D_k = (B_k^{−1} + D_k^{−1})^{−1}, and L_k ∈ B(H, G_k). In such instances the forward-backward algorithm actually yields a primal-dual method which produces a sequence converging to the primal solution (see [20, Section 5] for details). Now suppose that, in addition, C : H → H is cocoercive. As in [17], consider the primal problem

find x ∈ H such that z ∈ Ax + ∑_{k=1}^q L*_k ( (B_k □ D_k)(L_k x − r_k) ) + Cx,  (5.2)

together with the dual problem

find v_1 ∈ G_1, …, v_q ∈ G_q such that (∀k ∈ {1, …, q}) −r_k ∈ −L*_k (A + C)^{−1} ( z − ∑_{l=1}^q L*_l v_l ) + B_k^{−1} v_k + D_k^{−1} v_k.  (5.3)

Using renorming techniques in the primal-dual space, going back to [34] in the context of finite-dimensional minimization problems, the primal-dual problem (5.2)–(5.3) can be reduced to an instance of Problem 1.1 [20, 53] (see also [23]) and therefore solved via Theorem 4.1. Next, we explicitly illustrate an application of this approach in the special case when (5.2)–(5.3) is a minimization problem.

5.1 A stochastic primal-dual minimization method: We denote by Γ_0(H) the class of proper lower semicontinuous convex functions. The Moreau subdifferential of f ∈ Γ_0(H) is the maximally monotone operator ∂f : H → 2^H : x ↦ { u ∈ H | (∀y ∈ H) ⟨y − x | u⟩ + f(x) ≤ f(y) }. (5.4) The inf-convolution of f : H → ]−∞, +∞] and h : H → ]−∞, +∞] is defined as f □ h : H → [−∞, +∞] : x ↦ inf_{y∈H} ( f(y) + h(x − y) ). The conjugate of a function f…
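In primal-dual implementations of this kind, resolvents of conjugate functions are typically evaluated through Moreau's decomposition, a standard convex-analysis identity (not specific to this paper): prox_{γf*}(x) = x − γ prox_{f/γ}(x/γ). The sketch below (my own construction) checks it numerically for f = ‖·‖₁, whose conjugate f* is the indicator of the ℓ∞ unit ball:

```python
import numpy as np

# Numerical check of Moreau's decomposition,
#     prox_{gamma f*}(x) = x - gamma * prox_{f/gamma}(x / gamma),
# which lets primal-dual splitting methods evaluate resolvents of
# conjugates from the prox of the original function. Here f = ||.||_1,
# so f* is the indicator of the l_inf unit ball and prox_{gamma f*} is
# the projection onto that ball (independent of gamma).
rng = np.random.default_rng(3)

def prox_l1(z, t):
    """prox of t*||.||_1: soft-thresholding with threshold t."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

x = 3.0 * rng.standard_normal(6)
gamma = 0.7

via_moreau = x - gamma * prox_l1(x / gamma, 1.0 / gamma)
direct = np.clip(x, -1.0, 1.0)            # projection onto [-1,1]^6
print(np.max(np.abs(via_moreau - direct)))  # ~0: the two formulas agree
```

The same identity is what allows the dual resolvents involving B_k^{−1} in (5.3) to be computed from the prox of the underlying primal functions.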

60 | A variational formulation for frame-based inverse problems
- Chaux, Combettes, et al.
(Show Context)
Citation Context ...orm ‖ · ‖, and Borel σ-algebra B. A large array of problems arising in Hilbertian nonlinear analysis are captured by the following simple formulation. Problem 1.1 Let A : H → 2H be a set-valued maximally monotone operator, let ϑ ∈ ]0,+∞[, and let B : H → H be a ϑ-cocoercive operator, i.e., (∀x ∈ H)(∀y ∈ H) 〈x− y |Bx− By〉 > ϑ‖Bx− By‖2, (1.1) such that F = { z ∈ H ∣∣ 0 ∈ Az+ Bz } 6= ∅. (1.2) The problem is to find a point in F. Instances of Problem 1.1 are found in areas such as evolution inclusions [2], optimization [4, 38, 51], Nash equilibria [7], image recovery [8, 10, 15], inverse problems [9, 13], signal processing [21], statistics [25], machine learning [26], variational inequalities [31, 52], mechanics [40, 41], and structure design [50]. For instance, an important specialization of Problem 1.1 in the context of convex optimization is the following [4, Section 27.3]. Problem 1.2 Let f : H → ]−∞,+∞] be a proper lower semicontinuous convex function, let ϑ ∈ ]0,+∞[, and let g : H → R be a differentiable convex function such that ∇g is ϑ−1-Lipschitz continuous on H. The problem is to minimize x∈H f(x) + g(x), (1.3) under the assumption that F = Argmin(f + g) 6= ∅. A standard method to s... |

58 | A splitting algorithm for dual monotone inclusions involving cocoercive operators
- Vu
- 2013

56 | A primal-dual splitting method for convex optimization involving Lipschitzian, proximable and linear composite terms
- Condat
- 2013

47 | Non-asymptotic analysis of stochastic approximation algorithms for machine learning
- Bach, Moulines

31 | A parallel splitting method for coupled monotone inclusions,
- Attouch, Briceno-Arias, et al.
- 2010

31 | Convergence analysis of primal-dual algorithms for a saddle-point problem: from contraction perspective
- He, Yuan
- 2012

30 | A proximal stochastic gradient method with progressive variance reduction,
- Xiao, Zhang
- 2014

16 | Proximal algorithms for multicomponent image recovery problems,
- Briceno-Arias, Combettes, et al.
- 2011

16 | Dualization of signal recovery problems,
- Combettes, Dung, et al.
- 2010

15 | Variable metric forward-backward splitting with applications to monotone inclusions in duality,
- Combettes, Vu
- 2014

12 | A wavelet-based regularized reconstruction algorithm for SENSE parallel MRI with applications to neuroimaging
- Chaari, Pesquet, et al.
- 2011

11 | Topics in Finite Element Solution of Elliptic Problems (Lectures on
- Mercier
- 1979

10 | Monotone operator methods for Nash equilibria in non-potential games
- Briceño-Arias, Combettes
- 2013

10 | Methods for digital restoration of signals degraded by a stochastic impulse response
- Combettes, Trussell
- 1989
Citation Context ...d every n ∈ N, set sk,n = ∇j∗k(vk,n) and suppose that (yn)n∈N is almost surely bounded. This assumption is satisfied, in particular, if dom f and (bn)n∈N are bounded. In addition, let (∀n ∈ N) Xn = σ(x0, v0, (Kn′, zn′)0≤n′<mn, (bn′, cn′)1≤n′<n), (5.27) where (mn)n∈N is a strictly increasing sequence in N such that mn = O(n^(1+δ)) with δ ∈ ]0,+∞[, (Kn)n∈N is a sequence of independent and identically distributed (i.i.d.) random matrices of R^(M×N), and (zn)n∈N is a sequence of i.i.d. random vectors of R^M. For example, in signal recovery, (Kn)n∈N may model stochastic degradation operators [19], while (zn)n∈N are observations related to an unknown signal that we want to estimate. The variables (Kn, zn)n∈N are assumed to be independent of (bn, cn)n∈N and such that E‖K0‖⁴ < +∞ and E‖z0‖⁴ < +∞. Set (∀x ∈ H) h(x) = (1/2) E‖K0x − z0‖², (5.28) and, for every n ∈ N, let un = (1/mn+1) ∑_{n′=0}^{mn+1−1} K⊤n′(Kn′xn − zn′) (5.29) be an empirical estimate of ∇h(xn). We assume that λn = O(n^(−κ)), where κ ∈ ]1−δ, 1] ∩ [0, 1]. We have (∀n ∈ N) E(un |Xn) − ∇h(xn) = (1/mn+1)(Q0,mn xn − r0,mn), (5.30) where, for every (n1, n2) ∈ N² such that n1 < n2, Qn1,n2 = ∑_{n′=n1}^{n2−1} (K⊤n′Kn′ − E(K⊤0 K0)) and rn1,n2 = ∑_{n′=n1}^{n2−1}... |
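The empirical gradient estimate (5.29) quoted above can be sketched numerically. This toy version (our simplification, not the paper's setup) takes the scalar case M = N = 1: for h(x) = ½E|Kx − z|² we have ∇h(x) = E(K(Kx − z)), approximated by a minibatch average over i.i.d. samples (K, z).

```python
import random

# Minibatch estimate of grad h(x) = E(K*(K*x - z)) for h(x) = 0.5*E|K*x - z|^2,
# mirroring (5.29) in the scalar case M = N = 1 (an assumed toy setup).

def empirical_gradient(x, batch):
    """u_n = (1/m) * sum over the minibatch of K*(K*x - z)."""
    return sum(k * (k * x - z) for k, z in batch) / len(batch)

random.seed(0)
# Toy model: K ~ N(1, 0.1) and z = 2*K + small noise, so h is minimized near 2
# and grad h(x) = E(K^2)*(x - 2) = 1.01*(x - 2).
batch = []
for _ in range(10000):
    k = random.gauss(1.0, 0.1)
    batch.append((k, 2.0 * k + random.gauss(0.0, 0.01)))

u = empirical_gradient(3.0, batch)
print(u)  # close to the true gradient 1.01 at x = 3
```

As in (5.30), the gap between this estimate and the true gradient is controlled by the minibatch size, which the paper lets grow like mn = O(n^(1+δ)).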

10 | Which fixed point does the iteration method select?
- Lemaire
- 1997

7 | On stochastic proximal gradient algorithms
- Atchade, Fort, et al.
- 2014
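The stochastic iteration (1.5) studied in this line of work can be illustrated in a toy scalar case. The assumptions here are ours, chosen for transparency: A = 0 (so JγA = Id), B = ∇g with g(x) = ½(x − 5)², un a noisy evaluation of Bxn, and an = 0, which reduces (1.5) to a stochastic gradient method.

```python
import random

# Toy instance of the stochastic iteration (1.5),
#   x_{n+1} = x_n + lam_n*(J_{gamma_n A}(x_n - gamma_n*u_n) + a_n - x_n),
# under assumed choices: A = 0 (J = Id), g(x) = 0.5*(x - 5)^2, a_n = 0,
# and u_n a noisy stochastic approximation of grad g(x_n).

random.seed(1)
x = 0.0
for n in range(1, 5001):
    u = (x - 5.0) + random.gauss(0.0, 1.0)  # u_n: noisy evaluation of Bx_n
    lam = n ** -0.6                         # relaxation lam_n (decaying)
    gamma = 1.0                             # gamma_n fixed: it need not vanish
    x = x + lam * ((x - gamma * u) - x)     # iteration (1.5) with J = Id, a_n = 0
print(x)  # near the zero of A + B, i.e. 5
```

Note that, consistent with the paper's emphasis, the proximal parameter γn stays bounded away from 0 here; the averaging effect comes from the relaxation sequence (λn)n∈N.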

6 | A forward-backward view of some primal-dual optimization methods in image recovery
- Combettes, Condat, et al.
- 2014
Citation Context ...βn) ≤ αn‖(x, v)‖V + βn, (5.25) where αn = √2 ‖V‖1/2 (1 + 2‖U‖1/2‖L‖) max{1, ‖WL∗‖} ‖V−1‖1/2 αn and βn = ‖V‖1/2 (1 + 2‖U‖1/2‖L‖) βn. (5.26) Thus, ∑n∈N √λn αn < +∞ and ∑n∈N λn βn < +∞. Finally, (5.16) and (g) guarantee that supn∈N (1 + τn)γn < 2ϑ. All the assumptions of Proposition 4.4 are therefore satisfied for algorithm (5.17), which concludes the proof. Remark 5.4 (i) Algorithm 5.10 can be viewed as a stochastic version of the primal-dual algorithm investigated in [20, Example 6.4] when the metric is fixed in the latter. Particular cases of such fixed-metric primal-dual algorithms can be found in [12, 16, 30, 34, 35]. (ii) The same type of primal-dual algorithm is investigated in [5, 43] in a different context, since in those papers the stochastic nature of the algorithms stems from the random activation of blocks of variables. 5.2 Example We illustrate an implementation of Algorithm 5.2 in a simple scenario with H = RN by constructing an example in which the gradient approximation conditions are fulfilled. For every k ∈ {1, . . . , q} and every n ∈ N, set sk,n = ∇j∗k(vk,n) and suppose that (yn)n∈N is almost surely bounded. This assumption is satisfied, in particular, if dom f and (bn)n∈N are bounded. I... |

5 | A stochastic coordinate descent primal-dual algorithm and applications to large-scale composite optimization
- Bianchi, Hachem, et al.
- 2014

5 | Stochastic quasi-Fejér block-coordinate fixed point iterations with random sweeping, arXiv:1404.7536
- Combettes, Pesquet
- 2014
Citation Context ...lems; see for instance [1, 26, 45, 48, 55]. The objective of the present paper is to provide an analysis of the stochastic forward-backward method in the context of Algorithm 1.3. Almost sure convergence of the iterates (xn)n∈N to a solution to Problem 1.1 will be established under general conditions on the sequences (un)n∈N, (an)n∈N, (γn)n∈N, and (λn)n∈N. In particular, a feature of our analysis is that it allows for relaxation parameters and does not require that the proximal parameter sequence (γn)n∈N be vanishing. Our proofs are based on properties of stochastic quasi-Fejér iterations [18], for which we provide a novel convergence result. The organization of the paper is as follows. The notation is introduced in Section 2. Section 3 provides an asymptotic principle which will be used in Section 4 to present the main result on the weak and strong convergence of the iterates of Algorithm 1.3. Finally, Section 5 deals with applications and proposes a stochastic primal-dual method. 2 Notation Id denotes the identity operator on H and ⇀ and → denote, respectively, weak and strong convergence. The sets of weak and strong sequential cluster points of a sequence (xn)n∈N in H are denote... |

5 | On the method of generalized stochastic gradients and quasi-Fejér sequences, Cybernetics 5
- Ermol'ev
- 1969

5 | Stability of the iteration method for nonexpansive mappings
- Lemaire
- 1996

5 | Convergence of stochastic proximal gradient algorithm
- Rosasco, Villa, et al.
- 2014

3 | On the convergence rate of a forward-backward type primal-dual splitting algorithm for convex optimization problems
- Bot, Csetnek
- 2015

3 | A consistent algorithm to solve Lasso, elastic-net and Tikhonov regularization
- Vito, Umanita, et al.
- 2011

3 | Vecteurs, Fonctions et Distributions Aléatoires dans les Espaces de Hilbert, Hermès
- Fortet
- 1995
Citation Context ...ugh this will not always be expressly mentioned. Let E be a sub-sigma-algebra of F, let x ∈ L1(Ω,F,P;H), and let y ∈ L1(Ω,E,P;H). Then y is the conditional expectation of x with respect to E if (∀E ∈ E) ∫E x dP = ∫E y dP; in this case we write y = E(x |E). We have (∀x ∈ L1(Ω,F,P;H)) ‖E(x |E)‖ ≤ E(‖x‖ |E). (2.4) In addition, L2(Ω,F,P;H) is a Hilbert space and, for every x ∈ L2(Ω,F,P;H), ‖E(x |E)‖² ≤ E(‖x‖² |E) and (∀u ∈ H) E(〈x |u〉 |E) = 〈E(x |E) |u〉. (2.5) Geometrically, if x ∈ L2(Ω,F,P;H), E(x |E) is the projection of x onto L2(Ω,E,P;H). For background on probability in Hilbert spaces, see [32, 37]. 3 An asymptotic principle In this section, we establish an asymptotic principle which will lay the foundation for the convergence analysis of our stochastic forward-backward algorithm. First, we need the following result. Proposition 3.1 Let F be a nonempty closed subset of H, let φ : [0,+∞[ → [0,+∞[ be a strictly increasing function such that limt→+∞ φ(t) = +∞, let (xn)n∈N be a sequence of H-valued random variables, and let (Xn)n∈N be a sequence of sub-sigma-algebras of F such that (∀n ∈ N) σ(x0, . . . , xn) ⊂ Xn ⊂ Xn+1. (3.1) Suppose that, for every z ∈ F, there exist (ϑn(z))n∈N ∈ ℓ+(X )... |
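The geometric fact quoted here, that E(x |E) is the L2 projection onto the E-measurable variables, can be checked on a finite probability space. This toy setup is ours: Ω = {0,1,2,3} with uniform P and E generated by the partition {{0,1},{2,3}}.

```python
# Conditional expectation as an L2 projection, on an assumed toy space:
# Omega = {0,1,2,3}, uniform P, E generated by the partition {{0,1},{2,3}}.

p = [0.25] * 4
x = [1.0, 3.0, -2.0, 6.0]        # an F-measurable random variable
cells = [[0, 1], [2, 3]]         # atoms of the sub-sigma-algebra E

def cond_exp(values):
    """E(values | E): on each atom, the P-weighted average."""
    out = [0.0] * 4
    for cell in cells:
        mass = sum(p[w] for w in cell)
        avg = sum(p[w] * values[w] for w in cell) / mass
        for w in cell:
            out[w] = avg
    return out

def inner(u, v):
    """L2 inner product E(u*v)."""
    return sum(pi * ui * vi for pi, ui, vi in zip(p, u, v))

cond = cond_exp(x)
# Projection property: x - E(x|E) is orthogonal to every E-measurable y.
for y in ([1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]):
    residual = [xi - ci for xi, ci in zip(x, cond)]
    assert abs(inner(residual, y)) < 1e-12
# Inequality (2.4) pointwise: |E(x|E)| <= E(|x| | E).
assert all(abs(c) <= ca + 1e-12 for c, ca in zip(cond, cond_exp([abs(v) for v in x])))
print(cond)
```

Both atoms average to 2 here, so cond is the constant variable 2, and the asserted orthogonality is exactly the characterization (∀E ∈ E) ∫E x dP = ∫E y dP.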

2 | Iterative Optimization in Inverse Problems
- Byrne
- 2014

2 | On the convergence of the iterates of the "fast iterative shrinkage/thresholding algorithm"
- Chambolle, Dossal
- 2015

2 | Compositions and convex combinations of averaged nonexpansive operators
- Combettes, Yamada
- 2015

2 | A stochastic inertial forward-backward splitting algorithm for multivariate monotone inclusions
- Rosasco, Villa, et al.
- 2015
Citation Context ...st sure convergence properties are established under the following assumptions: (γn)n∈N is a decreasing sequence in ]0, ϑ] such that ∑n∈N γn = +∞, λn ≡ 1, an ≡ 0, and the sequence (xn)n∈N is bounded a priori. (iii) In [46], Problem 1.1 is addressed using Algorithm 1.3. The authors make the additional assumptions that (∀n ∈ N) E(un |Xn) = Bxn and an = 0. (4.33) Furthermore, they employ vanishing proximal parameters (γn)n∈N. Almost sure convergence properties of the sequence (xn)n∈N are then established under the additional assumption that B is uniformly monotone. (iv) The recently posted paper [47] employs tools from [18] to investigate the convergence of a variant of (1.5) in which no errors (an)n∈N are allowed in the implementation of the resolvents, and an inertial term is added, namely, (∀n ∈ N) xn+1 = xn + λn(JγnA(xn + ρn(xn − xn−1) − γnun) − xn), where ρn ∈ [0, 1[. (4.34) In the case when ρn ≡ 0, assertions (iii) and (iv)(h) of Theorem 4.1 are obtained under the additional hypotheses that inf λn > 0 and that the stochastic approximations which can be performed are constrained by (4.33). Next, we provide a version of Theorem 3.2 in which a variant of (1.5) featuring approximations (A... |
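The inertial variant (4.34) quoted above differs from (1.5) only by the momentum term ρn(xn − xn−1). A scalar sketch under assumed toy choices (ours, mirroring the earlier simplification A = 0, g(x) = ½(x − 5)², noisy gradient un):

```python
import random

# Toy instance of the inertial iteration (4.34),
#   x_{n+1} = x_n + lam_n*(J_{gamma_n A}(x_n + rho_n*(x_n - x_{n-1}) - gamma_n*u_n) - x_n),
# with assumed choices: A = 0 (J = Id), g(x) = 0.5*(x - 5)^2, fixed momentum
# rho_n = 0.3 in [0, 1[, and u_n a noisy gradient of g at x_n.

random.seed(2)
x_prev, x = 0.0, 0.0
for n in range(1, 5001):
    u = (x - 5.0) + random.gauss(0.0, 1.0)       # stochastic gradient at x_n
    lam, gamma, rho = n ** -0.6, 1.0, 0.3        # lam_n, gamma_n, rho_n
    y = x + rho * (x - x_prev) - gamma * u       # inertial forward step
    x_prev, x = x, x + lam * (y - x)             # relaxed update with J = Id
print(x)  # near the solution 5
```

With ρn ≡ 0 this collapses to the relaxed stochastic iteration (1.5), matching the comparison made in the quoted remark.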

1 | Some methods of stochastic optimization
- Ermoliev, Nekrylova
- 1966

1 | The method of stochastic gradients and its application
- Ermoliev, Nekrylova
- 1967

1 | The rate of convergence of the method of generalized stochastic gradients
- Guseva
- 1971

1 | Solution of certain variational problems and control problems by the stochastic gradient method
- Nekrylova
- 1974

1 | A stochastic forward-backward splitting method for solving monotone inclusions in Hilbert spaces
- Rosasco, Villa, et al.
- 2014
Citation Context ...les are provided in [2, Proposition 2.4]. Remark 4.3 To place our analysis in perspective, we comment on results of the literature that seem the most pertinently related to Theorem 4.1. (i) In the deterministic case, Theorem 4.1(iii) can be found in [14, Corollary 6.5]. (ii) In [1, Corollary 8], Problem 1.2 is considered in the special case when H = RN and solved via (1.5). Almost sure convergence properties are established under the following assumptions: (γn)n∈N is a decreasing sequence in ]0, ϑ] such that ∑n∈N γn = +∞, λn ≡ 1, an ≡ 0, and the sequence (xn)n∈N is bounded a priori. (iii) In [46], Problem 1.1 is addressed using Algorithm 1.3. The authors make the additional assumptions that (∀n ∈ N) E(un |Xn) = Bxn and an = 0. (4.33) Furthermore, they employ vanishing proximal parameters (γn)n∈N. Almost sure convergence properties of the sequence (xn)n∈N are then established under the additional assumption that B is uniformly monotone. (iv) The recently posted paper [47] employs tools from [18] to investigate the convergence of a variant of (1.5) in which no errors (an)n∈N are allowed in the implementation of the resolvents, and an inertial term is added, namely, (∀n ∈ N) xn+1 = xn + λ... |

1 | A closer look at consistent operator splitting and its extensions for topology optimization
- Talischi, Paulino
- 2015