73 citations found.
B. T. Polyak. Introduction to Optimization. Optimization Software Inc., New York, 1987.

This paper is cited in the following contexts:

Multicategory Proximal Support Vector Machine Classifiers - Fung, Mangasarian

....by $I$. For numerical function $f(x)$ of $x \in R^n$, the gradient $\nabla f(x)$ denotes the $n \times 1$ vector of first partial derivatives of $f$, while $\partial^2 f(x)$ denotes the generalized Hessian $n \times n$ matrix of second partial derivatives of $f$ if they exist, else each row of the generalized Hessian matrix is a subgradient [20, 21] of the corresponding row element of the gradient vector $\nabla f(x)$ [16, 14]. 2 The Linear Multicategory Proximal Support Vector Machine (MPSVM). To motivate our MPSVM we begin with a brief description of the 2-category proximal support vector machine formulation [12]. We consider the problem, depicted ....

.... by: [...] (21) [...] where $E$ is the diagonal matrix: [...] (22) and $(\cdot)_*$ is the step function defined in the Introduction and which is taken here as a specific subgradient [21, 20] of the plus function $(\cdot)_+$ and is used to generate the generalized Hessian matrix in the same manner as in [16, 14]. The Newton refinement procedure can then be summarized as follows. Algorithm 4.1 Newton Refinement. Given a solution [...] to the PSVM 2-class problem (6), refine it as follows: (i) ....

B. T. Polyak. Introduction to Optimization. Optimization Software, Inc., Publications Division, New York, 1987.


Incremental Subgradient Methods For Nondifferentiable.. - Nedic, Bertsekas (2001)

.... $x_k - \alpha_k \sum_{i=1}^{m} d_{i,k}]$, (1.3) where $d_{i,k}$ is a subgradient of $f_i$ at $x_k$, $\alpha_k$ is a positive stepsize, and $P_X$ denotes projection on the set $X$. There is an extensive theory for this method (see e.g. the textbooks by Dem'yanov and Vasil'ev [DeV85], Shor [Sho85], Minoux [Min86], Polyak [Pol87], Hiriart-Urruty and Lemaréchal [HiL93], Bertsekas [Ber99]). In many important applications, the set $X$ is simple enough so that the projection can be easily implemented. In particular, for the special case of the dual problem (1.1)-(1.2), the set $X$ is the positive orthant and projecting on $X$ is ....

.... follows that if a diminishing stepsize rule ($\alpha_k \to 0$) is used and some additional conditions hold, such as $\sum_{k=0}^{\infty} \alpha_k = \infty$, some of the convergence properties of the incremental method can be derived from known results on $\epsilon$-subgradient methods (see e.g. Dem'yanov and Vasil'ev [DeV85], Polyak [Pol87], p. 144, Correa and Lemaréchal [CoL93], Hiriart-Urruty and Lemaréchal [HiL93], Bertsekas [Ber99]). However, the connection with $\epsilon$-subgradient methods is not helpful for the convergence analysis under the other stepsize rules that we consider (constant and dynamic) because for these rules $\alpha_k$ ....

[Article contains additional citation context not shown here]

Polyak, B. T., Introduction to Optimization, Optimization Software Inc., N.Y., 1987.


Dual Computational Methods - Date Ju Ly

....8.5 NOTES AND SOURCES. Subgradient methods were first introduced in the former Soviet Union during the middle 60s by Shor; the works of Ermoliev and Poljak were also particularly influential. Descriptions of these works are found in many sources, including [Sho85], [Erm83], and [Pol87]. An extensive bibliography for the early period of the subject is given in the edited volume by Balinski and Wolfe [BaW75]. Incremental gradient methods for differentiable cost functions have a long history and find extensive application in the training of neural networks, among other areas. ....

Polyak, B. T., 1987. Introduction to Optimization, Optimization Software Inc., N.Y.


Mathematical Programming Approaches To Machine Learning And Data.. - Bradley (1998)   (1 citation)

....on $R^n$, the supergradient $\partial f(x)$ of $f$ at $x$ is a vector in $R^n$ satisfying $f(y) - f(x) \le \partial f(x)(y - x)$ (2) for any $y \in R^n$. The set $D(f(x))$ of supergradients of $f$ at the point $x$ is nonempty, convex, compact and reduces to the ordinary gradient $\nabla f(x)$ when $f$ is differentiable at $x$ [132, 137]. Chapter 2: Classification via Mathematical Programming. We focus on optimization approaches addressing the supervised classification task (Section 1.1). The task is that of assigning a point $x = [x_1\ x_2\ \ldots\ x_n]'$ in $n$-dimensional feature space into one of two disjoint point sets A or ....

B. T. Polyak. Introduction to Optimization. Optimization Software, Inc., Publications Division, New York, 1987.


Toward The Optimal Preconditioned Eigensolver: Locally Optimal.. - Knyazev (2000)   (8 citations)

....of the asymptotic average convergence factor: [...] (3.4) Finally, we would like to remind the reader of one of the long forgotten versions of the preconditioned conjugate gradient method based on optimization of a three-term recurrence, e.g. [31]: [...] (3.5) with both scalar parameters [...] computed by minimizing the seminorm (3.1) of $y^{(i+1)}$. This ....

....approach to develop an adequate convergence theory may be based on comparison of method (4.4) with the following stationary three-term recurrence: [...] where [...] are fixed scalar parameters, sometimes called the heavy ball method in optimization [31]. However, even for this simpler method, no accurate convergence theory in terms of the Rayleigh quotient apparently exists yet, cf. [23]. In the present paper, we do not prove any new theoretical convergence rate results for Algorithm 4.1, but we suggest a different kind of remedy: numerical ....

B. T. Polyak, Introduction to optimization, Optimization Software Inc. Publications Division, New York, 1987. Translated from the Russian, With a foreword by Dimitri P. Bertsekas.


Nonmonotone And Perturbed Optimization - Solodov (1995)   (2 citations)

....1.2.1 A continuous function $\sigma$ : [...] such that $\sigma(0) = 0$, $\sigma(t) > 0$ for $t > 0$, and such that $t_i \ge 0$ and $\{\sigma(t_i)\} \to 0$ imply that $\{t_i\} \to 0$, is said to be a forcing function. Some typical examples of forcing functions are $ct$, $ct^2$ for some $c > 0$. We now state a classical lemma ([51], p. 6) that will be used later, as well as another lemma (a slight modification of [51], p. 44) used in the proof of Theorem 1.2.1. Lemma 1.2.1 Let $f(\cdot) \in C^{1}_{L}(\Re^n)$, then $|f(y) - f(x) - \langle \nabla f(x), y - x \rangle| \le \frac{L}{2}\|y - x\|^2$ $\forall x, y \in \Re^n$. Lemma 1.2.2 Let $\{a_i\}$ and ....

....0, and such that $t_i \ge 0$ and $\{\sigma(t_i)\} \to 0$ imply that $\{t_i\} \to 0$, is said to be a forcing function. Some typical examples of forcing functions are $ct$, $ct^2$ for some $c > 0$. We now state a classical lemma ([51], p. 6) that will be used later, as well as another lemma (a slight modification of [51], p. 44) used in the proof of Theorem 1.2.1. Lemma 1.2.1 Let $f(\cdot) \in C^{1}_{L}(\Re^n)$, then $|f(y) - f(x) - \langle \nabla f(x), y - x \rangle| \le \frac{L}{2}\|y - x\|^2$ $\forall x, y \in \Re^n$. Lemma 1.2.2 Let $\{a_i\}$ and $\{\epsilon_i\}$ be two sequences of real numbers such that $\epsilon_i \ge 0$, $\sum_{i=0}^{\infty} \epsilon_i$ ....

[Article contains additional citation context not shown here]

B.T. Polyak. Introduction to Optimization. Optimization Software, Inc., Publications Division, New York, 1987.


Proximal Interior Point Approach for Solving Convex.. - Kaplan, Tichatschke (1998)

....1, or with i 0 = 2, s 0 = 0 if s(1) 1, ensures the validity of the statements (i) and (ii). It is easy to see that, with any u 2 U S , inequality (46) is true for each i and inequality (43) is true for each (i, s) 0 s s(i) 1. From (45) and (58), by means of Lemma 2.2.2 in [8], we can conclude that the sequence ku i,0 u k converges. But, due to (43) and (44), ku i,0 u k ku i,s u k ku i 1,0 u k p # i s 2m i,s(i) # i r i # i # i holds true for 0 s s(i) 1. Hence, the whole sequence ku i,s u k converges ....

....u k is guaranteed if the relations (62) 63) and 1 X i=1 s h i # i 1, 1 X i=1 r m i,s(i) # i r i 1, 1 X i=1 # i # i 1 (64) are jointly regarded. Conclusion (iii) follows immediately from the inequalities (43) 46) (see the proof of Theorem 1) and Lemma 2.2.2 in [8]. Therefore, in case K is a bounded set, we obtain Theorem 2. Let K S 1 , and the Assumptions 1(i) iv) vii) be fulfilled for Problem (1) with # 2# 1 . Suppose that the controlling parameters of the PIP method satisfy the conditions (10) 13) 62) 63) ....

B.T. Polyak. Introduction to Optimization. Optimization Software, Inc. Publ. Division, New York, 1987.


Direct Search Generalized Simplex Algorithm for.. - Hassan Shekarforoush.. (1995)

....employ this information are usually referred to as zero order methods and the simplex algorithm indeed belongs to this class of methods. A major drawback of first or second order methods is the fact that they are usually very effective only under conditions close enough to ideal (see for example [8]). This is mainly due to the fact that derivative operators usually amplify noise contents and hence increase the sensitivity of these algorithms. Zero order methods do not suffer from this, at the expense of increase of computational cost. However, unfortunately, most zero order methods have not ....

....For proof see [2]. Definition 4 Let $f(x)$ be a convex function on $\mathbb{R}^n$. Then any vector $\partial f(x) \in \mathbb{R}^n$ for which $f(x + y) \ge f(x) + \langle \partial f(x), y \rangle$, $\forall x, y \in \mathbb{R}^n$ (35) is called a subgradient of $f$ at the point $x$. $\langle \partial f(x), y \rangle$ denotes the scalar product of the two vectors. From the continuity of $f$ [8], we know that there exists at least one such vector at any point. However, this vector is not necessarily unique everywhere. Remark 2 The convexity of a function in $\mathbb{R}^n$ is equivalent to the monotonicity of the subgradient [8]: $\langle \partial f(x) - \partial f(y), x - y \rangle \ge 0$ (36) For a convex function ....

[Article contains additional citation context not shown here]

B. T. Polyak. Introduction To Optimization. Optimization Software, Inc., 1987.


Semi-Supervised Support Vector Machines for Unlabeled Data.. - Fung, Mangasarian (2001)   (4 citations)

.... [...] = 0. (2.3) For a concave function $f : R^n \to R$ the supergradient $\partial f(x)$ of $f$ at $x$ is a vector in $R^n$ satisfying: $f(y) - f(x) \le \partial f(x)(y - x)$ for all $y \in R^n$. The supergradient reduces to the ordinary gradient $\nabla f(x)$ when $f$ is differentiable at $x$ [12, 13]. In our case $e' \min\{r, s\} : R^{2p} \to R$ is a nondifferentiable concave function and its supergradient is given by: [...] ....

B. T. Polyak. Introduction to Optimization. Optimization Software, Inc., Publications Division, New York, 1987.


Optimization Methods In Massive Datasets - Bradley, Mangasarian, al.   (5 citations)

....strictly lower triangular part of the symmetric matrix $M$, and $E \in R^{m \times m}$ is the positive diagonal of $M$, then a necessary and sufficient optimality condition for (1.45) for positive semidefinite $M$ is the following gradient projection optimality condition [53, 31]: $u = (u - \omega E^{-1}(Mu - e))_{\#}$, $\omega > 0$, (1.47) where $(\cdot)_{\#}$ denotes the 2-norm projection on the feasible region $S$ of (1.45), that is: $((u)_{\#})_i = 0$ if $u_i \le 0$, $u_i$ if $0 < u_i < \nu$, $\nu$ if $u_i \ge \nu$, for $i = 1, \ldots, m$. (1.48) Our SOR method, which is a matrix splitting method that ....

B. T. Polyak. Introduction to Optimization. Optimization Software, Inc., Publications Division, New York, 1987.


On the Convergence of Conditional Epsilon-Subgradient .. - Larsson, Patriksson, .. (2000)

...., Michael Patriksson and Ann-Brith Strömberg. Revised April 13, 2000. Abstract. The paper provides two contributions. First, we present new convergence results for conditional subgradient algorithms for general convex programs. The results obtained here extend the classical ones by Polyak [Pol67, Pol69, Pol87] as well as the recent ones in [CoL93, LPS96, AIS98] to a broader framework. Secondly, we establish the application of this technique to solve nonstrictly convex-concave saddle point problems, such as primal-dual formulations of linear programs. Contrary to several previous solution algorithms ....

B. T. Polyak, Introduction to Optimization, Optimization Software, New York, NY, 1987.


Policy Gradient in Continuous Time - Munos (2006)

No context found.

B. T. Polyak. Introduction to Optimization. Optimization Software Inc., New York, 1987.


Least-Squares Covariance Matrix Adjustment - Stephen Boyd Lin

No context found.

B. T. Polyak. Introduction to Optimization. Optimization Software, 1987.


Scalable Data Parallel Algorithms for Texture.. - Bader.. (1993)   (5 citations)

No context found.

B. T. Polyak. Introduction to Optimization. Optimization Software, Inc., New York, 1987.


A Bayesian Non-Linear Approach for Electrical Impedance.. - Martin, Idier (1997)   (3 citations)

No context found.

B. Polyak, Introduction to optimization, Optimization Software, Inc., 1987.


Rescaling and Stepsize Selection in Proximal Methods using.. - Silva, al. (2001)

No context found.

B. Polyak. Introduction to Optimization. Optimization Software Inc., New York, 1987.


Distributed Asynchronous Incremental Subgradient Methods - Nedic, Bertsekas, Borkar (2000)   (1 citation)

No context found.

B. T. Polyak, Introduction to Optimization (Optimization Software Inc., New York, 1987).


Alternating Directions Methods for the Parallel.. - Spyridon..

No context found.

B.T. Polyak. Introduction to optimization. Optimization Software, Inc., Publications Division, 1987.


A Branch-and-Bound Approach for Solving a Class of.. - Levitin, Tichatschke (1998)   (1 citation)

No context found.

Polyak B.T., Introduction to Optimization, Optimization Software, Inc. Publ. Division, New York, 1987.


An Infeasible Interior Proximal Method for Convex.. - Yamashita.. (2000)

No context found.

B.T. Polyak, Introduction to Optimization, Optimization Software Inc., New York, 1987.


Some Methods Based on the D-Gap Function for Solving Monotone .. - Solodov, Tseng (2000)

No context found.

B.T. Polyak, Introduction to Optimization, Optimization Software, Inc., Publications Division, New York, New York, 1987.


A Practical General Approximation Criterion for Methods of.. - Eckstein (2000)

No context found.

B. Polyak. Introduction to Optimization (Optimization Software Inc., New York, 1987).


CiteSeer.IST - Copyright Penn State and NEC