35 citations found.
O. L. Mangasarian, "Linear and Nonlinear Separation of Patterns by Linear Programming", Operations Research 13, 1965, pp. 444-452.

This paper is cited in the following contexts:

Multicategory Classification by Support Vector Machines - Bredensteiner, Bennett (1999)   (2 citations)

....and efficiency in evaluation are extremely important. In this paper, we combine two independent but related research directions developed for solving the two-class linear discrimination problem. The first is the linear programming (LP) methods stemming from the Multisurface Method of Mangasarian [12, 13]. This method and its later extension, the Robust Linear Programming (RLP) approach [6], have been used in a highly successful breast cancer diagnosis system [26]. The second direction is the quadratic programming (QP) methods based on Vapnik's Statistical Learning Theory [24, 25]. Statistical ....

....if A 1 w #e #e A 2 w (2) where e is a vector of ones of the appropriate dimension. If the two classes are linearly separable, there are infinitely many planes that separate the two classes. The goal is to choose the plane that will generalize best on future points. Both Mangasarian [12] and Vapnik and Chervonenkis [25] concluded that the best plane in the separable case is the one that minimizes the distance of the closest vector in each class to the separating plane. For the separable case the formulations of Mangasarian's Multisurface Method of Pattern Recognition [13] and ....

O. L. Mangasarian. Linear and nonlinear separation of patterns by linear programming. Operations Research, 13:444--452, 1965.


Linear Programming Boosting via Column Generation - Demiriz, Bennett, Shawe-Taylor (2002)   (2 citations)

....investigated in this paper and using an arcing approach in [19] worked well (see Section 8). Poor performance of hard margin versus soft margin classification methods has been noted in other contexts as well. In a computational study of the hard margin Multisurface Method (MSM) for classification [12] and the soft margin Robust Linear Programming (RLP) method [6] (both closely related LP precursors to Vapnik's Support Vector Machine), the soft margin RLP performed uniformly better than the hard margin MSM. In this section we will examine the critical difference between hard and soft margin ....

O. L. Mangasarian. Linear and nonlinear separation of patterns by linear programming. Operations Research, 13:444--452, 1965.


Cancer Diagnosis And Prognosis Via Linear-Programming-Based.. - Street (1994)   (5 citations)

....optimization approaches, in particular linear programming [22], have long been used in problems of pattern separation. Highleyman [37] first found that showing linear separability is equivalent to finding a nonnegative solution to a set of linear equalities. Both Charnes [20] and Mangasarian [52] developed linear programs which construct planes to separate linearly separable point sets. Mangasarian [52] also showed how to separate sets by a nonlinear surface using linear programming whenever the surface parameters appeared linearly, e.g. a quadratic or polynomial surface. However, these ....

....separation. Highleyman [37] first found that showing linear separability is equivalent to finding a nonnegative solution to a set of linear equalities. Both Charnes [20] and Mangasarian [52] developed linear programs which construct planes to separate linearly separable point sets. Mangasarian [52] also showed how to separate sets by a nonlinear surface using linear programming whenever the surface parameters appeared linearly, e.g. a quadratic or polynomial surface. However, these formulations were prone to fail on sets that were not separable by a surface linear in its parameters, and ....

O. L. Mangasarian. Linear and nonlinear separation of patterns by linear programming. Operations Research, 13:444--452, 1965.


The Learning Behavior of Single Neuron Classifiers on Linearly.. - Basu, Ho   (1 citation)

....inequalities constraining the location and orientation of the optimal separating hyperplane. With a properly defined objective function, a separating hyperplane can be obtained by solving a linear programming problem. Several alternative formulations have been proposed in the past ([3], [5], [10], [15], [17]) employing different objective functions. An early survey of these methods is given in [7]. Here we mention a few representative formulations. In a very simple formulation described in [15], the objective function is trivial, so that it is simply a test of linear separability by ....

Mangasarian, O.L., Linear and Nonlinear Separation of Patterns by Linear Programming, Operations Research, 13, 1965, 444-452.


Mathematical Programming Approaches To Machine Learning And Data.. - Bradley (1998)   (1 citation)

....function g has the following form: $g(x) = \begin{cases} 1 & \text{if } x \in A \\ 0 & \text{if } x \in B \end{cases}$ (1). Many different algorithms exist for constructing the approximation $\hat{g}$ of $g$, and the approximation may have many different functional forms. Examples include a separating plane based function [96, 97], the backpropagation algorithm for artificial neural networks (ANNs) [77, 111], decision tree construction algorithms utilizing various node decision criteria [29, 134, 5], spline methods for classification [165, 162] and probabilistic graphical dependency models [76, 32]. Evaluating an ....

.... of (36) is a solution to the following problem: $\min_{(w,\gamma,y,z) \in X} \frac{1}{2}\|w\|_2^2$, as well as of $\min_{(w,\gamma,y,z) \in X} \frac{1}{2}\|w\|_2$. Nonlinear separating surfaces, which are linear in their parameters, can also easily be handled by the formulations (7), (12), (13), (16), (34) and (35) [96]. If the data are mapped nonlinearly via Phi : R n R , a nonlinear separating surface in R n is easily computed as a linear separator in R . In practice, one usually solves (36) by way of its dual [98]. In this formulation, the data enter only as inner products which are computed in ....

O. L. Mangasarian. Linear and nonlinear separation of patterns by linear programming. Operations Research, 13:444--452, 1965.


Linear Programming Boosting via Column Generation - Demiriz, Bennett (2001)   (2 citations)

....paper and using an arcing approach in (Rätsch et al. 2000b) worked well (see Section 8). Poor performance of hard margin versus soft margin classification methods has been noted in other contexts as well. In a computational study of the hard margin Multisurface Method (MSM) for classification (Mangasarian, 1965) and the soft margin Robust Linear Programming (RLP) method (Bennett & Mangasarian, 1992), both closely related LP .... Figure 1. No noise Hard Margin LP solution for two confidence rated ....

Mangasarian, O. L. (1965). Linear and nonlinear separation of patterns by linear programming. Operations Research, 13, 444-452.


Linear Programming Boosting via Column Generation - Demiriz, Bennett, Shawe-Taylor (2000)   (2 citations)

....investigated in this paper and using an arcing approach in [21] worked well (see Section 8). Poor performance of hard margin versus soft margin classification methods has been noted in other contexts as well. In a computational study of the hard margin Multisurface Method (MSM) for classification [14] and the soft margin Robust Linear Programming (RLP) method [6] (both closely related LP precursors to Boser et al.'s Support Vector Machine [8, 10]), the soft margin RLP performed uniformly better than the hard margin MSM. In this section we will examine the critical difference between hard and ....

O. L. Mangasarian. Linear and nonlinear separation of patterns by linear programming. Operations Research, 13:444--452, 1965.


Kernel Methods: A Survey of Current Techniques - Campbell (2000)   (1 citation)

....i.e. relatively fewer datapoints are used. Furthermore, efficient simplex or column generation implementations exist for solving linear programming problems, so this is a practical alternative to conventional QP SVMs. This linear programming approach evolved independently of the QP approach to SVMs [17] and, as we will see, linear programming approaches to regression and novelty detection are also possible. 3 Novelty Detection. For many real world problems the task is not to classify but to detect novel or abnormal instances. Novelty or abnormality detection has potential applications in ....

O.L. Mangasarian. Linear and Nonlinear Separation of patterns by linear programming. Operations Research 13, p. 444-452, 1965.


Duality and Geometry in SVM Classifiers - Bennett, Bredensteiner (2000)   (15 citations)

....from the dual perspective along with a mathematically rigorous derivation of the ideas behind the geometry. We begin with an explanation of the geometry of SVM based on the idea of convex hulls. For the separable case, this geometric explanation has existed in various forms (Vapnik, 1996; Mangasarian, 1965; Keerthi et al. 1999; Bennett & Bredensteiner, in press). The new contribution is the adaptation of the convex hull argument for the inseparable case to the most commonly used 2-norm and 1-norm soft margin SVM. The primal form resulting from this argument can be regarded as an especially ....

....convex hull argument for the inseparable case to the most commonly used 2-norm and 1-norm soft margin SVM. The primal form resulting from this argument can be regarded as an especially elegant minor variant of the ν-SVM formulation (Schölkopf et al. 2000) or a soft margin form of the MSM method (Mangasarian, 1965). Related geometric ideas for the ν-SVM formulation were developed independently by Crisp and Burges (1999). The primary contributions of this paper are: A simple intuitive explanation of SVM based on (reduced) convex hulls that allows nonexperts to grasp geometrically the main concepts of ....

Mangasarian, O. L. (1965). Linear and nonlinear separation of patterns by linear programming. Operations Research, 13, 444--452.


Sparse Greedy Matrix Approximation for Machine Learning - Smola, Schölkopf (2000)   (38 citations)

.... (classification) [Schölkopf et al. 1995] or are close to their target values [Vapnik et al. 1997]. However, if the data is noisy the improvement can be negligible [Smola, 1998]. Another approach is to add (yet another) regularization term penalizing the 1-norm of the expansion coefficients [Mangasarian, 1965, Chen et al. 1999, Girosi, 1998]. However, this does not alleviate the problem that we have to compute (and invert) the matrix $K_{ij} := k(x_i, x_j)$ for $i, j \in [m]$ (where $[m] := \{1, \ldots, m\}$). Since the latter scales with $O(m^3)$ (except for special matrices), this approach is not suitable for ....

....be seen as a rank n matrix. Furthermore, rank n updates of the inverse of an $m \times m$ matrix are $O(nm^2)$. This will be the topic of further research. 6.2 Linear Programming Machines. A modification of the regularization term in the regularized risk functional leads to linear programming machines [Mangasarian, 1965]. While these have a sparsity term already included by penalizing the 1-norm of the expansion coefficients, it can be numerically very demanding to solve the optimization problem exactly. Again, by choosing a subset of basis functions to start with (which has approximately the same expressive power ....

O. L. Mangasarian. Linear and nonlinear separation of patterns by linear programming. Operations Research, 13:444-452, 1965.


Barrier Boosting - Rätsch, Warmuth, Mika, Onoda, Lemm.. (2000)

....of a function f is defined as the minimum margin over all N examples, i.e. $\min_{n=1,\ldots,N}$ Un (2). [Footnote 1: Note that we could use an arbitrary 0, e.g. ‖ ‖_1 ≠ 1, and then we would need to normalize the function f. Here, we use the 1-norm for the normalization.] A reasonable choice [29, 21, 1, 49] for a convex combination is to maximize the minimum margin of the examples, i.e. choose 2 J such that ( max 2 J (3). Roughly speaking, the larger the margin the better the bounds that can be proven for the generalization error (e.g. [49, 1]). Also SVMs are based on ....

.... j d 1) Both problems are dual to each other and thus the equality of the theorem follows from the fact that the primal and the dual objective have the same value. Since our hypothesis class is complementation closed, this value is always nonnegative. The Margin LP Problem was introduced in [29] and was first used for Boosting in [8, 21]. 2.2 Boosting and Relative Entropy Minimization. We will now use the Edge LP problem to make a connection to a class of Boosting algorithms that use a relative entropy in the objective function [22, 25, 9]. In the Totally Corrective Algorithm of [22] ....

O.L. Mangasarian. Linear and nonlinear separation of patterns by linear programming. Operations Research, 13:444--452, 1965.


Modeling Languages and Condor: Metacomputing for Optimization - Ferris, Munson (1998)

....By $A \in \mathbb{R}^{|A| \times F}$ and $B \in \mathbb{R}^{|B| \times F}$ we denote the matrices formed from all elements in A and B, where $|\cdot|$ is used to denote the cardinality of a set. The approach considered in this paper attempts to quantify differences between the two categories by constructing a separating hyperplane [1, 25], $P := \{x \in \mathbb{R}^F \mid x^T w = \gamma\}$ with $w \in \mathbb{R}^F$ and $\gamma \in \mathbb{R}$ such that for all $a \in A$, $a^T w > \gamma$ and for all $b \in B$, $b^T w < \gamma$. If $y \in \mathbb{R}^F$ is an unknown observation we want to classify, we use the following process to categorize it: 1. If $y^T w > \gamma$ then y likely belongs to category ....

O. L. Mangasarian. Linear and nonlinear separation of patterns by linear programming. Operations Research, 13:444--452, 1965.


Modeling Languages and Condor: Metacomputing for Optimization - Ferris, Munson (1998)

....By $A \in \mathbb{R}^{|A| \times F}$ and $B \in \mathbb{R}^{|B| \times F}$ we denote the matrices formed from all elements in A and B, where $|\cdot|$ is used to denote the cardinality of a set. The approach considered in this paper attempts to quantify differences between the two categories by constructing a separating hyperplane [1, 25], $P := \{x \in \mathbb{R}^F \mid x^T w = \gamma\}$ with $w \in \mathbb{R}^F$ and $\gamma \in \mathbb{R}$ such that for all $a \in A$, $a^T w > \gamma$ and for all $b \in B$, $b^T w < \gamma$. If $y \in \mathbb{R}^F$ is an unknown observation we want to classify, we use the following process to categorize it: 1. If $y^T w > \gamma$ then y likely belongs to category 1. 2. ....

O. L. Mangasarian. Linear and nonlinear separation of patterns by linear programming. Operations Research, 13:444--452, 1965.


Minimal Kernel Classifiers - Fung, Mangasarian, Smola (2002)   (1 citation)  Self-citation (Mangasarian)

....to apply a similar reasoning to the one of [34] which was used in the case of a soft margin loss function [1]. This is why, concerning y, we will limit ourselves to an argument derived from statistical learning theory. The term # v] is commonly referred to as a regularization term (see e.g. [14, 35, 24, 39, 36, 33]) which is used to restrict the class of functions admissible for the estimation of the underlying functional dependency. In the present case we have # v] m # i=1 (v i ) # . 32) In a Bayesian setting one identifies # v] with the negative log prior probability to obtain an estimator ....

O. L. Mangasarian. Linear and nonlinear separation of patterns by linear programming. Operations Research, 13:444--452, 1965.


Optimization Methods In Massive Datasets - Bradley, Mangasarian, al.   (5 citations)  Self-citation (Mangasarian)

....programs are solved. In contrast, the formulation here consists of solving a linear program which is considerably less difficult. For simplicity, our results are given here for a linear discriminating surface, i.e. a separating plane. However, extension to nonlinear surfaces such as quadratic [32] or more complex surfaces [12] is straightforward. We next consider the successive overrelaxation (SOR) method for solving massive quadratic programming SVMs. A conventional SVM in its dual formulation contains bound constraints, as well as an equality ....

O. L. Mangasarian. Linear and nonlinear separation of patterns by linear programming. Operations Research, 13:444-452, 1965.


Sparse Kernel Feature Analysis - Smola, Mangasarian, Schölkopf (1999)   (8 citations)  Self-citation (Mangasarian)

....not a modified setting could provide comparable performance in extracting nonlinear features from data, whilst preserving sparsity, i.e. whilst having a compact functional representation. The latter is often achieved in supervised settings by using an ℓ1 penalty on the expansion coefficients [Mangasarian, 1965, Chen et al. 1999, Saunders et al. 1998, Bennett, 1999, Smola, 1998, Bradley and Mangasarian, 1998b, Mangasarian, 1998]. Hence it appears promising to use the same approach for feature extraction, too. We derive an explicit algorithm for such a setting that was proposed by Smola [1998] which ....

....independently identically distributed from an underlying probability distribution p(x) and X the corresponding embedding space. Our goal is to compute feature extractors $f_i(x)$ that satisfy certain criteria of simplicity (e.g. small RKHS norm [Wahba, 1990, Smola and Schölkopf, 1998] or ℓ1 norm [Mangasarian, 1965, Chen et al. 1999, Bennett, 1999, Smola, 1998, Bradley and Mangasarian, 1998a, Mangasarian, 1998]) and optimality (e.g. maximum variance [Hotelling, 1933, Karhunen, 1946]). 2.1 Principal Component Analysis (PCA). Let us start with a (slightly nonstandard) formulation of Principal Component ....

[Article contains additional citation context not shown here]

O. L. Mangasarian. Linear and nonlinear separation of patterns by linear programming. Operations Research, 13:444--452, 1965.


Mathematical Programming in Neural Networks - Mangasarian (1993)   (21 citations)  Self-citation (Mangasarian)

....can be easily circumvented, in order to obtain some approximate error minimizing linear separation, by considering a slightly different linear program (6) as shown by Theorem 2.1 below. The first linear programming formulations for the linearly separable case were given in 1964 and 1965 [11, 27], but they also suffered from the nullsolution difficulty for the linearly inseparable case. In order to handle the linearly inseparable case one has to employ a more complex map than that provided by an LTU. This was made evident in the early days of neural network development by Minsky and ....

....(1.2) that is a null solution that does not provide any error minimizing separation. For this purpose, we utilize the linear program introduced recently in [8] and which has the following desirable features not all of which are possessed by any other previous linear programming formulation [11, 27, 28, 45, 20, 19]: (i) A strict separating plane (that is neither set lies on the separating plane) for linearly separable sets A and B. (ii) An error minimizing plane is obtained when the sets A and B are linearly inseparable. (iii) No extraneous constraints are used to exclude the null solution for linearly ....

[Article contains additional citation context not shown here]

O. L. Mangasarian. Linear and nonlinear separation of patterns by linear programming. Operations Research, 13:444--452, 1965.


Arbitrary-Norm Separating Plane - Mangasarian (1997)   (12 citations)  Self-citation (Mangasarian)

....of discriminating between two finite point sets in n-dimensional real space $R^n$. When the convex hulls of the two sets do not intersect, a single linear program can construct a strict separating plane such that each of the two open halfspaces generated by the plane contains one of the two sets [5, 12, 2]. Such a plane corresponds to a perceptron and can also be obtained by the iterative perceptron learning algorithm [21, 10], which can be interpreted as the Motzkin-Schoenberg iterative scheme for solving consistent linear inequalities [20]. When the convex hulls of the two sets intersect the ....

.... ‖ is convex on $R^n$. Thus the objective function of the mathematical program (17) is convex but its feasible region, which is the unit sphere in the dual norm $\|\cdot\|'$, is not convex. It is precisely this essential nonconvex condition that has been either ignored in most previous work [12, 9, 8, 2] or used heuristically [13, 19] to enforce nonzeroness of w but not as a distance normalization constraint. Thus in these papers, the sum of the distances of misclassified points to the separating plane has not been the real objective function that has been minimized. In fact the nonconvexity of ....

O. L. Mangasarian. Linear and nonlinear separation of patterns by linear programming. Operations Research, 13:444--452, 1965.


Arbitrary-Norm Separating Plane - Mangasarian (1997)   (12 citations)  Self-citation (Mangasarian)

....of discriminating between two finite point sets in n-dimensional real space $R^n$. When the convex hulls of the two sets do not intersect, a single linear program can construct a strict separating plane such that each of the two open halfspaces generated by the plane contains one of the two sets [6, 12, 2]. Such a plane corresponds to a perceptron and can also be obtained by the iterative perceptron learning algorithm [22, 10], which can be interpreted as the Motzkin-Schoenberg iterative scheme for solving consistent linear inequalities [21]. When the convex hulls of the two sets intersect the ....

....norm $\|\cdot\|$ on $R^m$ and any convex function $h : R^n \to R^m$ on $R^n$. The feasible region of (17), which is the unit sphere in the dual norm $\|\cdot\|'$, is however not convex. It is precisely this essential nonconvex condition that has been either ignored in most previous work [12, 9, 8, 2] or used heuristically [13, 20] to enforce nonzeroness of w but not as a distance normalization constraint. Thus in these papers, the sum of the distances of misclassified points to the separating plane has not been the real objective function that has been minimized. In [5] a 2-norm error term is ....

O. L. Mangasarian. Linear and nonlinear separation of patterns by linear programming. Operations Research, 13:444--452, 1965.


Maximum Throughput Routing of - Traffic In The

No context found.

O. L. Mangasarian, "Linear and Nonlinear Separation of Patterns by Linear Programming", Operations Research 13, 1965, pp. 444-452.


Barrier Boosting - Rätsch, Warmuth, Mika, Onoda, Müller (2000)

No context found.

O.L. Mangasarian. Linear and nonlinear separation of patterns by linear programming. Operations Research, 13:444--452, 1965.


Barrier Boosting - Rätsch, Warmuth, Mika, Onoda, Lemm..

No context found.

O.L. Mangasarian. Linear and nonlinear separation of patterns by linear programming. Operations Research, 13:444--452, 1965.


Bayesian Kernel Methods - Smola, Schölkopf (2003)

No context found.

O. L. Mangasarian. Linear and nonlinear separation of patterns by linear programming. Operations Research, 13:444--452, 1965.


Simultaneous Feature Selection and Classifier Training via.. - Guo, Dyer (2003)

No context found.

O. L. Mangasarian, Linear and nonlinear separation of patterns by linear programming, Operations Research 13, 444-452, 1965.


A Tutorial on Support Vector Regression - Smola, Schölkopf (1998)   (97 citations)

No context found.

O.L. Mangasarian. Linear and nonlinear separation of patterns by linear programming. Operations Research, 13:444--452, 1965.

CiteSeer.IST - Copyright Penn State and NEC