Results 11-20 of 63
New algorithms for multiclass cancer diagnosis using tumor gene expression signatures
Bioinformatics, 2003
Cited by 10 (0 self)
Motivation: The increasing use of DNA microarray-based tumor gene expression profiles for cancer diagnosis requires mathematical methods with high accuracy for solving clustering, feature selection and classification problems of gene expression data. Results: New algorithms are developed for solving clustering, feature selection and classification problems of gene expression data. The clustering algorithm is based on optimization techniques and allows the calculation of clusters step-by-step. This approach allows us to find as many clusters as a data set contains, with respect to some tolerance. Feature selection is crucial for a gene expression database. Our feature selection algorithm is based on calculating overlaps of different genes. The database used contains over 16,000 genes, and this number is considerably reduced by feature selection. We propose a classification algorithm in which each tissue sample is considered the center of a cluster that is a ball. The results of numerical experiments confirm that the classification algorithm, in combination with the feature selection algorithm, performs slightly better than the published results for multiclass classifiers based on support vector machines for this data set. Availability: Available on request from the authors. Contact:
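The ball-based classifier is only sketched in the abstract; a minimal illustration of the idea (each training sample is the center of a ball, and a query takes the label of the nearest ball that contains it) might look like the following. The fixed `radius`, the toy data, and the nearest-center tie-breaking are assumptions for illustration, not the authors' exact procedure.

```python
import numpy as np

def ball_classify(X_train, y_train, x, radius):
    """Label x by the nearest training sample whose ball of the given
    radius contains x; return None if x lies outside every ball."""
    d = np.linalg.norm(X_train - x, axis=1)  # distances to all ball centers
    inside = d <= radius
    if not inside.any():
        return None  # x is not covered: the sketch abstains
    return y_train[inside][np.argmin(d[inside])]

# Hypothetical two-class toy data (not from the paper's gene database)
X = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
y = np.array([0, 0, 1])
print(ball_classify(X, y, np.array([4.6, 5.2]), radius=1.0))  # prints 1
```

In the paper the radius would come from the feature-selected training data rather than being a hand-set constant.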
A new theoretical framework for K-means-type clustering
FOUNDATIONS AND ADVANCES IN DATA MINING, 2005
Cited by 8 (1 self)
One of the fundamental clustering problems is to assign n points to k clusters based on the minimum sum-of-squares criterion (MSSC), which is known to be NP-hard. In this paper, using matrix arguments, we first model MSSC as a so-called 0-1 semidefinite program (SDP). The classical K-means algorithm can be interpreted as a special heuristic for the underlying 0-1 SDP. Moreover, the 0-1 SDP model can be further approximated by relaxed, polynomially solvable linear and semidefinite programs. This opens new avenues for solving MSSC. The 0-1 SDP model can be applied not only to MSSC but also to other clustering scenarios. In particular, we show that the recently proposed normalized k-cut and spectral clustering can also be embedded into the 0-1 SDP model in various kernel spaces.
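As a concrete reference point for the interpretation drawn above, here is Lloyd's classical K-means in a few lines: in the paper's terms, a heuristic that locally minimizes the MSSC objective underlying the 0-1 SDP. The deterministic first-k initialization and the assumption that no cluster empties are simplifications for illustration.

```python
import numpy as np

def kmeans_mssc(X, k, iters=100):
    """Lloyd's K-means, viewed as a local heuristic for the MSSC
    objective sum_i ||x_i - c_{label(i)}||^2."""
    centers = X[:k].astype(float).copy()  # simple deterministic init
    for _ in range(iters):
        # assignment step: each point goes to its nearest center
        labels = ((X[:, None] - centers) ** 2).sum(-1).argmin(1)
        # update step: each center becomes its cluster's centroid
        # (assumes no cluster becomes empty)
        new = np.array([X[labels == j].mean(0) for j in range(k)])
        if np.allclose(new, centers):
            break  # local minimum of the MSSC objective reached
        centers = new
    sse = ((X - centers[labels]) ** 2).sum()  # MSSC objective value
    return labels, centers, sse
```

The SDP view replaces the hard assignment `labels` with a relaxed assignment matrix, which is what makes the polynomial-time relaxations possible.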
A coordinate gradient descent method for ℓ1-regularized convex minimization
Department of Mathematics, National University of Singapore, 2008
Cited by 7 (1 self)
In applications such as signal processing and statistics, many problems involve finding sparse solutions to underdetermined linear systems of equations. These problems can be formulated as structured nonsmooth optimization problems, i.e., ℓ1-regularized linear least squares problems. In this paper, we propose a block coordinate gradient descent method (abbreviated CGD) to solve the more general ℓ1-regularized convex minimization problem, i.e., minimizing an ℓ1-regularized convex smooth function. We establish a Q-linear convergence rate for our method when the coordinate block is chosen by a Gauss-Southwell-type rule to ensure sufficient descent. We propose efficient implementations of the CGD method and report numerical results for solving large-scale ℓ1-regularized linear least squares problems arising in compressed sensing and image deconvolution, as well as large-scale ℓ1-regularized logistic regression problems for feature selection in data classification. Comparison with several state-of-the-art algorithms specifically designed for solving large-scale ℓ1-regularized linear least squares or logistic regression problems suggests that an efficiently implemented CGD method may outperform them, even though the CGD method is not designed specifically for these special problem classes. Key words: coordinate gradient descent, Q-linear convergence, ℓ1-regularization, compressed sensing, image deconvolution, linear least squares, logistic regression, convex optimization
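For intuition about this problem class, a minimal cyclic coordinate descent for the ℓ1-regularized least squares problem can be sketched as follows: each coordinate update has a closed-form soft-threshold solution. This is a simplified single-coordinate relative of the paper's block CGD method (no Gauss-Southwell rule, no blocks), not the method itself, and it assumes the columns of A are nonzero.

```python
import numpy as np

def soft_threshold(z, t):
    """Closed-form solution of min_x 0.5*(x - z)^2 + t*|x|."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_cd(A, b, lam, iters=200):
    """Cyclic coordinate descent for min_x 0.5*||Ax - b||^2 + lam*||x||_1.
    Assumes every column of A is nonzero."""
    n = A.shape[1]
    x = np.zeros(n)
    r = b - A @ x                     # running residual b - Ax
    col_sq = (A ** 2).sum(axis=0)     # squared column norms
    for _ in range(iters):
        for j in range(n):
            # exact minimization over coordinate j, others fixed
            rho = A[:, j] @ r + col_sq[j] * x[j]
            x_new = soft_threshold(rho, lam) / col_sq[j]
            r += A[:, j] * (x[j] - x_new)  # keep residual consistent
            x[j] = x_new
    return x
```

With `A = I` the solver reduces to componentwise soft-thresholding of `b`, which is a quick sanity check on the update formula.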
Optimal parameter selection in support vector machines
Journal of Industrial and Management Optimization, 2005
Cited by 6 (0 self)
Abstract. The purpose of the paper is to apply a nonlinear programming algorithm for computing kernel and related parameters of a support vector machine (SVM) by a two-level approach. Available training data are split into two groups: one set for formulating a quadratic SVM with L2-soft margin, and another for minimizing the generalization error, into which the optimal SVM variables are inserted. Subsequently, the total generalization error is evaluated for a separate set of test data. Derivatives of the functions by which the optimization problem is defined are evaluated analytically, exploiting an existing Cholesky decomposition needed for solving the quadratic SVM. The approach is implemented and tested on a couple of standard data sets with up to 4,800 patterns. The results show a significant reduction of the generalization error, an increase of the margin, and a reduction of the number of support vectors in all cases where the data sets are sufficiently large. In a second set of test runs, kernel parameters are assigned to individual features; redundant attributes are identified and suitable relative weighting factors are computed. 1. Introduction. During the last ten years, support vector machines (SVMs) have become an important alternative to neural networks for machine learning. Meanwhile there is a large number of applications, see e.g.
Parallel Inductive Logic in Data Mining
2000
Cited by 5 (1 self)
Data mining is the process of automatic extraction of novel, useful and understandable patterns from very large databases. High-performance, scalable, and parallel computing algorithms are crucial in data mining as datasets grow inexorably in size and complexity. Inductive logic is a research area at the intersection of machine learning and logic programming that has recently been applied to data mining. Inductive logic studies learning from examples within the framework provided by clausal logic. It provides a uniform and very expressive means of representation: all examples, background knowledge and the induced theory are expressed in first-order logic. However, such an expressive representation is often computationally expensive. This report first presents the background for parallel data mining, the BSP model, and inductive logic programming. Based on this study, the report gives an approach to parallel inductive logic in data mining that solves the potential performance problem; both a parallel algorithm and a cost analysis are provided. The approach is applied to a number of problems and shows a superlinear speedup. To justify this analysis, I implemented a parallel version of a core ILP system, Progol, in C with the support of the BSP parallel model. Three test cases are provided and a double speedup
A Dependent LP-Rounding Approach for the k-Median Problem
Cited by 5 (1 self)
Abstract. In this paper, we revisit the classical k-median problem: given n points in a metric space, select k centers so as to minimize the sum of distances of points to their closest center. Using the standard LP relaxation for k-median, we give an efficient algorithm to construct a probability distribution on sets of k centers that matches the marginals specified by the optimal LP solution. Our algorithm draws inspiration from clustering and randomized rounding approaches that have been used previously for k-median and the closely related facility location problem, although ensuring that we choose at most k centers requires a careful dependent rounding procedure. Analyzing the approximation ratio of our algorithm presents significant technical difficulties: we are able to show an upper bound of 3.25. While this is worse than the current best known 3 + ϵ guarantee of [2], our approach is interesting because: (1) the random choice of the k centers given by the algorithm preserves the marginal distributions and satisfies negative correlation, leading to 3.25-approximation algorithms for some generalizations of the k-median problem, including the k-UFL problem introduced in [8]; (2) our algorithm runs in Õ(k^3 n^2) time, compared to the O(n^8) time required by the local search algorithm of [2] to guarantee a 3.25 approximation; and (3) our approach has the potential to beat the decade-old bound of 3 + ϵ for k-median by a suitable instantiation of various parameters in the algorithm. We also give a 34-approximation for the knapsack median problem, which greatly improves the approximation constant in [11]; besides the improved ratio, both our algorithm and analysis are simpler than those of [11]. Using the same technique, we also give a 9-approximation for the matroid median problem introduced in [9], improving on their 16-approximation.
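For reference, the k-median objective that the rounding algorithm approximates can be stated in a few lines of code, together with a brute-force solver usable only on tiny instances; the paper's dependent-rounding algorithm itself is far more involved and is not reproduced here.

```python
import itertools
import numpy as np

def kmedian_cost(D, centers):
    """k-median objective: sum over points of the distance to the
    closest chosen center. D is the n x n pairwise distance matrix."""
    return D[:, list(centers)].min(axis=1).sum()

def kmedian_brute(D, k):
    """Exhaustive search over all k-subsets of points as centers:
    exact, but exponential in k, so only viable for tiny n."""
    n = D.shape[0]
    return min(itertools.combinations(range(n), k),
               key=lambda S: kmedian_cost(D, S))
```

An approximation algorithm with ratio 3.25 guarantees `kmedian_cost(D, S) <= 3.25 * kmedian_cost(D, S_opt)` for the set `S` it returns, without ever enumerating subsets like the brute-force baseline does.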
Oblique multicategory decision trees using nonlinear programming
 INFORMS Journal on Computing
Cited by 4 (0 self)
doi 10.1287/ijoc.1030.0047, © 2005 INFORMS. Induction of decision trees is a popular and effective method for solving classification problems in data-mining applications. This paper presents a new algorithm for multicategory decision tree induction based on nonlinear programming. This algorithm, termed OC-SEP (Oblique Category SEParation), combines the advantages of several other methods and shows improved generalization performance on a collection of real-world data sets.
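To make the contrast with axis-parallel trees concrete, the following sketch scores an oblique (multivariate) test w·x + b ≥ 0 by the weighted Gini impurity of the two sides, as in standard tree induction. This is a generic illustration of oblique splitting, not the algorithm proposed in the paper, and the example data are made up.

```python
import numpy as np

def gini(y):
    """Gini impurity of a label vector: 0 for a pure node."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / len(y)
    return 1.0 - (p ** 2).sum()

def oblique_split_impurity(X, y, w, b):
    """Score the oblique test w.x + b >= 0 by the weighted Gini
    impurity of the two resulting child nodes."""
    mask = X @ w + b >= 0
    sides = [y[mask], y[~mask]]
    return sum(len(s) / len(y) * gini(s) for s in sides if len(s))

# Labels depend on x0 - x1, so no single-feature (axis-parallel)
# threshold separates them, but one oblique hyperplane does.
X = np.array([[0.0, 1.0], [1.0, 0.0], [2.0, 3.0], [3.0, 2.0]])
y = np.array([0, 1, 0, 1])
print(oblique_split_impurity(X, y, np.array([1.0, -1.0]), 0.0))  # prints 0.0
```

Oblique induction methods search over the weight vector w, which is what nonlinear programming formulations are used for.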
Scalable optimization-based feature selection with application to recommender systems
2003
"... ..."
Interactive Machine Learning for Refinement and Analysis of Segmented CT/MRI Images
2004
Cited by 3 (0 self)
This dissertation concerns the development of an interactive machine learning method for the refinement and analysis of segmented computed tomography (CT) images. The method uses higher-level, domain-dependent knowledge to improve initial image segmentation results. A knowledge-based refinement and analysis system requires the formulation of domain knowledge, and a serious problem faced by knowledge-based system designers is the knowledge acquisition bottleneck. Knowledge acquisition is very challenging and an active research topic in machine learning and artificial intelligence. Commonly, a knowledge engineer needs a domain expert to formulate acquired knowledge for use in an expert system. That process is tedious and error-prone: the domain expert's verbal description can be inaccurate or incomplete, and the knowledge engineer may not correctly interpret the expert's intent. In many cases, domain experts prefer to perform actions rather than explain their expertise. These problems motivate us to find another way to make knowledge acquisition less challenging: instead of trying to acquire expertise from a domain expert verbally, we ask the expert to demonstrate it through actions that can be observed by the system.
A Cutting Plane Method for the Minimum-Sum-of-Squared Error Clustering
PROCEEDINGS OF SIAM INTERNATIONAL CONFERENCE ON DATA MINING, 2005
Cited by 2 (1 self)
The minimum-sum-of-squared error clustering (MSSC) is one of the most intuitive and popular clustering models. In this paper, we first show that MSSC can be equivalently cast as a concave minimization problem. To find the global solution of MSSC, we construct a procedure to move from a fractional solution of the relaxed minimization problem to an integer solution with improved objective value. Then we adapt Tuy's convexity cut method, a powerful algorithm for global concave optimization, to find a global optimum to MSSC. Promising numerical examples are reported.