THE ASYMPTOTIC DISTRIBUTION AND BERRY–ESSEEN BOUND OF A NEW TEST FOR INDEPENDENCE IN HIGH DIMENSION WITH AN APPLICATION TO STOCHASTIC OPTIMIZATION
, 901
Abstract

Cited by 16 (1 self)
Let X1,...,Xn be a random sample from a p-dimensional population distribution. Assume that c1 n^α ≤ p ≤ c2 n^α for some positive constants c1, c2 and α. In this paper we introduce a new statistic for testing independence of the p variates of the population and prove that the limiting distribution is the extreme-value distribution of type I with a rate of convergence O((log n)^{5/2}/√n). This is much faster than O(1/log n), a typical convergence rate for this type of extreme distribution. A simulation study and an application to stochastic optimization are discussed.
Description of the minimizers of least squares regularized with ℓ0-norm. Uniqueness of the global minimizer
 SIAM J. IMAGING SCIENCES
, 2013
When do stepwise algorithms meet subset selection criteria?
, 2007
Abstract

Cited by 13 (3 self)
Recent results in homotopy and solution paths demonstrate that certain well-designed greedy algorithms, with a range of values of the algorithmic parameter, can provide solution paths to a sequence of convex optimization problems. On the other hand, in regression many existing criteria in subset selection (including Cp, AIC, BIC, MDL, RIC, etc.) involve optimizing an objective function that contains a counting measure. The two optimization problems are formulated as (P1) and (P0) in the present paper. The latter is generally combinatoric and has been proven to be NP-hard. We study the conditions under which the two optimization problems have common solutions. Hence, in these situations a stepwise algorithm can be used to solve the seemingly unsolvable problem. Our main result is motivated by recent work in sparse representation, while two others emerge from different angles: a direct analysis of sufficiency and necessity and a condition on the mostly correlated covariates. An extreme example connected with least angle regression is of independent interest.
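The two problems contrasted in the abstract can be written schematically as follows; this is a generic regression rendering (the paper's exact formulation is not reproduced in the listing, so the notation here is assumed):

```latex
(P_1):\quad \min_{\beta}\; \|y - X\beta\|_2^2 + \lambda \|\beta\|_1,
\qquad
(P_0):\quad \min_{\beta}\; \|y - X\beta\|_2^2 + \lambda \|\beta\|_0,
```

where \(\|\beta\|_0\) is the counting measure (number of nonzero coefficients); criteria such as Cp, AIC, BIC, MDL, and RIC correspond to (P0) with particular choices of \(\lambda\), which is what makes (P0) combinatoric.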
New efficient estimation and variable selection methods for semiparametric varying-coefficient partially linear models
 Ann. Statist. 39 305–332. MR2797848
, 2011
Abstract

Cited by 12 (1 self)
The complexity of semiparametric models poses new challenges to statistical inference and model selection that frequently arise from real applications. In this work, we propose new estimation and variable selection procedures for the semiparametric varying-coefficient partially linear model. We first study quantile regression estimates for the nonparametric varying-coefficient functions and the parametric regression coefficients. To achieve nice efficiency properties, we further develop a semiparametric composite quantile regression procedure. We establish the asymptotic normality of the proposed estimators for both the parametric and nonparametric parts and show that the estimators achieve the best convergence rate. Moreover, we show that the proposed method is much more efficient than the least-squares-based method for many non-normal errors and that it only loses a small amount of efficiency for normal errors. In addition, it is shown that the loss in efficiency is at most 11.1% for estimating varying-coefficient functions and is no greater than 13.6% for estimating parametric components. To achieve sparsity with high-dimensional covariates, we propose adaptive penalization methods for variable selection in the semiparametric varying-coefficient partially linear model and prove that the methods possess the oracle property. Extensive Monte Carlo simulation studies are conducted to examine the finite-sample performance of the proposed procedures. Finally, we apply the new methods to analyze the plasma beta-carotene level data.
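The composite quantile regression idea referenced above is commonly written, in its generic linear form (the paper's semiparametric version adds varying-coefficient functions; the notation below is the standard one, not copied from the paper):

```latex
\min_{b_1,\dots,b_K,\;\beta}\;\sum_{k=1}^{K}\sum_{i=1}^{n}
\rho_{\tau_k}\!\left(y_i - b_k - x_i^{\top}\beta\right),
\qquad
\rho_{\tau}(u) = u\bigl(\tau - \mathbf{1}\{u<0\}\bigr),
\quad \tau_k = \frac{k}{K+1}.
```

Averaging the check losses \(\rho_{\tau_k}\) over several quantile levels is what buys efficiency close to least squares under normal errors while remaining robust for heavy-tailed ones.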
Statistical significance of the Netflix challenge
 URL http://arxiv.org/abs/1207.5649
An improved 1-norm SVM for simultaneous classification and variable selection
 Proceedings of the 11th International Conference on Artificial Intelligence and Statistics
, 2007
Abstract

Cited by 7 (0 self)
We propose a novel extension of the 1-norm support vector machine (SVM) for simultaneous feature selection and classification. The new algorithm penalizes the empirical hinge loss by an adaptively weighted 1-norm penalty in which the weights are computed from the 2-norm SVM. Hence the new algorithm is called the hybrid SVM. Simulation and real-data examples show that the hybrid SVM not only often improves upon the 1-norm SVM in terms of classification accuracy but also enjoys better feature selection performance.
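A minimal sketch of the two-stage idea described in this abstract, assuming scikit-learn's linear SVM solver; the data, `C` values, and the reparametrization trick are illustrative choices of mine, not the authors' implementation (which penalizes the true hinge loss rather than the squared hinge used here):

```python
# Sketch of an adaptively weighted 1-norm SVM ("hybrid SVM" idea):
# weights come from a preliminary 2-norm SVM fit.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=200, n_features=20, n_informative=4,
                           random_state=0)

# Step 1: 2-norm (ridge-type) SVM gives preliminary coefficients beta2.
svm_l2 = LinearSVC(penalty="l2", loss="squared_hinge", dual=False, C=1.0,
                   max_iter=5000).fit(X, y)
beta2 = svm_l2.coef_.ravel()

# Step 2: adaptive weights w_j = 1/|beta2_j|. Penalizing sum_j w_j * |beta_j|
# is equivalent to an ordinary 1-norm SVM on rescaled columns X_j * |beta2_j|.
X_scaled = X * np.abs(beta2)
svm_l1 = LinearSVC(penalty="l1", loss="squared_hinge", dual=False, C=1.0,
                   max_iter=5000).fit(X_scaled, y)

# Map back to the original scale; weakly supported features are driven to 0.
beta_hybrid = svm_l1.coef_.ravel() * np.abs(beta2)
n_selected = int(np.sum(beta_hybrid != 0))
```

Features with small preliminary coefficients receive large weights and are pushed to exactly zero, which is the mechanism behind the improved feature selection claimed above.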
Feature selection when there are many influential features. arXiv preprint arXiv:0911.4076
, 2009
Abstract

Cited by 6 (0 self)
Recent discussion of the success of feature selection methods has argued that focusing on a relatively small number of features has been counterproductive. Instead, it is suggested, the number of significant features can be in the thousands or tens of thousands, rather than (as is commonly supposed at present) approximately in the range from five to fifty. This change, in orders of magnitude, in the number of influential features necessitates alterations to the way in which we choose features and to the manner in which the success of feature selection is assessed. In this paper we suggest a general approach that is suited to cases where the number of relevant features is very large, and we consider particular versions of the approach in detail. We propose ways of measuring performance, and we study both theoretical and numerical properties of the proposed methodology.
Extended BIC for small-n-large-P sparse GLM
Abstract

Cited by 4 (0 self)
The small-n-large-P situation has become common in genetics research, medical studies, risk management, and other fields. Feature selection is crucial in these studies yet poses a serious challenge. The traditional criteria such as AIC, BIC, and cross-validation choose too many features. To overcome the difficulties caused by the small-n-large-P situation, Chen and Chen (2008) developed a family of extended Bayes information criteria (EBIC). Under normal linear models, EBIC is found to be consistent with nice finite-sample properties. Proving consistency for non-normal and nonlinear models poses serious technical difficulties. In this paper, through a number of novel techniques, we establish the consistency of EBIC under generalized linear models in the small-n-large-P situation. We also report simulation results and a real-data analysis to illustrate the effectiveness of EBIC for feature selection.
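The EBIC family of Chen and Chen (2008) referenced above augments BIC with a model-space term. A minimal sketch (function names are mine; the log-likelihood would come from whatever GLM fit is being compared):

```python
# EBIC(s) = -2*loglik + |s|*log(n) + 2*gamma*log(binom(P, |s|)):
# BIC plus a penalty on the number of candidate models of size |s|.
from math import log, lgamma

def log_binom(p: int, k: int) -> float:
    """log of the binomial coefficient C(p, k), stable for large p."""
    return lgamma(p + 1) - lgamma(k + 1) - lgamma(p - k + 1)

def ebic(loglik: float, k: int, n: int, p: int, gamma: float = 1.0) -> float:
    """EBIC for a model with k selected features out of p, sample size n."""
    return -2.0 * loglik + k * log(n) + 2.0 * gamma * log_binom(p, k)

# gamma = 0 recovers ordinary BIC; larger gamma penalizes the size of the
# model space more heavily, which is what controls selection when P >> n.
bic_value = ebic(-120.0, 3, 50, 2000, gamma=0.0)
ebic_value = ebic(-120.0, 3, 50, 2000, gamma=1.0)
```

Because log C(P, k) grows with P, the extra term is exactly what prevents the over-selection by AIC/BIC/cross-validation that the abstract describes.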
High-dimensional process monitoring and fault isolation via variable selection
 J. Qual. Technol.
, 2009
Abstract

Cited by 4 (0 self)
Both process monitoring and fault isolation are important and challenging tasks for quality control and improvement in high-dimensional processes. Under the practical assumption that not all variables shift simultaneously, this paper proposes a variable-selection-based multivariate statistical process control (SPC) procedure for process monitoring and fault diagnosis. A forward-selection algorithm is first utilized to screen out potential out-of-control variables; a multivariate control chart is then set up to monitor the suspicious variables. Therefore, detection of faulty conditions and isolation of faulty variables can be achieved in one step. Both simulation studies and a real example show the effectiveness of the proposed procedure.
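An illustrative sketch of the two-step screen-then-chart idea summarized above, under the simplifying assumptions of unit-variance independent variables (so forward selection reduces to ranking standardized means); the paper's actual procedure is more general:

```python
# Variable-selection-based SPC sketch: screen suspicious variables,
# then compute the monitoring statistic on that subset only.
import numpy as np

rng = np.random.default_rng(0)
p, n, k = 30, 50, 3            # dimension, subgroup size, variables to screen

# In-control data except variable 0, which carries a 1.5-sigma mean shift.
X = rng.standard_normal((n, p))
X[:, 0] += 1.5

z = np.sqrt(n) * X.mean(axis=0)          # standardized mean per variable
selected = np.argsort(-np.abs(z))[:k]    # forward screen: top-k |z| scores

# Chart statistic on the suspicious subset (in practice compared against a
# chi-square_k control limit); flagged variables are the fault isolation.
t2 = float(np.sum(z[selected] ** 2))
```

Restricting the T²-type statistic to the screened subset is what gives detection and isolation "in one step": the chart signals, and `selected` names the candidate faulty variables.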
Fast dependency-aware feature selection in very-high-dimensional pattern recognition
 In 2011 IEEE International Conference on Systems, Man, and Cybernetics (SMC)
, 2011
Abstract

Cited by 4 (2 self)
 Add to MetaCart
(Show Context)
The paper addresses the problem of making dependency-aware feature selection feasible in pattern recognition problems of very high dimensionality. The idea of individually best ranking is generalized to evaluate the contextual quality of each feature in a series of randomly generated feature subsets. Each random subset is evaluated by a criterion function of arbitrary choice (permitting functions of high complexity). Eventually, the novel dependency-aware feature rank is computed, expressing the average benefit of including a feature in feature subsets. The method is efficient and generalizes well, especially in very-high-dimensional problems, where traditional context-aware feature selection methods fail due to prohibitive computational complexity or overfitting. The method is shown to be well capable of outperforming the commonly applied individual ranking, which ignores important contextual information contained in the data.
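A hedged sketch of the ranking idea summarized above: score each feature by the average benefit of its presence across many randomly generated subsets, each evaluated by a criterion of arbitrary choice. The R²-of-least-squares criterion and all sizes here are illustrative assumptions, not the paper's setup:

```python
# Dependency-aware feature rank: mean criterion value over subsets
# containing feature j, minus the mean over subsets that do not.
import numpy as np

rng = np.random.default_rng(1)
n, p, subset_size, n_subsets = 120, 8, 3, 300

X = rng.standard_normal((n, p))
y = 2.0 * X[:, 0] + 0.5 * X[:, 1] + 0.5 * rng.standard_normal(n)

def criterion(subset):
    """Subset quality: R^2 of a least-squares fit on those columns
    (stand-in for an arbitrary, possibly expensive criterion)."""
    beta, *_ = np.linalg.lstsq(X[:, subset], y, rcond=None)
    resid = y - X[:, subset] @ beta
    return 1.0 - (resid @ resid) / (y @ y)

in_scores = [[] for _ in range(p)]
out_scores = [[] for _ in range(p)]
for _ in range(n_subsets):
    s = rng.choice(p, size=subset_size, replace=False)
    q = criterion(s)
    for j in range(p):
        (in_scores if j in s else out_scores)[j].append(q)

rank = np.array([np.mean(in_scores[j]) - np.mean(out_scores[j])
                 for j in range(p)])
best = int(np.argmax(rank))
```

Unlike individually best ranking, each evaluation here sees a feature in the context of companions, so features that help only jointly can still accumulate a high average benefit.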