Taking Advantage of Sparsity in Multi-Task Learning
Cited by 2 (0 self)
We study the problem of estimating multiple linear regression equations for the purpose of both prediction and variable selection. Following recent work on multi-task learning [1], we assume that the sparsity patterns of the regression vectors are included in the same set of small cardinality. This assumption leads us to consider the Group Lasso as a candidate estimation method. We show that this estimator enjoys nice sparsity oracle inequalities and variable selection properties. The results hold under a certain restricted eigenvalue condition and a coherence condition on the design matrix, which naturally extend recent work in [3, 19]. In particular, in the multi-task learning scenario, in which the number of tasks can grow, we are able to remove completely the effect of the number of predictor variables in the bounds. Finally, we show how our results can be extended to more general noise distributions, of which we only require the variance to be finite.
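The abstract above does not describe an algorithm, so the following is only a minimal sketch of one standard building block for computing Group Lasso estimates: the block soft-thresholding step. It assumes the coefficients are stored as one row per predictor and one column per task, so that each row forms a group. The function name `group_soft_threshold` and the example numbers are hypothetical and are not taken from the paper.

```python
import math


def group_soft_threshold(rows, threshold):
    """Shrink each row (one predictor's coefficients across all tasks).

    A row whose Euclidean norm is at most `threshold` is set exactly to
    zero, which drops that predictor from every task at once. Any other
    row is scaled toward zero by the factor (1 - threshold / norm).
    """
    shrunk = []
    for row in rows:
        norm = math.sqrt(sum(value * value for value in row))
        if norm <= threshold:
            shrunk.append([0.0] * len(row))
        else:
            scale = 1.0 - threshold / norm
            shrunk.append([scale * value for value in row])
    return shrunk


# Illustrative (made-up) coefficients: 2 predictors, 2 tasks.
coefficients = [
    [3.0, 4.0],  # Euclidean norm 5: kept, but shrunk
    [0.3, 0.4],  # Euclidean norm 0.5: below the threshold, zeroed
]
result = group_soft_threshold(coefficients, threshold=1.0)
print(result)
```

With these inputs the second predictor is removed from both tasks simultaneously, while the first is scaled by 0.8 to roughly (2.4, 3.2). This row-wise behaviour is what encodes the assumption that the tasks share a common sparsity pattern.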
DIMENSION REDUCTION AND VARIABLE SELECTION IN CASE CONTROL STUDIES VIA REGULARIZED LIKELIHOOD OPTIMIZATION
Cited by 2 (1 self)
Abstract. Dimension reduction and variable selection are performed routinely in case-control studies, but the literature on the theoretical aspects of the resulting estimates is scarce. We contribute to this literature by studying estimators obtained via ℓ1 penalized likelihood optimization. We show that the optimizers of the ℓ1 penalized retrospective likelihood coincide with the optimizers of the ℓ1 penalized prospective likelihood. This extends the results of Prentice and Pyke (1979), obtained for non-regularized likelihoods. We establish both the sup-norm consistency of the odds ratio, after model selection, and the consistency of subset selection of our estimators. The novelty of our theoretical results consists in the study of these properties under the case-control sampling scheme. Our results hold for selection performed over a large collection of candidate variables, with cardinality allowed to depend on, and be greater than, the sample size. We complement our theoretical results with a novel approach to determining data-driven tuning parameters, based on the bisection method. The resulting procedure offers significant computational savings when compared with grid-search-based methods. All our numerical experiments strongly support our theoretical findings.
Penalized Sieve Estimation and Inference of Semi-nonparametric Dynamic Models: A Selective Review
, 2011
Cited by 2 (1 self)
In this selective review, we first provide some empirical examples that motivate the usefulness of semi-nonparametric techniques in modelling economic and financial time series. We describe popular classes of semi-nonparametric dynamic models and some temporal dependence properties. We then present penalized sieve extremum (PSE) estimation as a general method for semi-nonparametric models with cross-sectional, panel, time series, or spatial data. The method is especially powerful in estimating difficult ill-posed inverse problems such as semi-nonparametric mixtures or conditional moment restrictions. We review recent advances on inference and large sample properties of the PSE estimators, which include (1) consistency and convergence rates of the PSE estimator of the nonparametric part; (2) limiting distributions of plug-in PSE estimators of functionals that are either smooth (i.e., root-n estimable) or non-smooth (i.e., slower than root-n estimable); (3) simple criterion-based inference for plug-in PSE estimation of smooth or non-smooth functionals; and (4) root-n asymptotic normality of semiparametric two-step estimators and their consistent variance estimators. Examples from dynamic asset pricing, nonlinear spatial VAR, semiparametric GARCH,
Smoothing ℓ1-penalized estimators for high-dimensional time-course data
- Electronic Journal of Statistics
, 2007
Cited by 2 (2 self)
Abstract: When a series of (related) linear models has to be estimated it is often appropriate to combine the different data-sets to construct more efficient estimators. We use ℓ1-penalized estimators like the Lasso or the Adaptive Lasso which can simultaneously do parameter estimation and model selection. We show that for a time-course of high-dimensional linear models the convergence rates of the Lasso and of the Adaptive Lasso can be improved by combining the different time-points in a suitable way. Moreover, the Adaptive Lasso still enjoys oracle properties and consistent variable selection. The finite sample properties of the proposed methods are illustrated on simulated data and on a real problem of motif finding in DNA sequences.
Learning Exponential Families in High-Dimensions:
Cited by 1 (0 self)
The versatility of exponential families, along with their attendant convexity properties, makes them a popular and effective statistical model. A central issue is learning these models in high-dimensions when the optimal parameter vector is sparse. This work characterizes a certain strong convexity property of general exponential families, which allows their generalization ability to be quantified. In particular, we show how this property can be used to analyze generic exponential families under L1 regularization.
Least Angle and L1 Regression: A Review
, 802
Abstract: Least Angle Regression is a promising technique for variable selection applications, offering a nice alternative to stepwise regression. It provides an explanation for the similar behavior of LASSO (L1-penalized
2. Function space / norm Regularizations
, 2009
• Minimize with respect to function f: X → Y: $\sum_{i=1}^{n} \ell(y_i, f(x_i))$ (error on data) $+\ \lambda$ …
BIOINFORMATICS Active Site Prediction using Evolutionary and Structural Information
, 2010
Motivation: The identification of catalytic residues is a key step in understanding the function of enzymes. While a variety of computational methods have been developed for this task, accuracies have remained fairly low. The best existing method exploits information from sequence and structure to achieve a precision (the fraction of predicted catalytic residues that are catalytic) of 18.5% at a corresponding recall (the fraction of catalytic residues identified) of 57% on a standard benchmark. Here we present a new method, DISCERN, that provides a significant improvement over the state-of-the-art through the use of statistical techniques to derive a model with a small set of features that are jointly predictive of enzyme active sites. Results: In cross-validation experiments on two benchmark datasets from the Catalytic Site Atlas and CATRES resources containing a total of 437 manually curated enzymes spanning 487 SCOP families, DISCERN increases catalytic site recall by 12-20% over methods that combine information from both sequence and structure, and by 50% or more over methods that make use of sequence conservation signal only. Controlled experiments show that DISCERN’s improvement in catalytic residue prediction is derived from the combination of three ingredients: the use of the INTREPID phylogenomic method to extract conservation information; the use of 3D structure data, including features computed for residues that are proximal in the structure; and a statistical regularization procedure to prevent overfitting.
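The precision and recall definitions quoted in the abstract above can be written out as a short calculation. This is a sketch only: the function name `precision_recall` and the residue counts are hypothetical and chosen for illustration, and they are not benchmark figures from the paper.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) from residue-level prediction counts.

    precision: fraction of predicted catalytic residues that are catalytic.
    recall:    fraction of catalytic residues that were identified.
    """
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Made-up counts: 100 residues predicted catalytic, 30 of them correct,
# and 20 true catalytic residues missed by the predictor.
precision, recall = precision_recall(
    true_positives=30, false_positives=70, false_negatives=20
)
print(f"precision={precision:.1%}, recall={recall:.1%}")
```

A predictor can trade one quantity for the other by flagging more or fewer residues, which is why the abstract reports precision at a stated recall and not either number alone.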

