## CODA: High dimensional copula discriminant analysis (2012)

Venue: | Journal of Machine Learning Research |

Citations: | 9 - 3 self |

### Citations

1549 | The Elements of
- Hastie, Tibshirani, et al.
- 2009
Citation Context ...to $n_0/n$. We have $n_0 \sim \mathrm{Binomial}(n, \pi_0)$ and $n_1 \sim \mathrm{Binomial}(n, \pi_1)$. In the sequel, without loss of generality, we assume that $\pi_0 = \pi_1 = 1/2$. The extension to the case where $\pi_0 \neq \pi_1$ is straightforward (Hastie et al., 2001). Define $\hat\mu_0 = \frac{1}{n_0} \sum_{i: y_i = -n_1/n} x_i$, $\hat\mu_1 = \frac{1}{n_1} \sum_{i: y_i = n_0/n} x_i$, $\hat\mu_d = \hat\mu_1 - \hat\mu_0$, $\hat\mu = \frac{1}{n} \sum_i x_i$, $S_0 = \frac{1}{n_0} \sum_{i: y_i = -n_1/n} (x_i - \hat\mu_0)(x_i - \hat\mu_0)^T$, $S_1 = \frac{1}{n_1} \sum_{i: y_i = n_0/n} (x_i - \hat\mu_1)(x_i - \hat\mu_1)^T$, $S_b = \frac{1}{n} \sum^{1}_{i}$... |

865 | The Dantzig selector: statistical estimation when p is much larger than n - Candès, Tao |

722 | Regularization paths for generalized linear models via coordinate descent - Friedman, Hastie, et al. |

650 | Diagnosis of multiple cancer types by shrunken centroids of gene expression - Tibshirani - 2002 |

626 | Empirical Processes with Applications to Statistics - Shorack, Wellner - 1986 |

592 | Sparse inverse covariance estimation with the graphical lasso - Friedman, Hastie, et al. |

473 | On model selection consistency of Lasso - Zhao, Yu |

220 | Optimal aggregation of classifiers in statistical learning - Tsybakov - 2004 |

214 | Discriminant analysis by gaussian mixtures
- Hastie, Tibshirani
- 1996
Citation Context ...riance estimator to classification (Fan et al., 2010; Shao et al., 2011; Cai and Liu, 2012; Witten and Tibshirani, 2011; Mai et al., 2012); (3) How to deal with non-Gaussian data (Lin and Jeon, 2003; Hastie and Tibshirani, 1996). In this paper, we propose a high dimensional classification method, named the Copula Discriminant Analysis (CODA), which addresses all the above three questions. To handle non-Gaussian data, we ext... |

163 | Sparse permutation invariant covariance estimation. - Rothman, Bickel, et al. - 2008 |

159 | Least angle regression. Annals of Statistics - Efron, Hastie, et al. - 2004 |

128 | Some theory for Fisher's linear discriminant function, 'naive Bayes', and some alternatives when there are many more variables than observations - Bickel, Levina |

120 |
Asymptotic minimax character of the sample distribution function and of the classical multinomial estimator
- Dvoretzky, Kiefer, et al.
- 1956
Citation Context ...$'(\xi_n) \le C (\Phi^{-1})'\big(F_j(g_j(\sqrt{\beta_i \log n}))\big) = \frac{C}{\phi(\sqrt{\beta_i \log n})} \le c_1 n^{\beta_i/2}$, (31) where $C > 1$ and $c_1$ are generic constants. Specifically, when $i = 0$, using the Dvoretzky-Kiefer-Wolfowitz inequality (Massart, 1990; Dvoretzky et al., 1956), from Equation (29), we have $\sup_{t \in I_{0n}} \lvert \Phi^{-1}(\tilde F_j(t)) - \Phi^{-1}(F_j(t)) \rvert = O_P\left(\sqrt{\frac{\log\log n}{n^{1-\alpha}}}\right)$. For any $i \in \{1, \ldots, \kappa\}$, using Lemma 21, for large enough $n$, $\sup_{t \in I_{in}} \lvert \tilde F_j(t) - F_j(t) \rvert = O_P$(√ log l... |

117 | The tight constant in the Dvoretzky-Kiefer-Wolfowitz inequality - Massart - 1990 |

114 | Capturing Heterogeneity in Gene Expression Studies by Surrogate Variable Analysis - Leek, Storey - 2007 |

107 | Ordinal measures of association. - Kruskal - 1958 |

92 | The nonparanormal: Semiparametric estimation of high dimensional undirected graphs. - Liu, Lafferty, et al. - 2009 |

79 | High-dimensional classification using features annealed independence - Fan, Fan |

60 | Sparse inverse covariance selection via alternating linearization methods. - Scheinberg, Ma, et al. - 2010 |

58 |
Correlations and Copulas for Decision and Risk Analysis
- Clemen, Reilly
- 1999
Citation Context ...$(X_j) = E(f_j(X_j)) = \mu_j$; $\mathrm{Var}(X_j) = \mathrm{Var}(f_j(X_j)) = \sigma_j^2$. In summary, we denote by such $X \sim \mathrm{NPN}(\mu, \Sigma, f)$. Liu et al. (2009) prove that the nonparanormal is highly related to the Gaussian Copula (Clemen and Reilly, 1999; Klaassen and Wellner, 1997). 2.2 Correlation Matrix and Transformation Functions Estimations. Liu et al. (2009) suggest a normal-score based correlation coefficient m... |

57 |
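Introduction to Linear Algebra. Wellesley-Cambridge Press, 3rd edition
- Strang
- 1998
Citation Context ...erefore $\beta^{**}$ is also sparse, and hence $\beta^{**}_S = (C_{SS})^{-1} (\mu_1 - \mu_0)_S$. Lemma 6. Let $\beta^* = \Sigma^{-1} \mu_d$. $\beta^{**}$ is proportional to $\beta^*$. Especially, we have $\beta^{**} = \frac{4 \beta^*}{4 + \mu_d^T \Sigma^{-1} \mu_d}$. Proof. Using the Binomial inverse theorem (Strang, 2003), we have $\beta^{**} = \left(\Sigma + \frac{1}{4} \mu_d \mu_d^T\right)^{-1} \mu_d = \left(\Sigma^{-1} - \frac{\frac{1}{4} \Sigma^{-1} \mu_d \mu_d^T \Sigma^{-1}}{1 + \frac{1}{4} \mu_d^T \Sigma^{-1} \mu_d}\right) \mu_d = \Sigma^{-1} \mu_d - \frac{\frac{1}{4} \Sigma^{-1} \mu_d (\mu_d^T \Sigma^{-1} \mu_d)}{1 + \frac{1}{4} \mu_d^T \Sigma^{-1} \mu_d} = \left(1 - \frac{\mu_d^T \Sigma^{-1} \mu_d}{4 + \mu_d^T \Sigma^{-1} \mu_d}\right) \Sigma^{-1} \mu_d = \frac{4 \beta^*}{4 + \mu_d^T \Sigma^{-1} \mu_d}$. Th... |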

55 | High dimensional inverse covariance matrix estimation via linear programming. - Yuan - 2010 |

51 | High-dimensional semiparametric Gaussian copula graphical models.
- Liu, Han, et al.
- 2012
Citation Context ...ing the Spearman’s rho and Kendall’s tau, which are invariant to the strictly increasing functions $f_j$. They have been shown to enjoy the optimal parametric rate in estimating the correlation matrix (Liu et al., 2012; Xue and Zou, 2012). Unlike previous analysis, a new contribution of this paper is that we provide an extra condition on the transformation functions which guarantees the fast rates of convergence of... |

47 |
Numerical Optimization, 2nd ed
- Nocedal, Wright
- 2006
Citation Context ...an et al., 2009, 2010) or lars (Efron et al., 2004) can be applied. When $\nu$ goes to infinity, the Equation (13) reduces to the ROAD, which can be efficiently solved by the augmented Lagrangian method (Nocedal and Wright, 2006). More specifically, we define the augmented Lagrangian function: $L(\beta, u) = \frac{1}{2} \beta^T \hat{S} \beta + \lambda \lVert \beta \rVert_1 + \nu u (\hat\mu^T \beta - 1) + \frac{\nu}{2} (\hat\mu^T \beta - 1)^2$, where $u \in \mathbb{R}$ is the rescaled Lagrangian multiplier and $\nu > 0$ is the augmented Lagran... |

47 | Functional network organization of the human brain. Neuron 72: 665–678 - Power, Cohen, et al. - 2011 |

44 | Hybrid huberized support vector machines for microarray classification and gene selection - Wang, Zhu, et al. - 2008 |

43 | Model selection in gaussian graphical models: High-dimensional consistency of l1-regularized mle.
- Ravikumar, Raskutti, et al.
- 2008
Citation Context ...plications. There are three issues with regard to high dimensional linear discriminant analysis: (1) How to estimate $\Sigma$ and $\Sigma^{-1}$ accurately and efficiently (Rothman et al., 2008; Friedman et al., 2007; Ravikumar et al., 2009; Scheinberg et al., 2010); (2) How to incorporate the covariance estimator to classification (Fan et al., 2010; Shao et al., 2011; Cai and Liu, 2012; Witten and Tibshirani, 2011; Mai et al., 2012); (... |

42 | A constrained l1 minimization approach to sparse precision matrix estimation - Cai, Liu, et al. |

40 | Efficient estimation in the bivariate normal copula model: normal margins are least favourable
- Klaassen, Wellner
- 1997
Citation Context ...$_j$; $\mathrm{Var}(X_j) = \mathrm{Var}(f_j(X_j)) = \sigma_j^2$. In summary, we denote by such $X \sim \mathrm{NPN}(\mu, \Sigma, f)$. Liu et al. (2009) prove that the nonparanormal is highly related to the Gaussian Copula (Clemen and Reilly, 1999; Klaassen and Wellner, 1997). 2.2 Correlation Matrix and Transformation Functions Estimations. Liu et al. (2009) suggest a normal-score based correlation coefficient matrix to estimate Σ0. More s... |

36 | Frozen robust multiarray analysis (frma) - McCall, Bolstad, et al. - 2010 |

34 | Penalized classification using fisher’s linear discriminant.
- Witten, Tibshirani
- 2011
Citation Context ...08; Friedman et al., 2007; Ravikumar et al., 2009; Scheinberg et al., 2010); (2) How to incorporate the covariance estimator to classification (Fan et al., 2010; Shao et al., 2011; Cai and Liu, 2012; Witten and Tibshirani, 2011; Mai et al., 2012); (3) How to deal with non-Gaussian data (Lin and Jeon, 2003; Hastie and Tibshirani, 1996). In this paper, we propose a high dimensional classification method, named the Copula Disc... |

32 | glmnet: Lasso and elastic-net regularized generalized linear models, 2013. URL http://CRAN.R-project.org/package=glmnet. R package version - Friedman, Hastie, et al. |

32 | Regularized rank-based estimation of high-dimensional nonparanormal graphical models.
- Xue, Zou
- 2012
Citation Context ... rho and Kendall’s tau, which are invariant to the strictly increasing functions $f_j$. They have been shown to enjoy the optimal parametric rate in estimating the correlation matrix (Liu et al., 2012; Xue and Zou, 2012). Unlike previous analysis, a new contribution of this paper is that we provide an extra condition on the transformation functions which guarantees the fast rates of convergence of the marginal mean ... |

31 | A direct estimation approach to sparse linear discriminant analysis. - Cai, Liu - 2011 |

24 | Sparse linear discriminant analysis by thresholding for high dimensional data. The Annals of Statistics
- Shao, Wang, et al.
- 2011
Citation Context ... are based on a working independence assumption. Recently, numerous alternative approaches have been proposed by taking more complex covariance matrix structures into consideration (Fan et al., 2010; Shao et al., 2011; Cai and Liu, 2012; Mai et al., 2012). ©2013 Fang Han, Tuo Zhao and Han Liu. A binary classification problem can be formulated as follows: suppose that we have a training set $\{(x_i, y_i), i = 1, \ldots, n\}$ independently draw... |

21 | Cyclin D1/bcl-1 cooperates with myc genes in the generation of B-cell lymphoma in transgenic mice. - Lovec, Grzeschiczek, et al. - 1994 |

21 | The huge package for highdimensional undirected graph estimation in R. - Zhao, Liu, et al. - 2012 |

19 |
A direct approach to sparse discriminant analysis in ultra-high dimensions.
- Mai, Zou, et al.
- 2012
Citation Context ...ssumption. Recently, numerous alternative approaches have been proposed by taking more complex covariance matrix structures into consideration (Fan et al., 2010; Shao et al., 2011; Cai and Liu, 2012; Mai et al., 2012). A binary classification problem can be formulated as follows: suppose that we have a training set $\{(x_i, y_i), i = 1, \ldots, n\}$ independently draw... |

19 | Sharp thresholds for high-dimensional and noisy sparsity recovery using ℓ1-constrained quadratic programming (Lasso) - Wainwright - 2009 |

18 | Improved centroids estimation for the nearest shrunken centroid classifier - Wang, Zhu - 2007 |

15 | Confirmation of the Molecular Classification of Diffuse Large B-Cell Lymphoma by Immunohistochemistry Using a Tissue Microarray - Hans, Weissenburger, et al. |

14 | Fast algorithms for the calculation of Kendall’s τ. - Christensen - 2005 |

8 | Discriminant analysis through a semi-parametric model.
- Lin, Jeon
- 2003
Citation Context ...incorporate the covariance estimator to classification (Fan et al., 2010; Shao et al., 2011; Cai and Liu, 2012; Witten and Tibshirani, 2011; Mai et al., 2012); (3) How to deal with non-Gaussian data (Lin and Jeon, 2003; Hastie and Tibshirani, 1996). In this paper, we propose a high dimensional classification method, named the Copula Discriminant Analysis (CODA), which addresses all the above three questions. To han... |

6 | ESVM: Evolutionary support vector machine for automatic feature selection and classification of microarray data - Huang, Chang |

3 |
Automated diagnoses of attention deficit hyperactive disorder using magnetic resonance imaging
- Eloyan, Muschelli, et al.
- 2012
Citation Context ...thods on the testing data set based on 1,000 replications. 5.3 Brain Imaging Data. In this section we investigate the performance of several methods on a brain imaging data set, the ADHD 200 data set (Eloyan et al., 2012). The ADHD 200 data set is a landmark study compiling over 1,000 functional and structural scans including subjects with and without attention deficit hyperactive disorder (ADHD). The current release... |

3 | Cell-type independent myc target genes reveal a primordial signature involved in biomass accumulation. PloS One - Ji, Wu, et al. |

3 | Antisense c-myc and immunostimulatory oligonucleotide inhibition of tumorigenesis in a murine b-cell lymphoma transplant model
- Smith, Wickstrom
- 1998
Citation Context ...best overall performance. Some biological discoveries have also been verified in this process. For example, the MYC gene has been discovered to be relevant to the b cell lymphoma (Lovec et al., 1994; Smith and Wickstrom, 1998) and has recently been found to be associated with the Wilms tumor (Ji et al., 2011). This gene is also constantly selected by the CODA methods in classifying b cell lymphoma and Wilms tumor with the... |

2 |
A road to classification in high dimensional space. Arxiv preprint arXiv:1011.6095
- Fan, Feng, et al.
- 2010
Citation Context ...an and Fan, 2008), are based on a working independence assumption. Recently, numerous alternative approaches have been proposed by taking more complex covariance matrix structures into consideration (Fan et al., 2010; Shao et al., 2011; Cai and Liu, 2012; Mai et al., 2012). A binary classification problem can be formulated as follows: suppose that we have a... |

1 | Empirical Processes in M-estimation, volume 105. Cambridge University Press - van de Geer - 2000 |
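
The citation context quoted for Strang above states Lemma 6: with β* = Σ⁻¹µd, the vector β** = (Σ + ¼ µd µdᵀ)⁻¹ µd equals 4β* / (4 + µdᵀ Σ⁻¹ µd). The following is a minimal numerical sketch of that identity, not code from the paper. The matrix `sigma`, the vector `mu_d`, and the helper `solve` are hypothetical values and names chosen only for illustration.

```python
# Numerical check of the proportionality stated in Lemma 6.
# All numbers below are illustrative assumptions, not data from the paper.


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left unchanged.
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


# Hypothetical symmetric positive definite covariance and mean difference.
sigma = [[2.0, 0.5, 0.0],
         [0.5, 1.0, 0.3],
         [0.0, 0.3, 1.5]]
mu_d = [1.0, -2.0, 0.5]
dim = len(mu_d)

# beta* = Sigma^{-1} mu_d
beta_star = solve(sigma, mu_d)

# beta** = (Sigma + (1/4) mu_d mu_d^T)^{-1} mu_d, computed directly.
shifted = [[sigma[i][j] + 0.25 * mu_d[i] * mu_d[j] for j in range(dim)]
           for i in range(dim)]
beta_star_star = solve(shifted, mu_d)

# Lemma 6 predicts beta** = 4 beta* / (4 + mu_d^T Sigma^{-1} mu_d).
quad = sum(u * v for u, v in zip(mu_d, beta_star))
predicted = [4.0 * v / (4.0 + quad) for v in beta_star]

# Largest componentwise difference between the direct and predicted vectors.
max_gap = max(abs(p - q) for p, q in zip(predicted, beta_star_star))
print("max componentwise gap:", max_gap)
```

The direct solve and the closed form should agree up to floating-point rounding, which is the rank-one update identity that the proof attributes to the Binomial inverse theorem.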