  GENE SELECTION USING LOGISTIC REGRESSIONS BASED ON AIC, BIC AND MDL CRITERIA

by Xiaobo Zhou, Xiaodong Wang, Edward R. Dougherty
http://gsplab.tamu.edu/Publications/PDFpapers/pap_NMNC_selection.pdf

Abstract:

In microarray-based cancer classification, gene selection is an important issue owing to the large number of variables (gene expressions) and the small number of experimental conditions. Many gene-selection and classification methods have been proposed; however, most of these treat gene selection and classification separately rather than under the same model. We propose a Bayesian approach to gene selection using the logistic regression model. The Akaike information criterion (AIC), the Bayesian information criterion (BIC) and the minimum description length (MDL) principle are used in constructing the posterior distribution of the chosen genes. The same logistic regression model is then used for cancer classification. Fast implementation issues for these methods are discussed. The proposed methods are tested on several data sets, including those arising from hereditary breast cancer, small round blue-cell tumors, lymphoma, and acute leukemia. The experimental results indicate that the proposed methods achieve high classification accuracy on these data sets. Some robustness and sensitivity properties of the proposed methods are also discussed. Finally, mixing logistic-regression-based gene selection with other classification methods, and mixing logistic-regression-based classification with other gene-selection methods, are considered.
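As a rough illustration of the idea in the abstract, the sketch below scores candidate genes by fitting a one-gene logistic regression and computing the standard AIC and BIC from its log-likelihood. It is not the authors' implementation: the toy expression values, the function names (`fit_logistic`, `log_likelihood`), the plain gradient-ascent fit, and the weighting of genes by exp(-BIC/2) are all assumptions made for illustration, and the MDL criterion is left out because the abstract does not give its form.

```python
import math

def fit_logistic(X, y, steps=5000, lr=0.1):
    """Fit logistic regression by plain gradient ascent (hypothetical helper).
    X holds one row per sample; w[0] is the intercept."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(steps):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            err = yi - 1.0 / (1.0 + math.exp(-z))  # label minus predicted probability
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * gj / n for wj, gj in zip(w, grad)]
    return w

def log_likelihood(w, X, y):
    """Bernoulli log-likelihood of the labels under the fitted model."""
    ll = 0.0
    for xi, yi in zip(X, y):
        z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
        ll -= math.log1p(math.exp(-z)) if yi == 1 else math.log1p(math.exp(z))
    return ll

def aic(ll, k):
    return 2 * k - 2 * ll            # Akaike information criterion

def bic(ll, k, n):
    return k * math.log(n) - 2 * ll  # Bayesian (Schwarz) information criterion

# Toy data (made up): 8 samples, 3 genes; only gene 0 tracks the class label.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
genes = {
    0: [-1.2, -0.8, -0.5, 0.3, -0.2, 0.6, 0.9, 1.4],
    1: [0.1, -0.4, 0.5, -0.2, 0.2, -0.5, 0.4, -0.1],
    2: [0.3, 0.0, -0.3, 0.2, -0.1, 0.4, 0.0, -0.1],
}

scores = {}
for g, values in genes.items():
    X = [[v] for v in values]
    w = fit_logistic(X, labels)
    ll = log_likelihood(w, X, labels)
    k = len(w)  # number of fitted parameters: intercept plus one coefficient
    scores[g] = (aic(ll, k), bic(ll, k, len(labels)))

# Assumption: weight each candidate by exp(-BIC/2), normalised to sum to one.
best = min(b for _, b in scores.values())
raw = {g: math.exp(-0.5 * (b - best)) for g, (_, b) in scores.items()}
total = sum(raw.values())
weights = {g: r / total for g, r in raw.items()}

for g in sorted(scores):
    print(g, scores[g], weights[g])
```

Lower AIC or BIC indicates a better trade-off between fit and model size, so the informative gene should receive the largest weight here. A realistic setting would score multi-gene subsets and search over them rather than enumerating single genes.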
