Results 1 - 10 of 434
Missing value estimation methods for DNA microarrays
, 2001
Cited by 477 (24 self)
Motivation: Gene expression microarray experiments can generate data sets with multiple missing expression values. Unfortunately, many algorithms for gene expression analysis require a complete matrix of gene array values as input. For example, methods such as hierarchical clustering and K-means clustering are not robust to missing data, and may lose effectiveness even with a few missing values. Methods for imputing missing data are needed, therefore, to minimize the effect of incomplete data sets on analyses, and to increase the range of data sets to which these algorithms can be applied. In this report, we investigate automated methods for estimating missing data.
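As a point of reference for what "estimating missing data" means in practice, the following is a minimal sketch of one very simple baseline, row-average imputation, in which each missing value is replaced by the mean of the observed values for the same gene. This is an illustration under stated assumptions, not the method evaluated in the paper: the function name `impute_row_mean`, the use of NaN to mark missing entries, and the toy matrix are all hypothetical.

```python
import math

def impute_row_mean(matrix):
    """Fill NaN entries in each row (gene) with the mean of that row's observed values."""
    completed = []
    for row in matrix:
        observed = [v for v in row if not math.isnan(v)]
        # Assumption: a gene with no observed values at all falls back to 0.0.
        fill = sum(observed) / len(observed) if observed else 0.0
        completed.append([fill if math.isnan(v) else v for v in row])
    return completed

# Toy gene-by-array matrix (rows = genes, columns = arrays); values are made up.
expression = [
    [1.0, float("nan"), 3.0],
    [float("nan"), 2.0, 2.0],
]
completed = impute_row_mean(expression)
```

The completed matrix can then be passed to algorithms such as hierarchical or K-means clustering that require a full matrix as input.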
A molecular signature of metastasis in primary solid tumors.
- Nat. Genet.
, 2003
Cited by 158 (1 self)
Metastasis is the principal event leading to death in individuals with cancer, yet its molecular basis is poorly understood [1]. To explore the molecular differences between human primary tumors and metastases, we compared the gene-expression profiles of adenocarcinoma metastases of multiple tumor types to unmatched primary adenocarcinomas. We found a gene-expression signature that distinguished primary from metastatic adenocarcinomas. More notably, we found that a subset of primary tumors resembled metastatic tumors with respect to this gene-expression signature. We confirmed this finding by applying the expression signature to data on 279 primary solid tumors of diverse types. We found that solid tumors carrying the gene-expression signature were most likely to be associated with metastasis and poor clinical outcome (P < 0.03). These results suggest that the metastatic potential of human tumors is encoded in the bulk of a primary tumor, thus challenging the notion that metastases arise from rare cells within a primary tumor that have the ability to metastasize [2]. The prevailing model of metastasis holds that most primary tumor cells have low metastatic potential, but rare cells (estimated at less than one in ten million) within large primary tumors acquire metastatic capacity through somatic mutation [2]. The metastatic phenotype includes the ability to migrate from the primary tumor, survive in blood or lymphatic circulation, invade distant tissues and establish distant metastatic nodules.
This model is primarily supported by animal models in which poorly metastatic cell lines can spawn highly metastatic variants if the process is facilitated by the isolation of rare metastatic nodules, expansion of the cells in vitro and injection of these selected cells into secondary recipient mice. To study the molecular nature of metastasis, we analyzed the gene-expression profiles of 12 metastatic adenocarcinoma nodules of diverse origin (lung, breast, prostate, colorectal, uterus, ovary) and compared them with the expression profiles of 64 primary adenocarcinomas representing the same spectrum of tumor types obtained from different individuals. This comparison identified an expression pattern of 128 genes that best distinguished primary and metastatic adenocarcinomas (Fig.
A Bayesian missing value estimation method for gene expression profile data
- Bioinformatics
, 2003
Cited by 127 (2 self)
Motivation: Gene expression profile analyses have been used in numerous studies covering a broad range of areas in biology. When unreliable measurements are excluded, missing values are introduced in gene expression profiles. Although existing multivariate analysis methods have difficulty with the treatment of missing values, this problem has received little attention. There are many options for dealing with missing values, each of which reaches drastically different results. Ignoring missing values is the simplest method and is frequently applied. This approach, however, has its flaws. In this article, we propose an estimation method for missing values, which is based on Bayesian principal component analysis (BPCA). Although the methodology that a probabilistic model and latent variables are estimated simultaneously within the framework of Bayes
Gene expression patterns in human liver cancers
- Mol Biol Cell
Cited by 117 (4 self)
Hepatocellular carcinoma (HCC) is a leading cause of death worldwide. Using cDNA microarrays to characterize patterns of gene expression in HCC, we found consistent differences between the expression patterns in HCC compared with those seen in nontumor liver tissues. The expression patterns in HCC were also readily distinguished from those associated with tumors metastatic to liver. The global gene expression patterns intrinsic to each tumor were sufficiently distinctive that multiple tumor nodules from the same patient could usually be recognized and distinguished from all the others in the large sample set on the basis of their gene expression patterns alone. The distinctive gene expression patterns are characteristic of the tumors and not the patient; the expression programs seen in clonally independent tumor nodules in the same patient were no more similar than those in tumors from different patients. Moreover, clonally related tumor masses that showed distinct expression profiles were also distinguished by genotypic differences. Some features of the gene expression patterns were associated with specific phenotypic and genotypic characteristics of the tumors, including growth rate, vascular invasion, and p53 overexpression.
Missing value estimation for DNA microarray gene expression data: local least squares imputation
- Bioinformatics
, 2005
Genome-wide analysis of estrogen receptor binding sites
, 2006
Cited by 82 (12 self)
The estrogen receptor is the master transcriptional regulator of breast cancer phenotype and the archetype of a molecular therapeutic target. We mapped all estrogen receptor and RNA polymerase II binding sites on a genome-wide scale, identifying the authentic cis binding sites and target genes, in breast cancer cells. Combining this unique resource with gene expression data demonstrates distinct temporal mechanisms of estrogen-mediated gene regulation, particularly in the case of estrogen-suppressed genes. Furthermore, this resource has allowed the identification of cis-regulatory sites in previously unexplored regions of the genome and the cooperating transcription factors underlying estrogen signaling in breast cancer. Recent work has focused on identifying gene expression signatures in breast cancer subtypes that predict response to specific treatment regimes and improved disease outcome. Estrogen receptor-mediated transcription has been intensively studied on a small number of endogenous target promoters. RESULTS: The MCF-7 breast cancer cell line has been extensively used as a model of hormone-dependent breast cancer. We deprived MCF-7 cells of hormones for 3 d and then synchronously induced transcription by the addition of estrogen for a brief period of time (45 min) known to result in maximal estrogen receptor-chromatin binding
Transcriptional gene expression profiles of colorectal adenoma, adenocarcinoma, and normal tissue examined by oligonucleotide arrays
- Cancer Res
, 2001
Cited by 57 (3 self)
Adjustment of systematic microarray data biases
- Bioinformatics
, 2004
Gene Selection Using Support Vector Machines With Nonconvex Penalty
- Bioinformatics
, 2006
Cited by 53 (2 self)
Motivation: With the development of DNA microarray technology, scientists can now measure the expression levels of thousands of genes simultaneously in one single experiment. One current difficulty in interpreting microarray data comes from their innate nature of “high dimensional low sample size.” Therefore, robust and accurate gene selection methods are required to identify differentially expressed group of genes across different samples, e.g., between cancerous and normal cells. Successful gene selection will help to classify different cancer types, lead to a better understanding of genetic signatures in cancers, and improve treatment strategies. Although gene selection and cancer classification are two closely related problems, most existing approaches handle them separately by selecting genes prior to classification. We provide
Identifying distinct classes of bladder carcinoma using microarrays
- Nature Genetics
, 2003
Cited by 47 (1 self)
Bladder cancer is a common malignant disease characterized by frequent recurrences [1,2]. The stage of disease at diagnosis and the presence of surrounding carcinoma in situ are important in determining the disease course of an affected individual [3]. Despite considerable effort, no accepted immunohistological or molecular markers have been identified to define clinically relevant subsets of bladder cancer. Here we report the identification of clinically relevant subclasses of bladder carcinoma using expression microarray analysis of 40 well characterized bladder tumors. Hierarchical cluster analysis identified three major stages, Ta, T1 and T2-4, with the Ta tumors further classified into subgroups. We built a 32-gene molecular classifier using a cross-validation approach that was able to classify benign and muscle-invasive tumors with close correlation to pathological staging in an independent test set of 68 tumors. The classifier provided new predictive information on disease progression in Ta tumors compared with conventional staging (P < 0.005). To delineate non-recurring Ta tumors from frequently recurring Ta tumors, we analyzed expression patterns in 31 tumors by applying a supervised learning classification methodology, which classified 75% of the samples correctly (P < 0.006). Furthermore, gene expression profiles characterizing each stage and subtype identified their biological properties, producing new potential targets for therapy. Parallel gene-expression monitoring is a powerful tool for analyzing relationships between tumors, discovering new tumor subgroups, assigning tumors to pre-defined classes, identifying co-regulated or tumor stage-specific genes and predicting disease outcome. To reduce the number of genes needed for class prediction, we identified the 88 genes that varied most between tumor samples (s.d. ≥ 4) and that were considered to be cancer-related by the Cancer Genome Anatomy Project (CGAP; at the US National Cancer Institute).
Hierarchical clustering using only these genes was almost identical to that using the 1,767 genes. The clustering of the 1,767 genes identified several characteristic profiles that differed between the tumor groups
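The variability filter mentioned in this abstract (keeping genes whose standard deviation across tumor samples is at least 4) can be sketched roughly as follows. This is a hedged illustration only: the function name `select_variable_genes`, the dictionary layout, and the toy values are assumptions, the choice of the sample standard deviation is a guess about the convention used, and the additional CGAP cancer-relatedness criterion described above is not modeled.

```python
import statistics

def select_variable_genes(expression, min_sd):
    """Return the names of genes whose sample s.d. across tumors is >= min_sd."""
    return [
        gene
        for gene, values in expression.items()
        # statistics.stdev computes the sample (n - 1) standard deviation.
        if statistics.stdev(values) >= min_sd
    ]

# Toy data: gene name -> expression values across four tumor samples (made up).
toy = {
    "geneA": [0.0, 10.0, 0.0, 10.0],   # highly variable across samples
    "geneB": [5.0, 5.5, 5.0, 5.5],     # nearly constant across samples
}
selected = select_variable_genes(toy, min_sd=4.0)
```

A reduced gene list of this kind could then be used as the input to hierarchical clustering in place of the full gene set.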