Results 11-20 of 477
A new approach to analyzing gene expression time series data
, 2002
Cited by 91 (5 self)
Principled methods for estimating unobserved timepoints, clustering, and aligning microarray gene expression time series are needed to make such data useful for detailed analysis. Datasets measuring the temporal behavior of thousands of genes offer rich opportunities for computational biologists. For example, Dynamic Bayesian Networks may be used to build models and try to understand how genetic responses unfold. However, such modeling frameworks need a sufficient quantity of data in the appropriate format. Current gene expression time series data often do not meet these requirements, since they may be missing data points, sampled nonuniformly, and measure biological processes that exhibit temporal variation.
Tensor Completion for Estimating Missing Values in Visual Data
Cited by 84 (4 self)
In this paper we propose an algorithm to estimate missing values in tensors of visual data. The values can be missing due to problems in the acquisition process, or because the user manually identified unwanted outliers. Our algorithm works even with a small number of samples and it can propagate structure to fill larger missing regions. Our methodology is built on recent studies about matrix completion using the matrix trace norm. The contribution of our paper is to extend the matrix case to the tensor case by laying out the theoretical foundations and then by building a working algorithm. First, we propose a definition for the tensor trace norm that generalizes the established definition of the matrix trace norm. Second, similar to matrix completion, the tensor completion is formulated as a convex optimization problem. Unfortunately, the straightforward problem extension is significantly harder to solve than the matrix case because of the dependency among multiple constraints. To tackle this problem, we employ a relaxation technique to separate the dependent relationships and use the block coordinate descent (BCD) method to achieve a globally optimal solution. Our experiments show potential applications of our algorithm and the quantitative evaluation indicates that our method is more accurate and robust than heuristic approaches.
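As a rough illustration of the matrix case that this abstract builds on, the sketch below fills the missing entries of a matrix by repeatedly shrinking singular values (the proximal step associated with the trace norm) and then restoring the observed entries. It is not the authors' tensor algorithm or their BCD solver; the function names, the threshold `tau`, and the iteration count are assumptions made for illustration, and NumPy is assumed to be installed.

```python
import numpy as np


def shrink_singular_values(M, tau):
    """Soft-threshold the singular values of M by tau (trace-norm proximal step)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    # Scale each left singular vector (column of U) by its shrunken singular value.
    return (U * np.maximum(s - tau, 0.0)) @ Vt


def complete_matrix(X, observed, tau=1.0, n_iters=200):
    """Estimate missing entries of X; `observed` is a boolean mask of known entries.

    Hypothetical helper: alternates a singular-value shrinkage step with
    re-imposing the known entries, so observed values are never changed.
    """
    Z = np.where(observed, X, 0.0)  # start with zeros in the missing positions
    for _ in range(n_iters):
        low_rank = shrink_singular_values(Z, tau)
        Z = np.where(observed, X, low_rank)
    return Z


if __name__ == "__main__":
    # Rank-1 example matrix with one entry hidden from the algorithm.
    truth = np.outer(np.arange(1.0, 7.0), np.arange(1.0, 7.0))
    mask = np.ones_like(truth, dtype=bool)
    mask[1, 2] = False
    estimate = complete_matrix(truth, mask)
    print("true value:", truth[1, 2], "estimate:", estimate[1, 2])
```

A larger `tau` pushes the estimate toward lower rank but biases the filled values toward zero, so in practice it would need tuning for the data at hand.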
Learning with Matrix Factorization
, 2004
Cited by 71 (6 self)
Matrices that can be factored into a product of two simpler matrices can serve as a useful and often natural model in the analysis of tabulated or high-dimensional data. Models based on matrix factorization (Factor Analysis, PCA) have been extensively used in statistical analysis and machine learning for over a century, with many new formulations and models suggested in recent
The Stanford Microarray Database: implementation of new analysis tools and open source release of software
 Matese JC, Nitzberg M, Wymore F, Zachariah ZK, Brown PO, Sherlock G, Ball CA
"... doi:10.1093/nar/gkl1019 ..."
Statistical strategies for avoiding false discoveries in metabolomics and related experiments
, 2006
Cited by 61 (11 self)
Many metabolomics, and other high-content or high-throughput, experiments are set up such that the primary aim is the discovery of biomarker metabolites that can discriminate, with a certain level of certainty, between nominally matched ‘case’ and ‘control’ samples. However, it is unfortunately very easy to find markers that are apparently persuasive but that are in fact entirely spurious, and there are well-known examples in the proteomics literature. The main types of danger are not entirely independent of each other, but include bias, inadequate sample size (especially relative to the number of metabolite variables and to the required statistical power to prove that a biomarker is discriminant), excessive false discovery rate due to multiple hypothesis testing, inappropriate choice of particular numerical methods, and overfitting (generally caused by the failure to perform adequate validation and cross-validation). Many studies fail to take these into account, and thereby fail to discover anything of true significance (despite their claims). We summarise these problems, and provide pointers to a substantial existing literature that should assist in the improved design and evaluation of metabolomics experiments, thereby allowing robust scientific conclusions to be drawn from the available data. We provide a list of some of the simpler checks that might improve one’s confidence that a candidate biomarker is not simply a statistical artefact, and suggest a series of preferred tests and visualisation tools that can assist readers and authors in assessing papers. These tools can be applied to individual metabolites by using multiple univariate tests performed in parallel across all metabolite peaks. They may also be applied to the validation of multivariate models. We stress in
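To make the multiple-testing concern in this abstract concrete, here is a minimal sketch of the Benjamini-Hochberg step-up procedure, one standard way to control the false discovery rate when many univariate tests are run in parallel. The abstract does not say which correction the authors prefer, so this choice is an assumption; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the hypothesis is rejected.

    Benjamini-Hochberg: sort the m p-values, find the largest rank k with
    p_(k) <= (k / m) * alpha, and reject the k smallest p-values.
    """
    m = len(p_values)
    # Indices of the p-values in ascending order of p.
    order = sorted(range(m), key=lambda i: p_values[i])

    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank  # keep the largest rank that passes

    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical p-values, one per metabolite peak.
    peaks = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(peaks, alpha=0.05))
```

Because the procedure is step-up, a p-value that fails its own threshold can still be rejected when a larger p-value further down the sorted list passes its threshold.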
DFS: A file system for virtualized flash storage
 In FAST’10: Proc. of the Eighth USENIX Conf. on File and Storage Technologies (2010), USENIX Association
Cited by 57 (2 self)
This paper presents the design, implementation and evaluation of Direct File System (DFS) for virtualized flash storage. Instead of using traditional layers of abstraction, our layers of abstraction are designed for directly accessing flash memory devices. DFS has two main novel features. First, it lays out its files directly in a very large virtual storage address space provided by FusionIO’s virtual flash storage layer. Second, it leverages the virtual flash storage layer to perform block allocations and atomic updates. As a result, DFS performs better and is much simpler than a traditional Unix file system with similar functionality. Our microbenchmark results show that DFS can deliver 94,000 I/O operations per second (IOPS) for direct reads and 71,000 IOPS for direct writes with the virtualized flash storage layer on FusionIO’s ioDrive. For direct access performance, DFS is consistently better than ext3 on the same platform, sometimes by 20%. For buffered access performance, DFS is also consistently better than ext3, sometimes by over 149%. Our application benchmarks show that DFS outperforms ext3 by 7% to 250% while requiring less CPU power.
Adjustment of systematic microarray data biases
 Bioinformatics
, 2004
Expression Profiler: next generation–an online platform for analysis of microarray data
 Nucleic Acids Res
, 2004
"... Expression Profiler (EP, ..."