Results 1 - 10
of
73,931
Implementing data cubes efficiently
- In SIGMOD
, 1996
"... Decision support applications involve complex queries on very large databases. Since response times should be small, query optimization is critical. Users typically view the data as multidimensional data cubes. Each cell of the data cube is a view consisting of an aggregation of interest, like total ..."
Abstract
-
Cited by 548 (1 self)
- Add to MetaCart
Decision support applications involve complex queries on very large databases. Since response times should be small, query optimization is critical. Users typically view the data as multidimensional data cubes. Each cell of the data cube is a view consisting of an aggregation of interest, like
From Data Mining to Knowledge Discovery in Databases.
- AI Magazine,
, 1996
"... ■ Data mining and knowledge discovery in databases have been attracting a significant amount of research, industry, and media attention of late. What is all the excitement about? This article provides an overview of this emerging field, clarifying how data mining and knowledge discovery in database ..."
Abstract
-
Cited by 538 (0 self)
- Add to MetaCart
in databases are related both to each other and to related fields, such as machine learning, statistics, and databases. The article mentions particular real-world applications, specific data-mining techniques, challenges involved in real-world applications of knowledge discovery, and current and future
Exploration, normalization, and summaries of high density oligonucleotide array probe level data.
- Biostatistics,
, 2003
"... SUMMARY In this paper we report exploratory analyses of high-density oligonucleotide array data from the Affymetrix GeneChip R system with the objective of improving upon currently used measures of gene expression. Our analyses make use of three data sets: a small experimental study consisting of f ..."
Abstract
-
Cited by 854 (33 self)
- Add to MetaCart
of five MGU74A mouse GeneChip R arrays, part of the data from an extensive spike-in study conducted by Gene Logic and Wyeth's Genetics Institute involving 95 HG-U95A human GeneChip R arrays; and part of a dilution study conducted by Gene Logic involving 75 HG-U95A GeneChip R arrays. We display some
Wide-area cooperative storage with CFS
, 2001
"... The Cooperative File System (CFS) is a new peer-to-peer readonly storage system that provides provable guarantees for the efficiency, robustness, and load-balance of file storage and retrieval. CFS does this with a completely decentralized architecture that can scale to large systems. CFS servers pr ..."
Abstract
-
Cited by 999 (53 self)
- Add to MetaCart
that CFS is scalable: with 4,096 servers, looking up a block of data involves contacting only seven servers. The tests also demonstrate nearly perfect robustness and unimpaired performance even when as many as half the servers fail.
A Security Architecture for Computational Grids
, 1998
"... State-of-the-art and emerging scientific applications require fast access to large quantities of data and commensurately fast computational resources. Both resources and data are often distributed in a wide-area network with components administered locally and independently. Computations may involve ..."
Abstract
-
Cited by 568 (47 self)
- Add to MetaCart
State-of-the-art and emerging scientific applications require fast access to large quantities of data and commensurately fast computational resources. Both resources and data are often distributed in a wide-area network with components administered locally and independently. Computations may
Multivariate adaptive regression splines
- The Annals of Statistics
, 1991
"... A new method is presented for flexible regression modeling of high dimensional data. The model takes the form of an expansion in product spline basis functions, where the number of basis functions as well as the parameters associated with each one (product degree and knot locations) are automaticall ..."
Abstract
-
Cited by 700 (2 self)
- Add to MetaCart
A new method is presented for flexible regression modeling of high dimensional data. The model takes the form of an expansion in product spline basis functions, where the number of basis functions as well as the parameters associated with each one (product degree and knot locations
Base-calling of automated sequencer traces using phred. I. Accuracy Assessment
- GENOME RES
, 1998
"... The availability of massive amounts of DNA sequence information has begun to revolutionize the practice of biology. As a result, current large-scale sequencing output, while impressive, is not adequate to keep pace with growing demand and, in particular, is far short of what will be required to obta ..."
Abstract
-
Cited by 1653 (4 self)
- Add to MetaCart
to obtain the 3-billion-base human genome sequence by the target date of 2005. To reach this goal, improved automation will be essential, and it is particularly important that human involvement in sequence data processing be significantly reduced or eliminated. Progress in this respect will require both
A Sequential Algorithm for Training Text Classifiers
, 1994
"... The ability to cheaply train text classifiers is critical to their use in information retrieval, content analysis, natural language processing, and other tasks involving data which is partly or fully textual. An algorithm for sequential sampling during machine learning of statistical classifiers was ..."
Abstract
-
Cited by 631 (10 self)
- Add to MetaCart
The ability to cheaply train text classifiers is critical to their use in information retrieval, content analysis, natural language processing, and other tasks involving data which is partly or fully textual. An algorithm for sequential sampling during machine learning of statistical classifiers
A Simple Estimator of Cointegrating Vectors in Higher Order Cointegrated Systems
- ECONOMETRICA
, 1993
"... Efficient estimators of cointegrating vectors are presented for systems involving deterministic components and variables of differing, higher orders of integration. The estimators are computed using GLS or OLS, and Wald Statistics constructed from these estimators have asymptotic x2 distributions. T ..."
Abstract
-
Cited by 524 (3 self)
- Add to MetaCart
Efficient estimators of cointegrating vectors are presented for systems involving deterministic components and variables of differing, higher orders of integration. The estimators are computed using GLS or OLS, and Wald Statistics constructed from these estimators have asymptotic x2 distributions
Learning in graphical models
- STATISTICAL SCIENCE
, 2004
"... Statistical applications in fields such as bioinformatics, information retrieval, speech processing, image processing and communications often involve large-scale models in which thousands or millions of random variables are linked in complex ways. Graphical models provide a general methodology for ..."
Abstract
-
Cited by 806 (10 self)
- Add to MetaCart
Statistical applications in fields such as bioinformatics, information retrieval, speech processing, image processing and communications often involve large-scale models in which thousands or millions of random variables are linked in complex ways. Graphical models provide a general methodology
Results 1 - 10
of
73,931