A sequential importance sampling algorithm for generating random graphs with prescribed degrees
, 2006
Cited by 14 (0 self)
Random graphs with a given degree sequence are a useful model capturing several features absent in the classical Erdős-Rényi model, such as dependent edges and non-binomial degrees. In this paper, we use a characterization due to Erdős and Gallai to develop a sequential algorithm for generating a random labeled graph with a given degree sequence. The algorithm is easy to implement and allows surprisingly efficient sequential importance sampling. Applications are given, including simulating a biological network and estimating the number of graphs with a given degree sequence.
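The Erdős-Gallai characterization mentioned in this abstract can be illustrated with a short check of whether a degree sequence is realizable by a simple graph. This is only a minimal sketch of the standard Erdős-Gallai inequalities, not the paper's sequential sampling algorithm; the function name `is_graphical` is a hypothetical choice made for illustration.

```python
def is_graphical(degrees):
    """Return True if `degrees` can be the degree sequence of a simple graph.

    Uses the Erdős-Gallai condition: with d sorted in non-increasing order,
    the degree sum must be even and, for every k = 1..n,
        d_1 + ... + d_k <= k(k - 1) + sum_{i > k} min(d_i, k).
    """
    d = sorted(degrees, reverse=True)
    n = len(d)
    # Negative degrees or an odd degree sum can never be realized.
    if any(x < 0 for x in d) or sum(d) % 2 != 0:
        return False
    for k in range(1, n + 1):
        lhs = sum(d[:k])
        rhs = k * (k - 1) + sum(min(x, k) for x in d[k:])
        if lhs > rhs:
            return False
    return True
```

For example, `is_graphical([3, 3, 3, 3])` accepts the degree sequence of the complete graph on four vertices, while `is_graphical([3, 3, 1, 1])` rejects a sequence that fails the inequality at k = 2.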
A toolbox for k-centroids cluster analysis
- Computational Statistics and Data Analysis
, 2006
Cited by 14 (7 self)
A methodological and computational framework for centroid-based partitioning cluster analysis using arbitrary distance or similarity measures is presented. The power of high-level statistical computing environments like R enables data analysts to easily try out various distance measures with only minimal programming effort. A new variant of centroid neighborhood graphs is introduced which gives insight into the relationships between adjacent clusters. Artificial examples and a case study from marketing research are used to demonstrate the influence of distance measures on partitions and the usage of neighborhood graphs.
Time series analysis via mechanistic models
- In review; pre-published at arxiv.org/abs/0802.0021
, 2008
Cited by 12 (4 self)
The purpose of time series analysis via mechanistic models is to reconcile the known or hypothesized structure of a dynamical system with observations collected over time. We develop a framework for constructing nonlinear mechanistic models and carrying out inference. Our framework permits the consideration of implicit dynamic models, meaning statistical models for stochastic dynamical systems which are specified by a simulation algorithm to generate sample paths. Inference procedures that operate on implicit models are said to have the plug-and-play property. Our work builds on recently developed plug-and-play inference methodology for partially observed Markov models. We introduce a class of implicitly specified Markov chains with stochastic transition rates, and we demonstrate its applicability to open problems in statistical inference for biological systems. As one example, these models are shown to give a fresh perspective on measles transmission dynamics. As a second example, we present a mechanistic analysis of cholera incidence data, involving interaction between two competing strains of the pathogen Vibrio cholerae.
Random-Set Methods Identify Distinct Aspects of the Enrichment Signal in Gene-Set Analysis
- The Annals of Applied Statistics
, 2007
Cited by 12 (2 self)
A prespecified set of genes may be enriched, to varying degrees, for genes that have altered expression levels relative to two or more states of a cell. Knowing the enrichment of gene sets defined by functional categories, such as gene ontology (GO) annotations, is valuable for analyzing the biological signals in microarray expression data. A common approach to measuring enrichment is by cross-classifying genes according to membership in a functional category and membership on a selected list of significantly altered genes. A small Fisher’s exact test p-value, for example, in this 2 × 2 table is indicative of enrichment. Other category analysis methods retain the quantitative gene-level scores and measure significance by referring a category-level statistic to a permutation distribution associated with the original differential expression problem. We describe a class of random-set scoring methods that measure distinct components of the enrichment signal. The class includes Fisher’s test based on selected genes and also tests that average gene-level evidence across the category. Averaging and selection methods are compared empirically using Affymetrix data on expression in nasopharyngeal cancer tissue, and theoretically using a location model of differential expression. We find that each method has a domain of superiority in the state space of enrichment problems, and that both methods have benefits in practice. Our analysis also addresses two problems related to multiple-category inference, namely, that equally enriched categories are not detected with equal probability if they are of different sizes, and also that there is dependence among category statistics owing to shared genes. Random-set enrichment calculations do not require Monte Carlo for implementation. They are made available in the R package allez.
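The 2 × 2 cross-classification described in this abstract (category membership against membership on the selected list) can be made concrete with a small calculation. The sketch below computes a one-sided Fisher's exact test p-value from the hypergeometric distribution; choosing the one-sided (over-representation) tail is an assumption made here, and the function and parameter names are hypothetical. It does not reproduce the random-set scoring methods or the allez package.

```python
from math import comb


def fisher_enrichment_pvalue(in_cat_selected, in_cat_total, selected_total, genes_total):
    """One-sided Fisher's exact test p-value for enrichment.

    Returns P(X >= in_cat_selected), where X is the number of category genes
    on the selected list when `selected_total` genes are drawn without
    replacement from `genes_total` genes, `in_cat_total` of which belong to
    the category.
    """
    upper = min(in_cat_total, selected_total)
    denom = comb(genes_total, selected_total)
    # Sum hypergeometric probabilities over the upper tail.
    tail = sum(
        comb(in_cat_total, k) * comb(genes_total - in_cat_total, selected_total - k)
        for k in range(in_cat_selected, upper + 1)
    )
    return tail / denom
```

A small p-value indicates that the category contains more selected genes than would be expected if the selected list were a random subset of all genes.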
Dynamic behaviour of connectionist speech recognition with strong latency constraints
- Speech Comm
Cited by 4 (3 self)
This paper describes the use of connectionist techniques in phonetic speech recognition with strong latency constraints. The constraints are imposed by the task of deriving the lip movements of a synthetic face in real time from the speech signal, by feeding the phonetic string into an articulatory synthesiser. Particular attention has been paid to analysing the interaction between the time evolution model learnt by the multi-layer perceptrons and the transition model imposed by the Viterbi decoder, in different latency conditions. Two experiments were conducted in which the time dependencies in the language model (LM) were controlled by a parameter. The results show a strong interaction between the three factors involved, namely the neural network topology, the length of time dependencies in the LM and the decoder latency. Key words: speech recognition, neural network, low latency, non-linear dynamics
Sensitivity in risk analyses with uncertain numbers
, 2006
Cited by 3 (0 self)
Sensitivity analysis is a study of how changes in the inputs to a model influence the results of the model. Many techniques have recently been proposed for use when the model is probabilistic. This report considers the related problem of sensitivity analysis when the model includes uncertain numbers that can involve both aleatory and epistemic uncertainty and the method of calculation is Dempster-Shafer evidence theory or probability bounds analysis. Some traditional methods for sensitivity analysis generalize directly for use with uncertain numbers, but, in some respects, sensitivity analysis for these analyses differs from traditional deterministic or probabilistic sensitivity analyses. A case study of a dike reliability assessment illustrates several methods of sensitivity analysis, including traditional probabilistic assessment, local derivatives, and a “pinching” strategy that hypothetically reduces the epistemic uncertainty or aleatory uncertainty, or both, in an input variable to estimate the reduction of uncertainty in the outputs. The prospects for applying the methods to black box models are also considered.
SENSITIVITY OF INFERENCES IN FORENSIC GENETICS TO ASSUMPTIONS ABOUT FOUNDING GENES
Cited by 2 (1 self)
Many forensic genetics problems can be handled using structured systems of discrete variables, for which Bayesian networks offer an appealing practical modeling framework and allow inferences to be computed by probability propagation methods. However, when standard assumptions are violated (for example, when allele frequencies are unknown, there is identity by descent, or the population is heterogeneous), dependence is generated among founding genes, which makes exact calculation of conditional probabilities by propagation methods less straightforward. Here we illustrate different methodologies for assessing sensitivity to assumptions about founders in forensic genetics problems. These include constrained steepest descent, linear fractional programming and representing dependence by structure. We illustrate these methods on several forensic genetics examples involving criminal identification, simple and complex disputed paternity and DNA mixtures.
Empirical Acquisition of Conceptual Distinctions via Dictionary Definitions
, 2004
Cited by 2 (0 self)
This thesis discusses the automatic acquisition of conceptual distinctions using empirical methods, with an emphasis on semantic relations. The goal is to improve semantic lexicons for computational linguistics, but the work can be applied to general-purpose knowledge bases as well.
Survival ensembles
- Advance Access publication on December 12, 2005
We propose a unified and flexible framework for ensemble learning in the presence of censoring. For right-censored data, we introduce a random forest algorithm and a generic gradient boosting algorithm for the construction of prognostic and diagnostic models. The methodology is utilized for predicting the survival time of patients suffering from acute myeloid leukemia based on clinical and genetic covariates. Furthermore, we compare the diagnostic capabilities of the proposed censored data random forest and boosting methods, applied to the recurrence-free survival time of node-positive breast cancer patients, with previously published findings.
Perfect simulation and moment properties for the Matérn Type III Process
, 2009
process) is a less well-known but for many applications more appealing or realistic model than the Matérn type I and II hard core point processes. This paper focuses on the stationary (and hence infinite) Matérn III process from a probabilistic and a stochastic geometry perspective. Briefly, given a hard core parameter R > 0, the Matérn III process is obtained by a dependent thinning from a spatio-temporal Poisson process on R^d × [0, 1] with intensity λ > 0, where a Poisson point becomes a Matérn III point if the ball of radius R centered at the point does not contain an earlier Matérn III point. Using a construction of Matérn III that creates various ‘generations’ of points, a perfect simulation algorithm for the infinite Matérn III process within a bounded region is developed. It is shown that the log expected number of points that must be examined is bounded above by a linear function which is easily calculated. This result is quite general, which is illustrated by an extension of the basic Matérn III process to allow random radii or more generally to replace balls with random sets, and also to allow spatial inhomogeneity. The perfect simulation algorithm is used to provide Monte Carlo estimates of the packing density of Matérn III, which can be much higher than for Matérn I or II, and increases to the jamming limit of the random sequential adsorption model as λ → ∞.
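The dependent thinning rule stated in this abstract can be sketched directly for a bounded window. The code below is a naive illustration, assuming d = 2 and a unit-square window, and it ignores points outside the window, so it has edge effects; it is not the perfect simulation algorithm for the stationary process that the paper develops. The function name `matern_iii_window` and its parameters are hypothetical.

```python
import random


def matern_iii_window(lam, R, seed=None):
    """Naive Matérn III thinning on the unit square (edge effects ignored).

    lam: intensity of the underlying Poisson process (must be > 0).
    R:   hard core radius.
    Returns the list of retained (x, y) points.
    """
    rng = random.Random(seed)
    # Draw a Poisson(lam) count by counting rate-lam exponential arrivals in [0, 1].
    n, t = 0, rng.expovariate(lam)
    while t <= 1.0:
        n += 1
        t += rng.expovariate(lam)
    # Each point carries an independent uniform time mark in [0, 1].
    points = [(rng.random(), rng.random(), rng.random()) for _ in range(n)]
    points.sort()  # process points in order of their time marks
    kept = []
    for _, x, y in points:
        # Keep the point only if no earlier retained point lies within distance R.
        if all((x - kx) ** 2 + (y - ky) ** 2 >= R * R for kx, ky in kept):
            kept.append((x, y))
    return kept
```

Because a point is compared only against earlier retained points, not against all earlier Poisson points, this thinning keeps more points than the type II rule would, which is consistent with the higher packing density noted in the abstract.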

