Parallel Computation of Multivariate Normal Probabilities
Cited by 207 (9 self)
We present methods for the computation of multivariate normal probabilities on parallel/distributed systems. After a transformation of the initial integral, an approximation can be obtained using Monte Carlo or quasi-random methods. We propose a meta-algorithm for asynchronous sampling methods and derive efficient parallel algorithms for the computation of MVN distribution functions, including a method based on randomized Korobov and Richtmyer sequences. Timing results of the implementations using the MPI parallel environment are given.
1 Introduction
The computation of the multivariate normal distribution function
$$F(a, b) = |\Sigma|^{-1/2} (2\pi)^{-n/2} \int_a^b e^{-\frac{1}{2} x^{\top} \Sigma^{-1} x} \, dx \qquad (1)$$
often leads to computationally intensive integration problems. Here $\Sigma$ is an $n \times n$ symmetric positive definite covariance matrix; furthermore, one of the limits in each integration variable may be infinite. Genz [5] performs a sequence of transformations resu...
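The transform-then-sample idea in this abstract can be illustrated with a small sketch. It assumes the separation-of-variables transformation commonly associated with Genz (Cholesky factorization followed by sequential conditioning onto the unit cube), and it uses plain pseudo-random sampling in a single process rather than the parallel quasi-random schemes of the paper. The names `phi`, `phi_inv`, `cholesky` and `mvn_prob` are hypothetical and chosen only for this illustration.

```python
import math
import random
from statistics import NormalDist

_STD = NormalDist()

def phi(x):
    """Standard normal CDF; also valid for x = +/- infinity."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def phi_inv(p, eps=1e-12):
    """Standard normal quantile, with p clamped away from 0 and 1."""
    return _STD.inv_cdf(min(max(p, eps), 1.0 - eps))

def cholesky(sigma):
    """Lower-triangular C with C C^T = sigma (sigma assumed SPD)."""
    n = len(sigma)
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(c[i][k] * c[j][k] for k in range(j))
            if i == j:
                c[i][j] = math.sqrt(sigma[i][i] - s)
            else:
                c[i][j] = (sigma[i][j] - s) / c[j][j]
    return c

def mvn_prob(a, b, sigma, n_samples=10000, seed=0):
    """Monte Carlo estimate of P(a <= X <= b) for X ~ N(0, sigma)."""
    rng = random.Random(seed)
    n = len(a)
    c = cholesky(sigma)
    total = 0.0
    for _ in range(n_samples):
        y = []      # conditioned standard-normal coordinates
        f = 1.0     # running product of conditional interval masses
        for i in range(n):
            shift = sum(c[i][j] * y[j] for j in range(i))
            d = phi((a[i] - shift) / c[i][i])
            e = phi((b[i] - shift) / c[i][i])
            f *= e - d
            if i < n - 1:
                # draw y_i inside its truncated range via the inverse CDF
                y.append(phi_inv(d + rng.random() * (e - d)))
        total += f
    return total / n_samples
```

For example, `mvn_prob([-math.inf, -math.inf], [0.0, 0.0], [[1.0, 0.5], [0.5, 1.0]])` estimates a bivariate orthant probability. Replacing `rng.random()` with points from a randomized low-discrepancy sequence would give a quasi-random variant of the same estimator.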
On the likelihood function of Gaussian max-stable processes indexed by R^d, d ≥ 1
, 2010
Cited by 21 (1 self)
We derive a closed-form expression of the likelihood function of a Gaussian max-stable process indexed by R^d at p ≤ d + 1 sites, d ≥ 1. We demonstrate the gain in efficiency in the maximum composite likelihood estimates from p = 2 to p = 3 sites in R^2 by means of a simulation study.
Joint sampling distribution between actual and estimated classification errors for linear discriminant analysis
 IEEE Trans. Inf. Theory
, 2010
Cited by 18 (9 self)
Error estimation must be used to find the accuracy of a designed classifier, an issue that is critical in biomarker discovery for disease diagnosis and prognosis in genomics and proteomics. This paper presents, for what is believed to be the first time, the analytical formulation for the joint sampling distribution of the actual and estimated errors of a classification rule. The analysis presented here concerns the linear discriminant analysis (LDA) classification rule and the resubstitution and leave-one-out error estimators, under a general parametric Gaussian assumption. Exact results are provided in the univariate case, and a simple method is suggested to obtain an accurate approximation in the multivariate case. It is also shown how these results can be applied in the computation of condition bounds and the regression of the actual error, given the observed error estimate. In contrast to asymptotic results, the analysis presented here is applicable to finite training data. In particular, it applies in the small-sample settings commonly found in genomics and proteomics applications. Numerical examples, which include parameters estimated from actual microarray data, illustrate the analysis throughout. Index Terms: classification, cross-validation, error estimation, leave-one-out, linear discriminant analysis, resubstitution, sampling distribution.
What Affects the Accuracy of Quasi-Monte Carlo Quadrature?
Cited by 14 (0 self)
Quasi-Monte Carlo quadrature methods have been used for several decades. Their accuracy ranges from excellent to poor, depending on the problem. This article discusses how quasi-Monte Carlo quadrature error can be assessed and which factors influence it.
On the fitting of mixtures of multivariate skew t-distributions via the EM algorithm
, 2011
Cited by 11 (1 self)
We show how the expectation-maximization (EM) algorithm can be applied exactly for the fitting of mixtures of general multivariate skew t (MST) distributions, eliminating the need for computationally expensive Monte Carlo estimation. Finite mixtures of MST distributions have proven to be useful in modelling heterogeneous data with asymmetric and heavy-tail behaviour. Recently, they have been exploited as an effective tool for modelling flow cytometric data. However, without restrictions on the characterizations of the component skew t-distributions, Monte Carlo methods have been used to fit these models. In this paper, we show how the EM algorithm can be implemented for the iterative computation of the maximum likelihood estimates of the model parameters without resorting to Monte Carlo methods for mixtures with unrestricted MST components. The fast calculation of semi-infinite integrals on the E-step of the EM algorithm is effected by noting that they can be put in the form of moments of the truncated multivariate non-central t-distribution, which subsequently can be expressed in terms of the non-truncated form of the central t-distribution function for which fast algorithms are available. We demonstrate the usefulness of the proposed methodology by some applications to three real data sets.
Alternative Sampling Methods for Estimating Multivariate Normal Probabilities
Cited by 10 (0 self)
We study the performance of alternative sampling methods for estimating multivariate normal probabilities through the GHK simulator. The sampling methods are randomized versions of some quasi-Monte Carlo samples (Halton, Niederreiter, Niederreiter-Xing sequences and lattice points) and some samples based on orthogonal arrays (Latin hypercube, orthogonal array and orthogonal-array-based Latin hypercube samples). In general, these samples turn out to have a better performance than Monte Carlo and antithetic Monte Carlo samples. Improvements over these are large for low-dimensional (4 and 10) cases and still significant for dimensions as large as 50.
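One of the sampling schemes named in this abstract, the Latin hypercube sample, can be sketched in a few lines. This is an illustrative sketch only, not the paper's implementation: the name `latin_hypercube` is hypothetical, and the randomization used here (an independent shuffle of the strata per dimension plus uniform jitter within each stratum) is an assumption.

```python
import random

def latin_hypercube(n_points, dim, seed=0):
    """Return n_points points in [0, 1)^dim with one point per stratum in each dimension."""
    rng = random.Random(seed)
    columns = []
    for _ in range(dim):
        strata = list(range(n_points))
        rng.shuffle(strata)  # independent random stratum order per dimension
        # place one point uniformly inside each of the n_points equal-width strata
        columns.append([(s + rng.random()) / n_points for s in strata])
    return [tuple(col[i] for col in columns) for i in range(n_points)]
```

Each coordinate of such a sample covers every interval [k/N, (k+1)/N) exactly once, which is the stratification property that distinguishes it from plain Monte Carlo draws. The points could serve as the uniform inputs of a GHK-style simulator.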
EM algorithms for multivariate Gaussian mixture models with truncated and censored data
, 2010
Cited by 8 (0 self)
We present expectation-maximization (EM) algorithms for fitting multivariate Gaussian mixture models to data that are truncated, censored, or both truncated and censored. These two types of incomplete measurements are naturally handled together through their relation to the multivariate truncated Gaussian distribution. We illustrate our algorithms on synthetic and flow cytometry data.
Modeling dependence using skew t copulas: Bayesian inference and applications
 Journal of Applied Econometrics
, 2012
Cited by 7 (1 self)
We construct a copula from the skew t distribution of Sahu, Dey & Branco (2003). This copula can capture asymmetric and extreme dependence between variables, and is one of the few copulas that can do so and still be used in high dimensions effectively. However, it is difficult to estimate the copula model by maximum likelihood when the multivariate dimension is high, or when some or all of the marginal distributions are discrete-valued, or when the parameters in the marginal distributions and copula are estimated jointly. We therefore propose a Bayesian approach that overcomes all these problems. The computations are undertaken using a Markov chain Monte Carlo simulation method which exploits the conditionally Gaussian representation of the skew t distribution. We employ the approach in two contemporary econometric studies. The first is the modeling of regional spot prices in the Australian electricity market. Here, we observe complex non-Gaussian margins and nonlinear inter-regional dependence. Accurate characterization of this dependence is important for the study of market integration and risk management purposes. The second is the modeling of ordinal exposure measures for 15 major websites. Dependence between websites is important
New Families of Copulas Based on Periodic Functions
 Communications in Statistics: Theory and Methods, vol. 34, no. 7
, 2005
Cited by 6 (3 self)
Although there exists a large variety of copula functions, only a few are practically manageable, and often the choice in dependence modeling falls on the Gaussian copula. Further, most copulas are exchangeable, thus implying symmetric dependence. We introduce a way to construct copulas based on periodic functions. We study the two-dimensional case based on one dependence parameter and then provide a way to extend the construction to the n-dimensional framework. We can thus construct families of copulas in dimension n and parameterized by n − 1 parameters, implying possibly asymmetric relations. Such “periodic” copulas can be simulated easily.
Pareto optimal linear classification
 in Proc. ICML, 2006
, 1990
Cited by 5 (0 self)
We consider the problem of choosing a linear classifier that minimizes misclassification probabilities in two-class classification, which is a bi-criterion problem, involving a trade-off between two objectives. We assume that the class-conditional distributions are Gaussian. This assumption makes it computationally tractable to find Pareto optimal linear classifiers whose classification capabilities are inferior to no other linear ones. The main purpose of this paper is to establish several robustness properties of those classifiers with respect to variations and uncertainties in the distributions. We also extend the results to kernel-based classification. Finally, we show how to carry out trade-off analysis empirically with a finite number of given labeled data.