Nonparametric Mixed Effects Models for Unequally Sampled Noisy Curves
Biometrics, 1998
Cited by 114 (3 self)
We propose a method of analyzing collections of related curves in which the individual curves are modeled as spline functions with random coefficients. The method is applicable when the individual curves are sampled at variable and irregularly spaced points. This produces a low-rank, low-frequency approximation to the covariance structure, which can be estimated naturally by the EM algorithm. Smooth curves for individual trajectories are constructed as BLUP estimates, combining data from that individual and the entire collection. This framework leads naturally to methods for examining the effects of covariates on the shapes of the curves. We use model selection techniques (AIC, BIC, and cross-validation) to select the number of breakpoints for the spline approximation. We believe that the methodology we propose provides a simple, flexible, and computationally efficient means of functional data analysis. We illustrate it with two sets of data.
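The idea of a BLUP that combines one individual's data with the whole collection can be illustrated in the simplest special case, a random-intercept model (a spline basis reduced to a single constant function). This is a minimal sketch, not the paper's spline estimator: the function name and the assumption that the population mean `mu` and the variance components `tau2` (between individuals) and `sigma2` (noise) are already known are illustrative choices.

```python
def blup_subject_mean(y_i, mu, tau2, sigma2):
    """BLUP of one subject's level in the model y_ij = mu + b_i + e_ij.

    Assumes b_i ~ (0, tau2) and e_ij ~ (0, sigma2) with known variances.
    The subject's sample mean is shrunk toward the population mean mu;
    the fewer or noisier the observations, the stronger the shrinkage.
    """
    n = len(y_i)
    ybar = sum(y_i) / n
    shrink = n * tau2 / (n * tau2 + sigma2)  # weight on the subject's own data
    return mu + shrink * (ybar - mu)


# A subject observed four times borrows less from the collection
# than a subject observed once.
many_obs = blup_subject_mean([3.0, 3.0, 3.0, 3.0], mu=1.0, tau2=1.0, sigma2=4.0)
one_obs = blup_subject_mean([3.0], mu=1.0, tau2=1.0, sigma2=4.0)
```

In the spline setting the scalar shrinkage weight becomes a matrix acting on the vector of random spline coefficients, but the borrowing-of-strength behavior is the same.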
Generalized functional linear models
Ann. Statist., 2005
Cited by 97 (9 self)
We propose a generalized functional linear regression model for a regression situation where the response variable is a scalar and the predictor is a random function. A linear predictor is obtained by forming the scalar product of the predictor function with a smooth parameter function, and the expected value of the response is related to this linear predictor via a link function. If in addition a variance function is specified, this leads to a functional estimating equation which corresponds to maximizing a functional quasi-likelihood. This general approach includes the special cases of the functional linear model, as well as functional Poisson regression and functional binomial regression. The latter leads to procedures for classification and discrimination of stochastic processes and functional data. We also consider the situation where the link and variance functions are unknown and are estimated nonparametrically from the data, using a semiparametric quasi-likelihood procedure. An essential step in our proposal is dimension reduction by approximating the predictor processes with a truncated Karhunen-Loève expansion. We develop asymptotic inference for the proposed class of generalized regression models. In the proposed asymptotic approach, the truncation parameter increases with sample size, and a martingale central limit theorem is applied to establish the resulting increasing dimension asymptotics. We establish asymptotic normality for a properly scaled distance
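The dimension-reduction step described in this abstract can be written out as follows. The notation is assumed for illustration and is not taken from the paper: $g$ maps the linear predictor to the mean, $X$ is taken to be centered, $\phi_j$ is an orthonormal basis (the eigenfunctions in the Karhunen-Loève case), and $p_n$ is the truncation parameter that grows with sample size $n$.

```latex
% Assumed notation; a sketch of the model structure, not the paper's exact statement.
\begin{align*}
E(Y \mid X) &= g(\eta), \qquad \eta = \alpha + \int \beta(t)\, X(t)\, dt, \\
X(t) &= \sum_{j \ge 1} \xi_j\, \phi_j(t), \qquad
\beta(t) = \sum_{j \ge 1} \beta_j\, \phi_j(t), \\
\eta &= \alpha + \sum_{j \ge 1} \beta_j\, \xi_j
\;\approx\; \alpha + \sum_{j=1}^{p_n} \beta_j\, \xi_j .
\end{align*}
```

The middle equality uses only the orthonormality of the $\phi_j$; after truncation the problem looks like a generalized linear model in the $p_n$ scores $\xi_1, \dots, \xi_{p_n}$.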
Continuous representations of time-series gene expression data
J. Comput. Biol., 2003
Cited by 96 (11 self)
We present algorithms for time-series gene expression analysis that permit the principled estimation of unobserved time points, clustering, and dataset alignment. Each expression profile is modeled as a cubic spline (piecewise polynomial) that is estimated from the observed data and every time point influences the overall smooth expression curve. We constrain the spline coefficients of genes in the same class to have similar expression patterns, while also allowing for gene-specific parameters. We show that unobserved time points can be reconstructed using our method with 10–15% less error when compared to previous best methods. Our clustering algorithm operates directly on the continuous representations of gene expression profiles, and we demonstrate that this is particularly effective when applied to nonuniformly sampled data. Our continuous alignment algorithm also avoids difficulties encountered by discrete approaches. In particular, our method allows for control of the number of degrees of freedom of the warp through the specification of parameterized functions, which helps to avoid overfitting. We demonstrate that our algorithm produces stable low-error alignments on real expression data and further show a specific application to yeast knockout data that produces biologically meaningful results.
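To make the notion of a continuous cubic-spline representation concrete, here is a small sketch that fits a natural cubic spline through one nonuniformly sampled profile and evaluates it at an unobserved time point. It is plain interpolation of a single profile, a swapped-in simplification: it does not implement the class-level constraints on spline coefficients or the noise model described in the abstract, and the function name and sample values are hypothetical.

```python
from bisect import bisect_right


def natural_cubic_spline(xs, ys):
    """Return a function evaluating the natural cubic spline through (xs, ys).

    xs must be strictly increasing with at least two points. The second
    derivative is set to zero at both ends (the "natural" boundary condition).
    """
    n = len(xs) - 1
    h = [xs[i + 1] - xs[i] for i in range(n)]

    # Right-hand side of the tridiagonal system for the quadratic coefficients c.
    alpha = [0.0] * (n + 1)
    for i in range(1, n):
        alpha[i] = 3.0 * (ys[i + 1] - ys[i]) / h[i] - 3.0 * (ys[i] - ys[i - 1]) / h[i - 1]

    # Forward elimination.
    l = [1.0] * (n + 1)
    mu = [0.0] * (n + 1)
    z = [0.0] * (n + 1)
    for i in range(1, n):
        l[i] = 2.0 * (xs[i + 1] - xs[i - 1]) - h[i - 1] * mu[i - 1]
        mu[i] = h[i] / l[i]
        z[i] = (alpha[i] - h[i - 1] * z[i - 1]) / l[i]

    # Back substitution for the per-interval polynomial coefficients.
    b = [0.0] * n
    c = [0.0] * (n + 1)
    d = [0.0] * n
    for j in range(n - 1, -1, -1):
        c[j] = z[j] - mu[j] * c[j + 1]
        b[j] = (ys[j + 1] - ys[j]) / h[j] - h[j] * (c[j + 1] + 2.0 * c[j]) / 3.0
        d[j] = (c[j + 1] - c[j]) / (3.0 * h[j])

    def evaluate(x):
        # Pick the interval containing x, clamping to the first/last interval.
        j = min(max(bisect_right(xs, x) - 1, 0), n - 1)
        dx = x - xs[j]
        return ys[j] + dx * (b[j] + dx * (c[j] + dx * d[j]))

    return evaluate


# Hypothetical profile sampled at nonuniform times (arbitrary units).
times = [0.0, 1.0, 2.0, 4.0, 7.0]
expression = [0.0, 1.0, 0.0, 2.0, 1.0]
profile = natural_cubic_spline(times, expression)
estimate_at_3 = profile(3.0)  # value at an unobserved time point
```

Because the spline coefficients come from one linear system over all knots, every observed time point influences the fitted curve, which is the property the abstract highlights.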
Inference in Generalized Additive Mixed Models Using Smoothing Splines
1999
Cited by 86 (7 self)
In this paper, we propose generalized additive mixed models (GAMMs), which are an additive extension of generalized linear mixed models in the spirit of Hastie and Tibshirani (1990). This new class of models uses additive nonparametric functions to model covariate effects while accounting for overdispersion and correlation by adding random effects to the additive predictor. GAMMs encompass nested and crossed designs and are applicable to clustered, hierarchical and spatial data. We estimate the nonparametric functions using smoothing splines, and jointly estimate the smoothing parameters and the variance components using marginal quasi-likelihood. This marginal quasi-likelihood approach is an extension of the restricted maximum likelihood approach used by Wahba (1985) and Kohn, et al. (1991) in the classical nonparametric regression model (Kohn, et al. 1991, eq 2.1), and by Zhang, et al. (1998) in Gaussian nonparametric mixed models, where they treated the smoothing parameter as an extra variance component. In view of numerical integration often required by maximizing the objective functions, double penalized quasi-likelihood (DPQL) is proposed to make approximate inference. Frequentist and Bayesian inferences are compared. A key feature of the proposed method is that it allows us to make systematic inference on all model components of GAMMs within a unified parametric mixed model framework. Specifically, our estimation of the nonparametric functions, the smoothing parameters and the variance components in GAMMs can proceed by fitting a working GLMM using existing statistical software, which iteratively fits a linear mixed model to a modified dependent variable. When the data are sparse (e.g., binary), the DPQL estimators of the variance components are found to be subject t...
Wavelet-based functional mixed models
Journal of the Royal Statistical Society, Series B, 2006
Cited by 85 (16 self)
Summary. Increasingly, scientific studies yield functional data, in which the ideal units of observation are curves and the observed data consist of sets of curves that are sampled on a fine grid. We present new methodology that generalizes the linear mixed model to the functional mixed model framework, with model fitting done by using a Bayesian wavelet-based approach. This method is flexible, allowing functions of arbitrary form and the full range of fixed effects structures and between-curve covariance structures that are available in the mixed model framework. It yields nonparametric estimates of the fixed and random-effects functions as well as the various between-curve and within-curve covariance matrices. The functional fixed effects are adaptively regularized as a result of the nonlinear shrinkage prior that is imposed on the fixed effects’ wavelet coefficients, and the random-effect functions experience a form of adaptive regularization because of the separately estimated variance components for each wavelet coefficient. Because we have posterior samples for all model quantities, we can perform pointwise or joint Bayesian inference or prediction on the quantities of the model. The adaptiveness of the method makes it especially appropriate for modelling irregular functional data that are characterized by numerous local features like peaks.
Functional-coefficient Regression Models for Nonlinear Time Series
Journal of the American Statistical Association, 1998
Cited by 82 (15 self)
We apply the local linear regression technique for estimation of functional-coefficient regression models for time series data. The models include threshold autoregressive models (Tong 1990) and functional-coefficient autoregressive models (Chen and Tsay 1993) as special cases but with added advantages such as depicting finer structure of the underlying dynamics and better post-sample forecasting performance. We have also proposed a new bootstrap test for the goodness of fit of models and a bandwidth selector based on newly defined cross-validatory estimation for the expected forecasting errors. The proposed methodology is data-analytic and is of appreciable flexibility to analyze complex and multivariate nonlinear structures without suffering from the "curse of dimensionality". The asymptotic properties of the proposed estimators are investigated under the α-mixing condition. Both simulated and real data examples are used for illustration. Key Words: α-mixing; Asymptotic normali...
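Local linear estimation of a functional coefficient can be sketched for the simplest case of one regressor, y = a(u) x + noise. Near a target point u0 the coefficient is approximated by a + b (u - u0), and (a, b) is found by kernel-weighted least squares. The single-regressor restriction, the Gaussian kernel, the fixed bandwidth `h`, and the function name are all assumptions made for illustration; the paper's bandwidth selector and bootstrap test are not shown.

```python
import math


def local_linear_coefficient(u, x, y, u0, h):
    """Estimate a(u0) and a'(u0) in the model y = a(u) * x + noise.

    Solves the 2x2 weighted least-squares problem of y on the regressors
    x and x * (u - u0), with Gaussian kernel weights of bandwidth h.
    Returns the pair (a_hat, slope_hat).
    """
    s11 = s12 = s22 = t1 = t2 = 0.0
    for ui, xi, yi in zip(u, x, y):
        w = math.exp(-0.5 * ((ui - u0) / h) ** 2)  # kernel weight
        z1 = xi
        z2 = xi * (ui - u0)
        s11 += w * z1 * z1
        s12 += w * z1 * z2
        s22 += w * z2 * z2
        t1 += w * z1 * yi
        t2 += w * z2 * yi
    det = s11 * s22 - s12 * s12
    if det == 0.0:
        raise ValueError("singular local design; widen the bandwidth or add data")
    a_hat = (s22 * t1 - s12 * t2) / det
    slope_hat = (s11 * t2 - s12 * t1) / det
    return a_hat, slope_hat


# Noise-free toy data with a(u) = 2 + 3u, which a local linear fit recovers.
u = [i / 10 for i in range(11)]
x = [1.0 + (i % 3) for i in range(11)]
y = [(2.0 + 3.0 * ui) * xi for ui, xi in zip(u, x)]
a_hat, slope_hat = local_linear_coefficient(u, x, y, u0=0.5, h=0.3)
```

With several regressors the same idea applies, with a larger weighted least-squares system solved at each target point.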
Properties of principal component methods for functional and longitudinal data analysis
Ann. Statist., 2006
Cited by 73 (5 self)
The use of principal component methods to analyze functional data is appropriate in a wide range of different settings. In studies of “functional data analysis,” it has often been assumed that a sample of random functions is observed precisely, in the continuum and without noise. While this has been the traditional setting for functional data analysis, in the context of longitudinal data analysis a random function typically represents a patient, or subject, who is observed at only a small number of randomly distributed points, with nonnegligible measurement error. Nevertheless, essentially the same methods can be used in both these cases, as well as in the vast number of settings that lie between them. How is performance affected by the sampling plan? In this paper we answer that question. We show that if there is a sample of n functions, or subjects, then estimation of eigenvalues is a semiparametric problem, with root-n consistent estimators, even if only a few observations are made of each function,
Prediction in functional linear regression
Ann. Statist., 2006
Cited by 71 (5 self)
There has been substantial recent work on methods for estimating the slope function in linear regression for functional data analysis. However, as in the case of more conventional, finite-dimensional regression, much of the practical interest in the slope centers on its application for the purpose of prediction, rather than on its significance in its own right. We show that the problems of slope-function estimation, and of prediction from an estimator of the slope function, have very different characteristics. While the former is intrinsically nonparametric, the latter can be either nonparametric or semiparametric. In particular, the optimal mean-square convergence rate of predictors is $n^{-1}$, where n denotes sample size, if the predictand is a sufficiently smooth function. In other cases, convergence occurs at a polynomial rate that is strictly slower than $n^{-1}$. At the boundary between these two regimes, the mean-square convergence rate is less than $n^{-1}$ by only a logarithmic factor. More generally, the rate of convergence of the predicted value of the mean response in the regression model, given a particular value of the explanatory variable, is determined by a subtle interaction among the smoothness of the predictand, of the slope function in the model, and of the autocovariance function for the distribution of explanatory variables.
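The distinction between estimating the slope function and predicting from it can be stated compactly. The symbols below are assumed for illustration and are not quoted from the paper: $a$ is an intercept, $b$ the slope function, $x$ the function at which a prediction is wanted, and $\hat{a}$, $\hat{b}$ are estimators built from $n$ observations.

```latex
% Assumed notation; summarizes the rates stated in the abstract.
\begin{align*}
Y_i &= a + \int b(t)\, X_i(t)\, dt + \varepsilon_i, \qquad i = 1, \dots, n, \\
p(x) &= a + \int b(t)\, x(t)\, dt, \qquad
\hat{p}(x) = \hat{a} + \int \hat{b}(t)\, x(t)\, dt, \\
E\big[\{\hat{p}(x) - p(x)\}^2\big] &= O(n^{-1})
\quad \text{when } x \text{ is sufficiently smooth.}
\end{align*}
```

Prediction targets the single number $p(x)$, an integral of $b$ against $x$, which is why it can converge at the parametric rate even though $b$ itself can only be estimated at a slower, nonparametric rate.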
Nonparametric function estimation for clustered data when the predictor is measured without/with error
Journal of the American Statistical Association, 2000
Cited by 70 (16 self)
We consider local polynomial kernel regression with a single covariate for clustered data using estimating equations. We assume that at most m < ∞ observations are available on each cluster. In the case of random regressors, with no measurement error in the predictor, we show that it is generally the best strategy to ignore entirely the correlation structure within each cluster, and instead to pretend that all observations are independent. In the further special case of longitudinal data on individuals with fixed common observation times, we show that equivalent to the pooled data approach is the strategy of fitting separate nonparametric regressions at each observation time and constructing an optimal weighted average. We also consider what happens when the predictor is measured with error. Using the SIMEX approach to correct for measurement error, we construct an asymptotic theory for both the pooled and weighted average estimators. Surprisingly, for the same amount of smoothing, the weighted average estimators typically have smaller variances than the pooling strategy. We apply the proposed methods to the analysis of the AIDS Costs and Services Utilization Survey.