Semiparametric Regression for Clustered Data Using Generalized Estimating Equations

abstract

  • We consider estimation in a semiparametric generalized linear model for clustered data using estimating equations. Our results apply to the case where the number of observations per cluster is finite, whereas the number of clusters is large. The mean of the outcome variable $\mu$ is of the form $g(\mu) = X^T\beta + \theta(T)$, where $g(\cdot)$ is a link function, $X$ and $T$ are covariates, $\beta$ is an unknown parameter vector, and $\theta(t)$ is an unknown smooth function. Kernel estimating equations proposed previously in the literature are used to estimate the infinite-dimensional nonparametric function $\theta(t)$, and a profile-based estimating equation is used to estimate the finite-dimensional parameter vector $\beta$. We show that for clustered data, this conventional profile-kernel method often fails to yield a $\sqrt{n}$-consistent estimator of $\beta$ along with appropriate inference unless working independence is assumed or $\theta(t)$ is artificially undersmoothed, in which case asymptotic inference is possible. To gain insight into these results, we derive the semiparametric efficient score of $\beta$, which is found to have a complicated form, and show that, unlike for independent data, the profile-kernel method does not yield a score function asymptotically equivalent to the semiparametric efficient score of $\beta$, even when the true correlation is assumed and $\theta(t)$ is undersmoothed. We illustrate the methods with an application to infectious disease data and evaluate their finite-sample performance through a simulation study. © 2001 American Statistical Association.
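
To make the profile-kernel idea in the abstract concrete, the following is a minimal sketch of one special case, not the authors' implementation. It assumes an identity link and working independence, under which the profile estimator reduces to least squares on kernel-smoothed residuals. The Gaussian Nadaraya-Watson smoother, the bandwidth `h`, the simulated random-intercept data, and all function names (`kernel_smoother_matrix`, `profile_kernel_fit`, `simulate_clustered`) are illustrative assumptions that do not appear in the article.

```python
import numpy as np


def kernel_smoother_matrix(t, h):
    """Nadaraya-Watson smoother matrix S with a Gaussian kernel (assumed choice).

    Row i holds the weights that average observations near t[i].
    """
    d = (t[:, None] - t[None, :]) / h
    w = np.exp(-0.5 * d ** 2)
    return w / w.sum(axis=1, keepdims=True)


def profile_kernel_fit(y, X, t, h):
    """Profile-kernel fit of y = X beta + theta(t) + error.

    Assumes an identity link and working independence, so within-cluster
    correlation is ignored when estimating beta and theta.
    """
    S = kernel_smoother_matrix(t, h)
    y_res = y - S @ y          # remove the smooth-in-t part of y
    X_res = X - S @ X          # remove the smooth-in-t part of each covariate
    beta, *_ = np.linalg.lstsq(X_res, y_res, rcond=None)
    theta = S @ (y - X @ beta)  # kernel estimate of theta at the observed t
    return beta, theta


def simulate_clustered(n_clusters=300, m=4, beta=(1.0, -0.5), rho=0.5, seed=0):
    """Hypothetical clustered data: m observations per cluster, with a shared
    random intercept that induces exchangeable correlation rho."""
    rng = np.random.default_rng(seed)
    n_obs = n_clusters * m
    X = rng.normal(size=(n_obs, len(beta)))
    t = rng.uniform(0.0, 1.0, size=n_obs)
    b = np.repeat(rng.normal(scale=np.sqrt(rho), size=n_clusters), m)
    e = rng.normal(scale=np.sqrt(1.0 - rho), size=n_obs)
    y = X @ np.asarray(beta) + np.sin(2 * np.pi * t) + b + e
    return y, X, t


y, X, t = simulate_clustered()
beta_hat, theta_hat = profile_kernel_fit(y, X, t, h=0.05)
print("estimated beta:", beta_hat)
```

The sketch deliberately uses working independence because, per the abstract, that is the setting in which the conventional profile-kernel method supports $\sqrt{n}$-consistent estimation of $\beta$ without undersmoothing. Valid standard errors for clustered data would additionally require a cluster-robust (sandwich) variance, which is omitted here.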

published proceedings

  • Journal of the American Statistical Association

author list (cited authors)

  • Lin, X., & Carroll, R. J.

citation count

  • 212

complete list of authors

  • Lin, Xihong; Carroll, Raymond J.

publication date

  • September 2001