Effects of low sample mean values and small sample size on the estimation of the fixed dispersion parameter of Poisson-gamma models for modeling motor vehicle crashes: A Bayesian perspective Academic Article


  • There has been considerable research conducted on the development of statistical models for predicting motor vehicle crashes on highway facilities. Over the last few years, there has been a significant increase in the application of hierarchical Bayes methods for modeling motor vehicle crash data. Whether the inferences are estimated using classical or Bayesian methods, the most common probabilistic structure used for modeling this type of data remains the traditional Poisson-gamma (or Negative Binomial) model. Crash data collected for highway safety studies often have the unusual attributes of being characterized by low sample mean values and, due to the prohibitive costs of collecting data, small sample sizes. Previous studies have shown that the dispersion parameter of Poisson-gamma models can be seriously mis-estimated when the models are estimated using the maximum likelihood estimation (MLE) method under these extreme conditions. Despite important work done on this topic for the MLE, nobody has so far examined how low sample mean values and small sample sizes affect the posterior mean of the dispersion parameter of Poisson-gamma models estimated using the hierarchical Bayes method. The inverse dispersion parameter plays an important role in various types of highway safety studies. It is therefore vital to determine the conditions in which the inverse dispersion parameter may be mis-estimated for this category of models. To accomplish the objectives of this study, a simulation framework is developed to generate data from the Poisson-gamma distributions using different values describing the mean, the dispersion parameter, the sample size, and the prior specification. Vague and non-vague prior specifications are tested for determining the magnitude of the biases introduced by low sample mean values and small sample sizes. A series of datasets are also simulated from the Poisson-lognormal distributions, in the light of recent work done by statisticians on this mixed distribution. The study shows that a dataset characterized by a low sample mean combined with a small sample size can seriously affect the estimation of the posterior mean of the dispersion parameter when a vague prior specification is used to characterize the gamma hyper-parameter. The risk of a mis-estimated posterior mean can be greatly reduced when an appropriate non-vague prior distribution is used. Finally, the study shows that Poisson-lognormal models are recommended over Poisson-gamma models when assuming vague priors and whenever crash data characterized by low sample mean values are used for developing crash prediction models. © 2007 Elsevier Ltd. All rights reserved.
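
The data-generating step of a simulation framework like the one described in the abstract can be illustrated with a short sketch. This is not the authors' code and it does not fit a hierarchical Bayes model: it only simulates Poisson-gamma counts for a chosen mean, inverse dispersion parameter and sample size, and then applies a simple method-of-moments estimator of the dispersion parameter as a stand-in, so that the instability under low means and small samples can be seen. The function names, the parameterization (gamma shape `phi`, scale `1/phi`), the scenario values and the replication count are all assumptions made for illustration.

```python
import math
import random
import statistics


def poisson_draw(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method (suited to small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_poisson_gamma(rng, mean, phi, n):
    """Simulate n counts with y_i ~ Poisson(mean * e_i), e_i ~ Gamma(shape=phi, scale=1/phi).

    Assumed parameterization: phi is the inverse dispersion parameter,
    so Var(y) = mean + mean**2 / phi.
    """
    return [poisson_draw(rng, mean * rng.gammavariate(phi, 1.0 / phi)) for _ in range(n)]


def moment_dispersion(counts):
    """Method-of-moments estimate of the dispersion alpha = 1/phi.

    Solves Var(y) = m + alpha * m**2 for alpha. This is a simple stand-in,
    not the MLE or posterior mean studied in the paper. Returns None when
    the sample mean is zero, where the estimate is undefined.
    """
    m = statistics.fmean(counts)
    if m == 0:
        return None
    s2 = statistics.variance(counts)
    return (s2 - m) / (m * m)


def run_scenario(mean, phi, n, replications=100, seed=1):
    """Repeat the simulation and collect the defined dispersion estimates."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(replications):
        est = moment_dispersion(simulate_poisson_gamma(rng, mean, phi, n))
        if est is not None:
            estimates.append(est)
    return estimates


if __name__ == "__main__":
    # Hypothetical scenario grid: low vs. higher mean, small vs. larger sample size.
    for mean, n in [(0.5, 50), (0.5, 1000), (5.0, 50), (5.0, 1000)]:
        est = run_scenario(mean, phi=1.0, n=n)
        print(f"mean={mean} n={n} true alpha=1.0 "
              f"avg estimate={statistics.fmean(est):.3f} "
              f"sd={statistics.stdev(est):.3f}")
```

Comparing the spread of the estimates across the four scenarios gives a rough feel for how the sample mean and the sample size interact. Reproducing the study's actual findings would require fitting the hierarchical Bayes model under the vague and non-vague priors it describes.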

altmetric score

  • 6

author list (cited authors)

  • Lord, D., & Miranda-Moreno, L. F.

citation count

  • 143

publication date

  • June 2008