Investigation of Effects of Underreporting Crash Data on Three Commonly Used Traffic Crash Severity Models (Academic Article)

abstract

  • Although much work has been devoted to developing crash severity models to predict the probabilities of crashes for different severity levels, few studies have considered the underreporting issue in the modeling process. Inferences about a population of interest are biased if crash data are treated as a random sample from the population without consideration of the different unreported rates for each crash severity level. The primary objective of this study was to examine the effects of underreporting for three commonly used traffic crash severity models: multinomial logit (MNL), ordered probit (OP), and mixed logit (ML) models. The objective was accomplished with a Monte Carlo approach that used simulated and observed crash data. The results showed that, to minimize the bias and reduce the variability of a model, fatal crashes should be set as the baseline severity for the MNL and ML models, while for the OP models, the rank for the crash severity should be set from fatal to property damage only in a descending order. None of the three models was immune to this underreporting issue. When full or partial information about the unreported rates for each severity level was known, treatment of the crash data as outcome-based samples in model estimation (through the weighted exogenous sample maximum likelihood estimator) dramatically improved the estimation for all three models as compared with the results produced from the maximum likelihood estimator.
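The weighting idea behind the weighted exogenous sample maximum likelihood (WESML) estimator mentioned in the abstract can be sketched as follows. This is a minimal illustration and not the authors' code: the function names, the severity labels, the counts, and the unreported rates are all hypothetical. It assumes the unreported rate for each severity level is known, estimates the true number of crashes per level, and weights each observation by the ratio of the population share to the sample share of its severity level.

```python
import math

def wesml_weights(observed_counts, unreported_rates):
    """Weight per severity level: (population share) / (sample share).

    observed_counts: reported crashes per severity level.
    unreported_rates: assumed known fraction of crashes not reported, per level.
    """
    # Estimated true crash count per level, correcting for underreporting.
    est_true = {
        level: observed_counts[level] / (1.0 - unreported_rates[level])
        for level in observed_counts
    }
    total_true = sum(est_true.values())
    total_obs = sum(observed_counts.values())

    weights = {}
    for level in observed_counts:
        population_share = est_true[level] / total_true
        sample_share = observed_counts[level] / total_obs
        weights[level] = population_share / sample_share
    return weights

def weighted_log_likelihood(outcomes, predicted_probs, weights):
    """WESML objective: sum of weight * log(probability of the observed outcome)."""
    return sum(
        weights[y] * math.log(probs[y])
        for y, probs in zip(outcomes, predicted_probs)
    )

# Hypothetical data: fatal crashes fully reported, PDO crashes half unreported.
observed = {"fatal": 100, "injury": 800, "pdo": 1500}
unreported = {"fatal": 0.0, "injury": 0.2, "pdo": 0.5}
weights = wesml_weights(observed, unreported)

# Two illustrative observations with model-predicted severity probabilities.
outcomes = ["fatal", "pdo"]
predicted = [
    {"fatal": 0.1, "injury": 0.3, "pdo": 0.6},
    {"fatal": 0.05, "injury": 0.25, "pdo": 0.7},
]
objective = weighted_log_likelihood(outcomes, predicted, weights)
```

In this sketch the heavily underreported level (property damage only) receives the largest weight, and the weights sum over the observed sample to the observed sample size. In an actual estimation, the weighted log-likelihood would be maximized over the model parameters rather than evaluated at fixed probabilities.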

published proceedings

  • Transportation Research Record: Journal of the Transportation Research Board

author list (cited authors)

  • Ye, F., & Lord, D.

citation count

  • 107

complete list of authors

  • Ye, Fan; Lord, Dominique

publication date

  • January 2011