Improving the performance of predictive process modeling for large datasets. Academic Article


  • Advances in Geographical Information Systems (GIS) and Global Positioning Systems (GPS) enable accurate geocoding of locations where scientific data are collected. This has encouraged the collection of large spatial datasets in many fields and has generated considerable interest in statistical modeling for location-referenced spatial data. This article considers the setting in which the number of locations yielding observations is too large to fit the desired hierarchical spatial random effects models using Markov chain Monte Carlo methods. The problem is exacerbated in spatial-temporal and multivariate settings, where many observations occur at each location. The recently proposed predictive process, motivated by kriging ideas, aims to maintain the richness of desired hierarchical spatial modeling specifications in the presence of large datasets. A shortcoming of the original formulation of the predictive process is that it induces a positive bias in the non-spatial error term of the models. A modified predictive process is proposed to address this problem. The predictive process approach is knot-based, which raises questions about knot design. An algorithm is designed to achieve approximately optimal spatial placement of knots. Detailed illustrations of the modified predictive process, using multivariate spatial regression with both a simulated and a real dataset, are offered.
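The bias described in the abstract can be illustrated with a small numerical sketch. In the usual predictive process construction, the spatial effect at a location s is replaced by its kriging interpolant from the knots, so its variance is c(s)ᵀ C*⁻¹ c(s), which is never larger than the parent process variance σ². The shortfall tends to be absorbed by the non-spatial error term, and a common form of the modification adds an independent term with variance σ² − c(s)ᵀ C*⁻¹ c(s). The code below is not the authors' implementation: the exponential covariance, the values of `sigma2` and `phi`, the knot grid, and all function names are assumptions chosen for illustration.

```python
import math

def exp_cov(a, b, sigma2=1.0, phi=3.0):
    """Exponential covariance sigma2 * exp(-phi * distance); an assumed choice."""
    return sigma2 * math.exp(-phi * math.dist(a, b))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def predictive_process_variance(s, knots, sigma2=1.0, phi=3.0):
    """Variance of the predictive process at s: c(s)^T C*^{-1} c(s)."""
    c = [exp_cov(s, k, sigma2, phi) for k in knots]
    C_star = [[exp_cov(ki, kj, sigma2, phi) for kj in knots] for ki in knots]
    x = solve(C_star, c)
    return sum(ci * xi for ci, xi in zip(c, x))

def variance_correction(s, knots, sigma2=1.0, phi=3.0):
    """Variance of the independent correction term in the modified process."""
    return sigma2 - predictive_process_variance(s, knots, sigma2, phi)

# Hypothetical 2 x 2 knot grid on the unit square.
knots = [(0.25, 0.25), (0.25, 0.75), (0.75, 0.25), (0.75, 0.75)]
at_knot = variance_correction(knots[0], knots)      # no shortfall at a knot
off_knot = variance_correction((0.5, 0.5), knots)   # positive shortfall away from knots
```

In this sketch the correction vanishes at a knot and grows with distance from the knots, which is one way to see why knot placement matters for the quality of the approximation.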

published proceedings

  • Comput Stat Data Anal

altmetric score

  • 9

author list (cited authors)

  • Finley, A. O., Sang, H., Banerjee, S., & Gelfand, A. E.

citation count

  • 203

complete list of authors

  • Finley, Andrew O; Sang, Huiyan; Banerjee, Sudipto; Gelfand, Alan E

publication date

  • January 2009