Miscellanea Bagging cross-validated bandwidths with application to big data Academic Article

abstract

  • Summary Hall & Robinson (2009) proposed and analysed the use of bagged cross-validation to choose the bandwidth of a kernel density estimator. They established that bagging greatly reduces the noise inherent in ordinary cross-validation, and hence leads to a more efficient bandwidth selector. The asymptotic theory of Hall & Robinson (2009) assumes that $N$, the number of bagged subsamples, is $\infty$. We expand upon their theoretical results by allowing $N$ to be finite, as it is in practice. Our results indicate an important difference in the rate of convergence of the bagged cross-validation bandwidth for the cases $N=\infty$ and $N<\infty$. Simulations quantify the improvement in statistical efficiency and computational speed that can result from using bagged cross-validation as opposed to a binned implementation of ordinary cross-validation. The performance of the bagged bandwidth is also illustrated on a real, very large dataset. Finally, a byproduct of our study is the correction of errors appearing in the Hall & Robinson (2009) expression for the asymptotic mean squared error of the bagging selector.
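
To make the idea in the summary concrete, here is a minimal sketch of one common way to bag cross-validated bandwidths. It is not taken from the article and may differ from the authors' procedure in detail. It assumes a Gaussian kernel, least-squares cross-validation minimised over a grid, subsamples drawn without replacement, and the standard $n^{-1/5}$ bandwidth rate to rescale from the subsample size `r` to the full sample size `n`. The function names, the grid limits, and the parameters `r` and `n_sub` (playing the role of $N$) are illustrative assumptions.

```python
import numpy as np


def lscv_score(x, h):
    """Least-squares cross-validation criterion for a Gaussian-kernel KDE."""
    n = x.size
    d = (x[:, None] - x[None, :]) / h
    # Convolution of the Gaussian kernel with itself is the N(0, 2) density.
    conv = np.exp(-0.25 * d**2) / (2.0 * np.sqrt(np.pi))
    kern = np.exp(-0.5 * d**2) / np.sqrt(2.0 * np.pi)
    # Integral of the squared density estimate (diagonal terms included).
    term1 = conv.sum() / (n**2 * h)
    # Leave-one-out term (diagonal terms excluded).
    term2 = 2.0 * (kern.sum() - np.trace(kern)) / (n * (n - 1) * h)
    return term1 - term2


def cv_bandwidth(x, grid):
    """Grid minimiser of the cross-validation criterion."""
    scores = [lscv_score(x, h) for h in grid]
    return grid[int(np.argmin(scores))]


def bagged_cv_bandwidth(x, r, n_sub, rng, grid_size=60):
    """Average the CV bandwidths of n_sub subsamples of size r, then rescale.

    Assumption: the optimal bandwidth scales like n**(-1/5), so a bandwidth
    chosen for size r is multiplied by (r / n)**(1/5) to suit size n.
    """
    n = x.size
    # Illustrative search grid centred on a normal-reference bandwidth for size r.
    ref = 1.06 * np.std(x, ddof=1) * r ** (-0.2)
    grid = np.geomspace(0.1 * ref, 3.0 * ref, grid_size)
    hs = np.empty(n_sub)
    for i in range(n_sub):
        sub = rng.choice(x, size=r, replace=False)
        hs[i] = cv_bandwidth(sub, grid)
    return (r / n) ** 0.2 * hs.mean()


rng = np.random.default_rng(0)
x = rng.standard_normal(20000)
h_bagged = bagged_cv_bandwidth(x, r=200, n_sub=25, rng=rng)
```

Each subsample costs on the order of `r**2` kernel evaluations per grid point, so the total work grows with `n_sub * r**2` rather than `n**2`. This is the kind of computational saving the summary refers to for very large datasets.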

published proceedings

  • BIOMETRIKA

author list (cited authors)

  • Barreiro-Ures, D., Cao, R., Francisco-Fernandez, M., & Hart, J. D.

citation count

  • 2

complete list of authors

  • Barreiro-Ures, D; Cao, R; Francisco-Fernandez, M; Hart, JD

publication date

  • January 2021