Miscellanea Bagging cross-validated bandwidths with application to big data

abstract

Hall & Robinson (2009) proposed and analysed the use of bagged cross-validation to choose the bandwidth of a kernel density estimator. They established that bagging greatly reduces the noise inherent in ordinary cross-validation, and hence leads to a more efficient bandwidth selector. The asymptotic theory of Hall & Robinson (2009) assumes that $N$, the number of bagged subsamples, is $\infty$. We expand upon their theoretical results by allowing $N$ to be finite, as it is in practice. Our results indicate an important difference in the rate of convergence of the bagged cross-validation bandwidth for the cases $N=\infty$ and $N<\infty$. Simulations quantify the improvement in statistical efficiency and computational speed that can result from using bagged cross-validation as opposed to a binned implementation of ordinary cross-validation. The performance of the bagged bandwidth is also illustrated on a real, very large dataset. Finally, a byproduct of our study is the correction of errors appearing in the Hall & Robinson (2009) expression for the asymptotic mean squared error of the bagging selector.
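The bagging idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and makes several assumptions the abstract does not state: a Gaussian kernel, least-squares cross-validation minimised over a fixed grid, subsamples of size `r` drawn without replacement, and the usual $(r/n)^{1/5}$ rescaling of each subsample bandwidth to the full sample size $n$. The function names (`lscv_score`, `cv_bandwidth`, `bagged_cv_bandwidth`) and the grid limits are hypothetical choices made for illustration.

```python
import numpy as np


def lscv_score(x, h):
    """Least-squares cross-validation criterion for a Gaussian-kernel KDE."""
    n = x.size
    d = (x[:, None] - x[None, :]) / h
    # Convolution of two standard normal kernels is the N(0, 2) density.
    conv = np.exp(-0.25 * d**2) / np.sqrt(4.0 * np.pi)
    kern = np.exp(-0.5 * d**2) / np.sqrt(2.0 * np.pi)
    # Integral of the squared density estimate (diagonal terms included).
    int_f2 = conv.sum() / (n**2 * h)
    # Average leave-one-out density value: drop the n diagonal terms K(0).
    loo = (kern.sum() - n / np.sqrt(2.0 * np.pi)) / (n * (n - 1) * h)
    return int_f2 - 2.0 * loo


def cv_bandwidth(x, grid):
    """Grid value minimising the cross-validation criterion."""
    scores = [lscv_score(x, h) for h in grid]
    return grid[int(np.argmin(scores))]


def bagged_cv_bandwidth(x, r, n_subsamples, rng, grid_size=60):
    """Average of subsample CV bandwidths, rescaled from size r to size n."""
    n = x.size
    # Assumed search grid: multiples of a normal-reference bandwidth for size r.
    h_ref = 1.06 * x.std(ddof=1) * r ** (-0.2)
    grid = h_ref * np.geomspace(0.2, 3.0, grid_size)
    hs = [
        cv_bandwidth(rng.choice(x, size=r, replace=False), grid)
        for _ in range(n_subsamples)
    ]
    # A bandwidth tuned for r observations shrinks by (r / n)^(1/5) for n.
    return (r / n) ** 0.2 * float(np.mean(hs))
```

A call such as `bagged_cv_bandwidth(x, r=200, n_subsamples=20, rng=np.random.default_rng(0))` costs on the order of `n_subsamples * r**2` kernel evaluations per grid point, rather than `n**2` for ordinary cross-validation on the full sample, which is where the computational saving for large `n` comes from in this sketch.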

authors

Hart, Jeffrey

published proceedings

BIOMETRIKA

author list (cited authors)

Barreiro-Ures, D., Cao, R., Francisco-Fernandez, M., & Hart, J. D.

citation count

2

complete list of authors

Barreiro-Ures, D; Cao, R; Francisco-Fernandez, M; Hart, JD

publication date

January 2021

publisher

Oxford University Press (OUP)

published in

Biometrika

keywords

Bagging
Bandwidth
Big Data
Cross-validation
Kernel Density

Digital Object Identifier (DOI)

10.1093/biomet/asaa092

start page

981

end page

988

volume

108

issue

4

URL

http://dx.doi.org/10.1093/biomet/asaa092
