Generalized Resubstitution for Classification Error Estimation
We propose the family of generalized resubstitution classifier error estimators based on empirical measures. These error estimators are computationally efficient and do not require re-training of classifiers. The plain resubstitution error estimator corresponds to choosing the standard empirical measure. Other choices of empirical measure lead to bolstered, posterior-probability, Gaussian-process, and Bayesian error estimators; in addition, we propose bolstered posterior-probability error estimators as a new family of generalized resubstitution estimators. In the two-class case, we show that a generalized resubstitution estimator is consistent and asymptotically unbiased, regardless of the distribution of the features and label, if the corresponding generalized empirical measure converges uniformly to the standard empirical measure and the classification rule has a finite VC dimension. A generalized resubstitution estimator typically has hyperparameters that can be tuned to control its bias and variance, which adds flexibility. Numerical experiments with various classification rules trained on synthetic data assess the finite-sample performance of several representative generalized resubstitution error estimators. In addition, results of an image classification experiment using the LeNet-5 convolutional neural network and the MNIST data set demonstrate the potential of this class of error estimators in deep learning for computer vision.
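To make the contrast between the standard empirical measure and a smoothed one concrete, the following is a minimal sketch, not the authors' implementation. It compares plain resubstitution with a Monte Carlo approximation of Gaussian-bolstered resubstitution, and the classifier is never re-trained. The nearest-mean classifier, the function names, the hand-picked kernel width `sigma`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def fit_nearest_mean(X, y):
    """Toy two-class classifier (illustrative assumption): assign each point to the closer class mean."""
    m0 = X[y == 0].mean(axis=0)
    m1 = X[y == 1].mean(axis=0)

    def predict(Z):
        d0 = np.sum((Z - m0) ** 2, axis=1)
        d1 = np.sum((Z - m1) ** 2, axis=1)
        return (d1 < d0).astype(int)

    return predict

def resubstitution_error(predict, X, y):
    """Plain resubstitution: error rate of the trained classifier on its own training data."""
    return float(np.mean(predict(X) != y))

def bolstered_resubstitution_error(predict, X, y, sigma=0.5, n_mc=200, seed=0):
    """Monte Carlo sketch of Gaussian-bolstered resubstitution.

    Each training point is spread into a spherical Gaussian kernel with
    standard deviation `sigma` (a hand-picked hyperparameter here), and the
    error is averaged over samples drawn from those kernels.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # n_mc perturbed copies of every training point, shape (n_mc, n, d)
    noise = rng.normal(scale=sigma, size=(n_mc, n, d))
    X_samples = (X[None, :, :] + noise).reshape(-1, d)
    y_samples = np.tile(y, n_mc)  # labels repeat in the same order as the copies
    return float(np.mean(predict(X_samples) != y_samples))

# Small synthetic two-class example (made-up data)
rng = np.random.default_rng(42)
X_train = np.vstack([rng.normal(-1.0, 1.0, (25, 2)), rng.normal(1.0, 1.0, (25, 2))])
y_train = np.array([0] * 25 + [1] * 25)

classifier = fit_nearest_mean(X_train, y_train)
print("plain resubstitution:    ", resubstitution_error(classifier, X_train, y_train))
print("bolstered resubstitution:", bolstered_resubstitution_error(classifier, X_train, y_train))
```

In this sketch, setting `sigma=0.0` collapses every kernel to a point mass, so the bolstered estimate equals plain resubstitution. Larger values of `sigma` smooth the estimate, which is one example of the bias and variance control that hyperparameters provide.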