Optimal convex error estimators for classification (Academic Article)

abstract

  • A cross-validation error estimator is obtained by repeatedly leaving out some data points, designing classifiers on the remaining points, computing the errors of these classifiers on the left-out points, and averaging these errors. The 0.632 bootstrap estimator is obtained by averaging the errors of classifiers designed from points drawn with replacement, each error being computed on the points not drawn, and then taking a convex combination of this "zero bootstrap" error with the resubstitution error of the designed classifier. This yields a convex combination of the low-biased resubstitution estimator and the high-biased zero-bootstrap estimator. Another convex error estimator suggested in the literature is the unweighted average of resubstitution and cross-validation. This paper treats the following question: given a feature-label distribution and a classification rule, what is the optimal convex combination of two error estimators, that is, what are the optimal weights for the convex combination? The problem is addressed by finding the weights that minimize the mean-square error (MSE) of the convex estimator. Optimality is also considered under the constraint that the resulting estimator be unbiased. Owing to the large number of results arising from the various feature-label models and error estimators, a portion of the results is presented herein and the main body of results appears on a companion website. In the tabulated results, each table treats the classification rules considered for a model, various Bayes errors, and various sample sizes. Each table includes the optimal weights, the mean errors and standard deviations of the relevant error measures, and the MSE and mean-absolute error (MAE) of the optimal convex estimator. Many observations can be made from the full set of experiments, and some general trends are outlined in the paper. The general conclusion is that optimizing the weights of a convex estimator can provide substantial improvement, depending on the classification rule, data model, sample size and component estimators. Optimal convex bootstrap estimators are applied to feature-set ranking to illustrate their potential advantage over non-optimized convex estimators. © 2006 Pattern Recognition Society.
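
  For orientation, a minimal sketch of the convex-estimator formulation described in the abstract above; the notation (true error \varepsilon, component estimators \hat{\varepsilon}_A and \hat{\varepsilon}_B, weight w) is assumed here for illustration rather than taken from the paper. The convex estimator is

  \[
  \hat{\varepsilon}_w \;=\; w\,\hat{\varepsilon}_A + (1-w)\,\hat{\varepsilon}_B, \qquad 0 \le w \le 1,
  \]

  of which the 0.632 bootstrap is the fixed-weight special case

  \[
  \hat{\varepsilon}_{0.632} \;=\; 0.368\,\hat{\varepsilon}_{\mathrm{resub}} + 0.632\,\hat{\varepsilon}_{\mathrm{zero}}.
  \]

  Writing the deviations from the true error as d_A = \hat{\varepsilon}_A - \varepsilon and d_B = \hat{\varepsilon}_B - \varepsilon, the mean-square error of the convex estimator is

  \[
  \mathrm{MSE}(w) \;=\; E\!\left[\bigl(w\,d_A + (1-w)\,d_B\bigr)^2\right],
  \]

  which is minimized at

  \[
  w^{*} \;=\; \frac{E[d_B^2] - E[d_A d_B]}{E[d_A^2] - 2\,E[d_A d_B] + E[d_B^2]}
  \]

  (clipped to [0,1] if the unconstrained minimizer falls outside the interval). Under the additional constraint that the convex estimator be unbiased, w\,E[d_A] + (1-w)\,E[d_B] = 0, the weight becomes

  \[
  w^{*} \;=\; \frac{E[d_B]}{E[d_B] - E[d_A]}.
  \]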

published proceedings

  • PATTERN RECOGNITION

author list (cited authors)

  • Sima, C., & Dougherty, E. R.

citation count

  • 12

complete list of authors

  • Sima, Chao; Dougherty, Edward R.

publication date

  • September 2006