Small-Sample Error Estimation for Bagged Classification Rules

abstract

The application of ensemble classification rules in genomics and proteomics has become increasingly common. However, error estimation for these classification rules, particularly for bagging under the small-sample settings prevalent in genomics and proteomics, is not well understood. Breiman proposed the "out-of-bag" method for estimating statistics of bagged classifiers, which other authors subsequently applied to estimate the classification error. In this paper, we give an explicit definition of the out-of-bag estimator that is intended to remove estimator bias by carefully formulating how the error count is normalized. We also report the results of an extensive simulation study of bagging of common classification rules, including LDA, 3NN, and CART, applied to both synthetic and real patient data, using common error estimators such as resubstitution, leave-one-out, cross-validation, basic bootstrap, bootstrap 632, bootstrap 632 plus, bolstering, and semi-bolstering, in addition to the out-of-bag estimator. The numerical experiments indicate that the performance of the out-of-bag estimator is very similar to that of leave-one-out; in particular, the out-of-bag estimator is slightly pessimistically biased. The performance of the other estimators is consistent with their performance with the corresponding single classifiers, as reported in other studies. Copyright 2010 T. T. Vu and U. M. Braga-Neto.
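
To make the out-of-bag idea concrete, the following is a minimal sketch in Python (standard library only) and not the paper's definition. It assumes a bagged 3NN rule, a majority vote over only those ensemble members whose bootstrap sample excluded the point, and normalization of the error count by the number of points that received at least one out-of-bag vote. That normalization is an assumption made for illustration, since the abstract does not state the paper's exact formula. The names `knn_predict` and `oob_error` and the toy data are hypothetical.

```python
import random
from collections import Counter


def knn_predict(train, x, k=3):
    """Classify x by majority vote among its k nearest points in train.

    train is a list of (features, label) pairs; distance is squared Euclidean.
    """
    nearest = sorted(
        train, key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x))
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def oob_error(data, n_bags=50, k=3, seed=0):
    """Illustrative out-of-bag error estimate for a bagged kNN rule.

    Each point is voted on only by ensemble members whose bootstrap
    sample did not contain it.
    """
    rng = random.Random(seed)
    n = len(data)
    votes = [Counter() for _ in range(n)]
    for _ in range(n_bags):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        in_bag = set(idx)
        bag = [data[i] for i in idx]
        for i in range(n):
            if i not in in_bag:  # point i is out-of-bag for this member
                votes[i][knn_predict(bag, data[i][0], k)] += 1
    # Assumed normalization: count only points with at least one OOB vote.
    counted = [i for i in range(n) if votes[i]]
    if not counted:
        return None
    # Ties in the ensemble vote go to the label that was voted for first.
    errors = sum(votes[i].most_common(1)[0][0] != data[i][1] for i in counted)
    return errors / len(counted)


# Toy data: two well-separated classes with ten points each.
data = [((0.1 * i, 0.1 * i), 0) for i in range(10)]
data += [((10 + 0.1 * i, 10 + 0.1 * i), 1) for i in range(10)]
print(oob_error(data))
```

Other normalizations are possible, for example dividing by the full sample size, and the choice affects the bias of the estimate in small samples.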

authors

Braga Neto, Ulisses

published proceedings

EURASIP Journal on Advances in Signal Processing

author list (cited authors)

Vu, T. T., & Braga-Neto, U. M.

citation count

1

complete list of authors

Vu, TT; Braga-Neto, UM

publication date

December 2010

publisher

Springer Nature

published in

EURASIP Journal on Advances in Signal Processing

keywords

46 Information And Computing Sciences
4603 Computer Vision And Multimedia Computation

Digital Object Identifier (DOI)

10.1155/2010/548906

URI

https://hdl.handle.net/1969.1/180777

start page

548906

volume

2010

issue

1

URL

http://dx.doi.org/10.1155/2010/548906
