A criterion for choosing between full-sample and hold-out classifier design Conference Paper

abstract

  • Is it better to design a classifier and estimate its error on the full sample or to design a classifier on a training subset and estimate its error on the hold-out test subset? Full-sample design provides the better classifier; nevertheless, one might choose hold-out with the hope of better error estimation. A conservative criterion to decide the best course is to aim at a classifier whose error is less than a given bound. Then the choice between full-sample and hold-out design depends on which possesses the smaller expected bound. Using this criterion, we examine the choice between hold-out and several full-sample error estimators using covariance models. The relation between the two designs is revealed via a decomposition of the expected bound into the sum of the expected true error and the expected conditional standard deviation of the true error. © 2008 IEEE.
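
    The decomposition named in the abstract can be written out as a rough sketch. This assumes, since the abstract does not give the exact form, that the bound attached to an error estimate is the conditional mean of the true error plus its conditional standard deviation. The symbols $\varepsilon$ (true error), $\hat{\varepsilon}$ (estimated error), and $B$ (bound) are notation introduced here for illustration and are not taken from the paper.

    ```latex
    % Assumed form of the bound, given the observed error estimate
    B(\hat{\varepsilon}) = \mathrm{E}[\varepsilon \mid \hat{\varepsilon}]
        + \sqrt{\mathrm{Var}[\varepsilon \mid \hat{\varepsilon}]}

    % Expectation over \hat{\varepsilon}; the first term follows from the tower property
    \mathrm{E}[B(\hat{\varepsilon})] = \mathrm{E}[\varepsilon]
        + \mathrm{E}\!\left[\sqrt{\mathrm{Var}[\varepsilon \mid \hat{\varepsilon}]}\right]
    ```

    Under this reading, full-sample design lowers the first term (a better classifier), hold-out is chosen in the hope of lowering the second (a more precise error estimate), and the criterion favors whichever design yields the smaller sum.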

name of conference

  • 2008 IEEE International Workshop on Genomic Signal Processing and Statistics

published proceedings

  • 2008 IEEE INTERNATIONAL WORKSHOP ON GENOMIC SIGNAL PROCESSING AND STATISTICS

author list (cited authors)

  • Brun, M., Xu, Q., & Dougherty, E. R.

citation count

  • 2

complete list of authors

  • Brun, Marcel; Xu, Qian; Dougherty, Edward R.

publication date

  • June 2008