Is there correlation between the estimated and true classification errors in small-sample settings? Conference Paper

abstract

  • The validity of a classifier model, consisting of a trained classifier and its estimated error, depends upon the relationship between the estimated and true errors of the classifier. Absent a good error estimation rule, the classifier-error model lacks scientific meaning. This paper demonstrates that in high-dimensionality feature selection settings in the context of small samples there can be virtually no correlation between the true and estimated errors. This conclusion has serious ramifications in the domain of high-throughput genomic classification, such as gene-expression classification, where the number of potential features (gene expressions) is usually in the tens of thousands and the number of sample points (microarrays) is often under one hundred. © 2007 IEEE.
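
The kind of experiment the abstract describes can be sketched as a small simulation. This is only an illustration under assumptions that do not come from the paper: a synthetic Gaussian model, a filter that ranks features by class-mean difference, a nearest-mean classifier, leave-one-out as the error estimator, and a large independent test set standing in for the true error. All function names and parameter values are hypothetical.

```python
import math
import random

def make_sample(rng, n_per_class, n_features, n_informative, shift):
    """Draw labelled Gaussian points; only the first n_informative features differ by class."""
    points = []
    for label in (0, 1):
        for _ in range(n_per_class):
            x = [rng.gauss(shift * label if j < n_informative else 0.0, 1.0)
                 for j in range(n_features)]
            points.append((x, label))
    return points

def class_means(points, features):
    """Per-class mean of each listed feature."""
    means = {}
    for label in (0, 1):
        rows = [x for x, y in points if y == label]
        means[label] = [sum(r[j] for r in rows) / len(rows) for j in features]
    return means

def train(points, n_selected):
    """Keep the features with the largest class-mean gap, then fit a nearest-mean rule."""
    n_features = len(points[0][0])
    gaps = class_means(points, range(n_features))
    ranked = sorted(range(n_features),
                    key=lambda j: abs(gaps[1][j] - gaps[0][j]), reverse=True)
    features = ranked[:n_selected]
    return features, class_means(points, features)

def predict(model, x):
    features, means = model
    dist = {label: sum((x[j] - m) ** 2 for j, m in zip(features, means[label]))
            for label in (0, 1)}
    return 0 if dist[0] <= dist[1] else 1

def error_rate(model, points):
    return sum(predict(model, x) != y for x, y in points) / len(points)

def loo_error(points, n_selected):
    """Leave-one-out estimate, repeating feature selection inside every fold."""
    wrong = 0
    for i, (x, y) in enumerate(points):
        model = train(points[:i] + points[i + 1:], n_selected)
        wrong += predict(model, x) != y
    return wrong / len(points)

def pearson(a, b):
    """Sample correlation coefficient; NaN if either series is constant."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else float("nan")

def simulate(trials=20, seed=0):
    """Return paired (estimated, true) errors over independent small samples."""
    rng = random.Random(seed)
    estimated, true = [], []
    for _ in range(trials):
        sample = make_sample(rng, 10, 100, 5, 0.8)   # 20 points, 100 features
        model = train(sample, 5)
        estimated.append(loo_error(sample, 5))
        # Error on a large fresh sample approximates the true error of this classifier.
        true.append(error_rate(model, make_sample(rng, 200, 100, 5, 0.8)))
    return estimated, true

if __name__ == "__main__":
    est, tru = simulate()
    print("correlation between estimated and true error:", pearson(est, tru))
```

Each trial yields one (estimated, true) pair for a classifier designed on 20 points, and the printed coefficient summarizes how well the estimate tracks the true error across trials. The value depends on the assumed model and seed and should not be read as a reproduction of the paper's results.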

name of conference

  • 2007 IEEE/SP 14th Workshop on Statistical Signal Processing

published proceedings

  • 2007 IEEE/SP 14TH WORKSHOP ON STATISTICAL SIGNAL PROCESSING, VOLS 1 AND 2

author list (cited authors)

  • Ranczar, B., Hua, B. J., & Dougherty, E. R.

citation count

  • 0

complete list of authors

  • Ranczar, Blaise||Hua, B Jianping||Dougherty, Edward R

publication date

  • August 2007