Confidence intervals for the true classification error conditioned on the estimated error.

abstract

Bias and variance for small-sample error estimation are typically posed in terms of statistics for the distributions of the true and estimated errors. On the other hand, a salient practical issue asks, given an error estimate, what can be said about the true error? This question relates to the joint distribution of the true and estimated errors, specifically, the conditional expectation of the true error given the error estimate. A critical issue is that of confidence bounds for the true error given the estimate. We consider the joint distribution of the true error and the estimated error, assuming a random feature-label distribution. From it, we derive the marginal distributions, the conditional expectation of the estimated error given the true error, the conditional expectation of the true error given the estimated error, the conditional variance of the true error given the estimated error, and the 95% upper confidence bound for the true error given the estimated error. Numerous classification and estimation rules are considered across a number of models. Massive simulation is used for continuous models and analytic results are derived for discrete classification. We also consider a breast-cancer study to illustrate how the theory might be applied in practice. Although specific results depend on the classification rule, error-estimation rule, and model, some general trends are seen: (I) if the true error is small (large), then the conditional estimated error is generally high (low)-biased; (II) the conditional expected true error tends to be larger (smaller) than the estimated error for small (large) estimated errors; and (III) the confidence bounds tend to be well above the estimated error for low error estimates, becoming much less so for large estimates.
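The quantities named in the abstract can be illustrated with a small Monte Carlo sketch. It is not the authors' procedure. It assumes a hypothetical one-dimensional Gaussian model with a randomly drawn class separation, a nearest-mean threshold classifier, and resubstitution as the error estimator. It simulates pairs of (true error, estimated error), groups them by the estimated error, and reports the conditional mean of the true error and an empirical 95% upper bound. All function names and parameter values are illustrative assumptions.

```python
import math
import random
from collections import defaultdict


def normal_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def simulate_pair(rng, n_per_class=10):
    """Draw one (true_error, estimated_error) pair under the assumed toy model."""
    # Assumed random feature-label distribution: class means at -d and +d, unit variance.
    d = rng.uniform(0.2, 1.5)
    x0 = [rng.gauss(-d, 1.0) for _ in range(n_per_class)]
    x1 = [rng.gauss(+d, 1.0) for _ in range(n_per_class)]
    m0 = sum(x0) / n_per_class
    m1 = sum(x1) / n_per_class
    t = 0.5 * (m0 + m1)                # threshold halfway between the sample means
    sign = 1.0 if m1 >= m0 else -1.0   # class 1 is predicted when sign * (x - t) > 0
    # True error of the designed classifier (equal priors), in closed form.
    true_err = 0.5 * (normal_cdf(-sign * (t + d)) + normal_cdf(sign * (t - d)))
    # Resubstitution estimate: error rate on the training sample itself.
    wrong = sum(1 for x in x0 if sign * (x - t) > 0)
    wrong += sum(1 for x in x1 if sign * (x - t) <= 0)
    est_err = wrong / (2 * n_per_class)
    return true_err, est_err


def conditional_summary(pairs, level=0.95):
    """Map each estimated error to (mean true error, empirical upper bound, count)."""
    groups = defaultdict(list)
    for true_err, est_err in pairs:
        groups[est_err].append(true_err)
    summary = {}
    for est_err, vals in sorted(groups.items()):
        vals.sort()
        mean = sum(vals) / len(vals)
        # Empirical quantile: smallest value with at least `level` of the mass at or below it.
        idx = min(len(vals) - 1, math.ceil(level * len(vals)) - 1)
        summary[est_err] = (mean, vals[idx], len(vals))
    return summary


rng = random.Random(0)
pairs = [simulate_pair(rng) for _ in range(5000)]
summary = conditional_summary(pairs)
for est_err, (mean, upper, count) in summary.items():
    print(f"estimate={est_err:.2f}  E[true|est]={mean:.3f}  95% bound={upper:.3f}  n={count}")
```

Grouping on the exact estimate works here because resubstitution on 20 points takes only a few discrete values. A continuous estimator would need binning instead. Groups with few samples give unreliable bounds, so the count is reported alongside each row.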

published proceedings

Technol Cancer Res Treat

altmetric score

3

author list (cited authors)

Xu, Q., Hua, J., Braga-Neto, U., Xiong, Z., Suh, E., & Dougherty, E. R.

citation count

18

complete list of authors

Xu, Qian; Hua, Jianping; Braga-Neto, Ulisses; Xiong, Zixiang; Suh, Edward; Dougherty, Edward R.

publication date

December 2006

publisher

SAGE Publications

published in

Technology in Cancer Research and Treatment
