Small-sample precision of ROC-related estimates.

abstract

MOTIVATION: The receiver operator characteristic (ROC) curves are commonly used in biomedical applications to judge the performance of a discriminant across varying decision thresholds. The estimated ROC curve depends on the true positive rate (TPR) and false positive rate (FPR), with the key metric being the area under the curve (AUC). With small samples these rates need to be estimated from the training data, so a natural question arises: How well do the estimates of the AUC, TPR and FPR compare with the true metrics?
RESULTS: Through a simulation study using data models and analysis of real microarray data, we show that (i) for small samples the root mean square differences of the estimated and true metrics are considerable; (ii) even for large samples, there is only weak correlation between the true and estimated metrics; and (iii) generally, there is weak regression of the true metric on the estimated metric. For classification rules, we consider linear discriminant analysis, linear support vector machine (SVM) and radial basis function SVM. For error estimation, we consider resubstitution, three kinds of cross-validation and bootstrap. Using resampling, we show the unreliability of some published ROC results.
AVAILABILITY: Companion web site at http://compbio.tgen.org/paper_supp/ROC/roc.html
CONTACT: edward@mail.ece.tamu.edu
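The comparison the abstract describes, an AUC estimated from a small training sample versus the true AUC of the same trained discriminant, can be illustrated with a minimal sketch. This is not the authors' protocol: the Gaussian data model, the parameter values, the function names (`auc`, `fit_lda`, `simulate`) and the use of a large independent test set as a stand-in for the true AUC are all assumptions made for illustration. Only resubstitution is shown as the estimator.

```python
import numpy as np


def auc(scores_pos, scores_neg):
    """Empirical AUC: share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    diff = scores_pos[:, None] - scores_neg[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))


def fit_lda(x0, x1):
    """Weight vector of a linear discriminant using the pooled covariance."""
    n0, n1 = len(x0), len(x1)
    pooled = (np.cov(x0, rowvar=False) * (n0 - 1)
              + np.cov(x1, rowvar=False) * (n1 - 1)) / (n0 + n1 - 2)
    return np.linalg.solve(pooled, x1.mean(axis=0) - x0.mean(axis=0))


def simulate(n_per_class=10, dim=5, shift=0.5, n_trials=200,
             n_test=1000, seed=0):
    """Compare resubstitution AUC with an approximate true AUC.

    Assumed model (hypothetical, not taken from the paper): two Gaussian
    classes with identity covariance whose means differ by `shift` in
    every dimension.
    """
    rng = np.random.default_rng(seed)
    mu1 = np.full(dim, shift)
    # A large independent sample approximates the true AUC of each rule.
    t0 = rng.standard_normal((n_test, dim))
    t1 = rng.standard_normal((n_test, dim)) + mu1
    est, true = [], []
    for _ in range(n_trials):
        x0 = rng.standard_normal((n_per_class, dim))
        x1 = rng.standard_normal((n_per_class, dim)) + mu1
        w = fit_lda(x0, x1)
        est.append(auc(x1 @ w, x0 @ w))   # resubstitution estimate
        true.append(auc(t1 @ w, t0 @ w))  # approximate true AUC
    est, true = np.array(est), np.array(true)
    rms = float(np.sqrt(np.mean((est - true) ** 2)))
    corr = float(np.corrcoef(est, true)[0, 1])
    return rms, corr, float(est.mean()), float(true.mean())


if __name__ == "__main__":
    rms, corr, mean_est, mean_true = simulate()
    print(f"RMS difference: {rms:.3f}, correlation: {corr:.3f}")
    print(f"mean estimated AUC: {mean_est:.3f}, mean true AUC: {mean_true:.3f}")
```

Changing `n_per_class` shows how the root mean square difference and the correlation between estimated and true AUC behave as the sample grows under this assumed model. Cross-validation or bootstrap estimators would replace the resubstitution line.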

authors

Dougherty, Edward

published proceedings

Bioinformatics

altmetric score

15.33

author list (cited authors)

Hanczar, B., Hua, J., Sima, C., Weinstein, J., Bittner, M., & Dougherty, E. R.

citation count

215

complete list of authors

Hanczar, Blaise||Hua, Jianping||Sima, Chao||Weinstein, John||Bittner, Michael||Dougherty, Edward R

publication date

March 2010

publisher

Oxford University Press (OUP)

published in

Bioinformatics

keywords

Algorithms
False Positive Reactions
Oligonucleotide Array Sequence Analysis
Pattern Recognition, Automated
ROC Curve

PubMed ID

20130029

Digital Object Identifier (DOI)

10.1093/bioinformatics/btq037

start page

822

end page

830

volume

26

issue

6

URL

http://dx.doi.org/10.1093/bioinformatics/btq037
