Optimal Bayesian MMSE Estimation of the Coefficient of Determination for Discrete Prediction
The coefficient of determination (CoD) has significant applications in genomics, for example, in the inference of gene regulatory networks. In previous publications, we studied, from a frequentist perspective, several nonparametric CoD estimators based on the resubstitution, leave-one-out, cross-validation, and bootstrap error estimators, as well as one parametric maximum-likelihood (ML) CoD estimator that allows the incorporation of available prior knowledge. However, none of these CoD estimators is rigorously optimized based on statistical inference across a family of possible distributions. Therefore, following the idea of Bayesian error estimation for classification, we define a Bayesian CoD estimator that minimizes the mean-square error (MSE) over a parametrized family of joint distributions between predictors and target, where the parameters are random and characterized by assumed prior distributions. We derive an exact formulation of the sample-based Bayesian MMSE CoD estimator. Numerical experiments using Monte Carlo sampling estimate the performance metrics of the Bayesian CoD estimator and compare them against those of the resubstitution, leave-one-out, bootstrap, and cross-validation CoD estimators over all the distributions. Results show that the Bayesian CoD estimator has the best performance, displaying zero bias, small variance, and the smallest root-mean-square (RMS) error. © 2013 IEEE.
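As a rough illustration of the idea only, the sketch below approximates a posterior-mean CoD for a binary target and discrete predictor bins. It rests on assumptions that are not taken from the abstract: a Dirichlet prior over the cells of the joint distribution, the CoD written as (eps0 - eps) / eps0 with eps0 the error of the best constant predictor and eps the error of the optimal predictor using the predictors, and a Monte Carlo average over posterior draws in place of the exact formulation derived in the paper. The names `cod` and `bayesian_cod_mc` are hypothetical.

```python
import numpy as np

def cod(joint):
    """CoD of a joint pmf of shape (n_bins, 2): rows are predictor bins, columns are target values."""
    joint = np.asarray(joint, dtype=float)
    eps0 = joint.sum(axis=0).min()   # error of the best constant predictor (no predictors)
    eps = joint.min(axis=1).sum()    # error of the optimal predictor given the bin
    return 0.0 if eps0 == 0 else (eps0 - eps) / eps0

def bayesian_cod_mc(counts, alpha=1.0, n_draws=20000, seed=0):
    """Monte Carlo approximation of the posterior mean of the CoD.

    Assumes a symmetric Dirichlet(alpha) prior over all cells (an illustrative
    choice); the posterior given multinomial counts is Dirichlet(counts + alpha).
    """
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts, dtype=float)
    post = (counts + alpha).ravel()
    draws = rng.dirichlet(post, size=n_draws).reshape(n_draws, *counts.shape)
    eps0 = draws.sum(axis=1).min(axis=1)   # per-draw error without predictors
    eps = draws.min(axis=2).sum(axis=1)    # per-draw error with predictors
    return float(np.mean((eps0 - eps) / eps0))

# Hypothetical counts: rows are predictor bins, columns are target 0 / target 1.
strong = bayesian_cod_mc([[20, 0], [0, 20]])
weak = bayesian_cod_mc([[10, 10], [10, 10]])
print(strong, weak)
```

The posterior mean is used because, under squared-error loss, the conditional expectation given the sample is the estimator that minimizes the MSE; the Monte Carlo average here only approximates that expectation.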
name of conference
2013 IEEE International Workshop on Genomic Signal Processing and Statistics