Constructing Pathway-Based Priors within a Gaussian Mixture Model for Bayesian Regression and Classification. Academic Article

abstract

Gene-expression-based classification and regression are major concerns in translational genomics. If the feature-label distribution is known, then an optimal classifier can be derived. If the predictor-target distribution is known, then an optimal regression function can be derived. In practice, neither is known, data must be employed, and, for small samples, prior knowledge concerning the feature-label or predictor-target distribution can be used in the learning process. Optimal Bayesian classification and optimal Bayesian regression provide optimality under uncertainty. With optimal Bayesian classification (or regression), uncertainty is treated directly on the feature-label (or predictor-target) distribution. The fundamental engineering problem is prior construction. The Regularized Expected Mean Log-Likelihood Prior (REMLP) utilizes pathway information and provides viable priors for the feature-label distribution, assuming that the training data contain labels. In practice, the labels may not be observed. This paper extends the REMLP methodology to a Gaussian mixture model (GMM) when the labels are unknown. Prior construction bundled with prior update via Bayesian sampling results in Monte Carlo approximations to the optimal Bayesian regression function and optimal Bayesian classifier. Simulations demonstrate that the GMM REMLP prior yields better performance than the EM algorithm for small data sets. We apply it to phenotype classification when the prior knowledge consists of colon cancer pathways.
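The sampling idea in the abstract (a prior over mixture parameters, updated by Bayesian sampling when labels are unobserved, then averaged to classify) can be illustrated with a minimal sketch. This is not the REMLP construction or the authors' implementation. It assumes a univariate two-component mixture with known, equal variances and equal mixing weights, and uses a plain conjugate normal prior on each component mean as a stand-in for a pathway-derived prior. The function names `gibbs_mixture_means`, `prob_class1`, and `predict_class1` are hypothetical.

```python
import math
import random


def prob_class1(x, mu, noise_var=1.0):
    """P(label = 1 | x, means) for two equally weighted Gaussian components."""
    # Log-likelihood ratio of component 1 over component 0.
    diff = ((x - mu[0]) ** 2 - (x - mu[1]) ** 2) / (2.0 * noise_var)
    diff = max(-700.0, min(700.0, diff))  # guard math.exp against overflow
    return 1.0 / (1.0 + math.exp(-diff))


def gibbs_mixture_means(x, prior_means, prior_var=1.0, noise_var=1.0,
                        n_iter=500, burn_in=100, seed=0):
    """Gibbs sampler for the two component means when labels are unobserved.

    Assumption: each mean has a Normal(prior_means[k], prior_var) prior,
    a simple stand-in for an informative prior built from prior knowledge.
    """
    rng = random.Random(seed)
    mu = list(prior_means)
    draws = []
    for it in range(n_iter):
        # Step 1: sample each hidden label given the current means.
        sums, counts = [0.0, 0.0], [0, 0]
        for xi in x:
            k = 1 if rng.random() < prob_class1(xi, mu, noise_var) else 0
            sums[k] += xi
            counts[k] += 1
        # Step 2: sample each mean from its conjugate normal posterior.
        for k in (0, 1):
            post_var = 1.0 / (1.0 / prior_var + counts[k] / noise_var)
            post_mean = post_var * (prior_means[k] / prior_var
                                    + sums[k] / noise_var)
            mu[k] = rng.gauss(post_mean, math.sqrt(post_var))
        if it >= burn_in:
            draws.append(tuple(mu))
    return draws


def predict_class1(x_new, draws, noise_var=1.0):
    """Monte Carlo estimate of P(label = 1 | x_new, data): average over draws."""
    return sum(prob_class1(x_new, mu, noise_var) for mu in draws) / len(draws)
```

A point would then be assigned to class 1 when `predict_class1` exceeds 0.5. With few unlabeled samples, the prior means keep the two components anchored, which is the intuition behind preferring an informative prior over a purely data-driven fit in the small-sample regime.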

published proceedings

IEEE/ACM Trans Comput Biol Bioinform

altmetric score

1

author list (cited authors)

Boluki, S., Esfahani, M. S., Qian, X., & Dougherty, E. R.

citation count

26

complete list of authors

Boluki, Shahin; Esfahani, Mohammad Shahrokh; Qian, Xiaoning; Dougherty, Edward R.

publication date

March 2019

publisher

Institute of Electrical and Electronics Engineers (IEEE)

published in

IEEE/ACM Transactions on Computational Biology and Bioinformatics