Bayesian data integration and variable selection for pan-cancer survival prediction using protein expression data.

abstract

Accurate prognostic prediction from molecular information is a challenging area of research and is essential to the development of precision medicine. In this paper, we develop translational models to identify major actionable proteins that are associated with clinical outcomes, such as patient survival time. The high dimensionality of the problem poses considerable statistical and computational challenges. Furthermore, data are available for different tumor types; hence, data integration across tumors is desirable. Censored survival outcomes add a further level of complexity to the inferential procedure. We develop Bayesian hierarchical survival models that accommodate all of these challenges. We use a hierarchical Bayesian accelerated failure time (AFT) model for survival regression, and we place a sparse horseshoe prior on the regression coefficients to identify the major proteomic drivers. We borrow strength across tumor groups by introducing a correlation structure among the prior distributions. The proposed methods are used to analyze data from the recently curated "The Cancer Proteome Atlas" (TCPA), which contains high-quality protein expression data based on reverse-phase protein arrays, as well as detailed clinical annotation, including survival times. Our simulation and the TCPA data analysis illustrate the efficacy of the proposed integrative model, which links different tumors through correlated prior structures.

authors

Mallick, Bani

published proceedings

Biometrics

author list (cited authors)

Maity, A. K., Bhattacharya, A., Mallick, B. K., & Baladandayuthapani, V.

citation count

18

complete list of authors

Maity, Arnab Kumar||Bhattacharya, Anirban||Mallick, Bani K||Baladandayuthapani, Veerabhadran

publication date

March 2020

publisher

Oxford University Press (OUP)

published in

Biometrics

keywords

AFT Regression
Bayes Theorem
Biometry
Borrowing Strength
Computer Simulation
Data Interpretation, Statistical
Horseshoe
Humans
Kidney Neoplasms
Markov Chains
Models, Statistical
Monte Carlo Method
Neoplasms
Pan-cancer Model
Prognosis
Protein Array Analysis
Proteome
Proteomics
Survival Analysis
TCPA

Digital Object Identifier (DOI)

10.1111/biom.13132

start page

316

end page

325

volume

76

issue

1

URL

http://dx.doi.org/10.1111/biom.13132

user-defined tag

3 Good Health and Well-Being
