Interpretable modeling of time-resolved single-cell gene-protein expression with CrossmodalNet.

abstract

Cell-surface proteins play a critical role in cell function and are primary targets for therapeutics. CITE-seq is a single-cell technique that enables simultaneous measurement of gene and surface protein expression. It is powerful but costly and technically challenging. Computational methods have been developed to predict surface protein expression from gene expression information, such as single-cell RNA sequencing (scRNA-seq) data. Existing methods, however, are computationally demanding and lack the interpretability to reveal underlying biological processes. We propose CrossmodalNet, an interpretable machine learning model, to predict surface protein expression from scRNA-seq data. Our model, with a customized adaptive loss, accurately predicts surface protein abundances. When samples from multiple time points are given, our model encodes temporal information into an easy-to-interpret time embedding to make predictions in a time-point-specific manner, and is able to uncover noise-free causal gene-protein relationships. Using three publicly available time-resolved CITE-seq data sets, we validate the performance of our model by comparing it with benchmarking methods and evaluate its interpretability. Together, we show that our method accurately and interpretably profiles surface protein expression using scRNA-seq data, thereby expanding the capacity of CITE-seq experiments for investigating molecular mechanisms involving surface proteins.

authors

Cai, James

published proceedings

Brief Bioinform

author list (cited authors)

Yang, Y., Lin, Y., Li, G., Zhong, Y., Xu, Q., & Cai, J. J.

complete list of authors

Yang, Yongjian||Lin, Yu-Te||Li, Guanxun||Zhong, Yan||Xu, Qian||Cai, James J

publication date

September 2023

publisher

Oxford University Press (OUP)

published in

Briefings in Bioinformatics

keywords

Algorithms
Gene Expression Profiling
Gene–protein Relationship
Interpretable Machine Learning
Membrane Proteins
Multimodal Single-cell
scRNA-seq
Sequence Analysis, RNA
Single-Cell Analysis

PubMed ID

37798250

Digital Object Identifier (DOI)

10.1093/bib/bbad342

start page

bbad342

volume

24

issue

6

URL

http://dx.doi.org/10.1093/bib/bbad342

Interpretable modeling of time-resolved single-cell gene-protein expression with CrossmodalNet. Academic Article
