scTenifoldKnk: a machine learning workflow performing virtual knockout experiments on single-cell gene regulatory networks (Academic Article)

abstract

  • Gene knockout (KO) experiments are a proven approach for studying gene function. A typical KO experiment involves the phenotypic characterization of KO organisms. The recent advent of single-cell technology has greatly boosted the resolution of cellular phenotyping, providing unprecedented insights into cell-type-specific gene function. However, using single-cell technology in large-scale, systematic KO experiments is prohibitively costly because of the vast resources required. Here we present scTenifoldKnk, a machine learning workflow that performs virtual KO experiments using single-cell RNA sequencing (scRNA-seq) data. scTenifoldKnk first uses data from wild-type (WT) samples to construct a single-cell gene regulatory network (scGRN). Then, a gene is knocked out from the constructed scGRN by setting the weights of the gene's outward edges to zero. scTenifoldKnk then compares this "pseudo-KO" scGRN with the original scGRN to identify differentially regulated (DR) genes. These DR genes, also called virtual-KO perturbed genes, are used to assess the impact of the gene KO and reveal the gene's function in the analyzed cells. Using existing data sets, we demonstrate that the scTenifoldKnk analysis recapitulates the main findings of three real-animal KO experiments and confirms the functions of genes underlying three Mendelian diseases. We also show that scTenifoldKnk can predict the outcomes of two KO experiments, one involving intestinal enterocytes in Ahr-/- mice and the other involving pancreatic islet cells in Malat1-/- mice. Finally, we demonstrate the use of scTenifoldKnk to perform systematic KO analyses, in which a large number of genes are virtually deleted, allowing gene functions to be revealed in a cell-type-specific manner.

highlights

  • scTenifoldKnk is a machine learning workflow to perform virtual KO experiments using data from single-cell RNA sequencing (scRNA-seq).
  • scTenifoldKnk requires only one input matrix: a gene-by-cell expression matrix obtained by scRNA-seq from wild-type (WT) samples.
  • scTenifoldKnk constructs a single-cell gene regulatory network (scGRN) from scRNA-seq data of the WT sample, and then produces a pseudo-KO scGRN by setting the KO target gene representation in the WT scGRN to zero.
  • scTenifoldKnk compares the WT scGRN and the pseudo-KO scGRN using a quasi-manifold alignment method, to reveal the perturbation effect of gene KO and generate a perturbation profile for the KO gene.
  • Using real-data examples, we show that scTenifoldKnk is a powerful and effective approach for performing virtual KO experiments with scRNA-seq data to elucidate gene function.

summary

  • Gene knockout (KO) experiments are a proven approach for studying gene function. However, large-scale, systematic KO experiments are prohibitive because experimental and animal resources are limited. Here we present scTenifoldKnk, a machine learning workflow for performing virtual KO experiments with data from single-cell RNA sequencing (scRNA-seq). We show that the scTenifoldKnk virtual KO analysis recapitulates findings from real-animal KO experiments and can be used to predict outcomes from real-animal KO experiments. scTenifoldKnk is a powerful and efficient virtual KO tool for gene function study, allowing a large number of genes to be deleted systematically, one at a time, in scRNA-seq data to reveal each gene's function in a cell-type-specific manner.
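The knockout step described in the abstract (setting the weights of a gene's outward edges to zero and comparing the result with the original network) can be illustrated with a small sketch. This is not the scTenifoldKnk implementation: the toy weight matrix, the function names `virtual_knockout` and `perturbation_scores`, and the convention that `W[i, j]` is the weight of the edge from gene `i` to gene `j` are all assumptions made for illustration. In particular, the comparison below is a plain per-gene Euclidean distance on incoming edge weights, used here as a simple stand-in for the quasi-manifold alignment named in the highlights.

```python
import numpy as np


def virtual_knockout(W, ko_index):
    """Return a copy of W with the outward edges of one gene set to zero.

    Assumed convention (hypothetical): W[i, j] is the weight of the
    directed edge from gene i to gene j, so row ko_index holds the
    outward edges of the knocked-out gene.
    """
    W_ko = np.asarray(W, dtype=float).copy()
    W_ko[ko_index, :] = 0.0
    return W_ko


def perturbation_scores(W, W_ko):
    """Score each gene by how much its incoming edge weights changed.

    Simplified stand-in for the network comparison step: the Euclidean
    norm of the difference between column j of W and column j of W_ko.
    """
    diff = np.asarray(W, dtype=float) - np.asarray(W_ko, dtype=float)
    return np.linalg.norm(diff, axis=0)


# Toy 4-gene network with made-up weights (not real data).
genes = ["A", "B", "C", "D"]
W = np.array([
    [0.0, 0.8, 0.0, 0.1],  # A regulates B strongly and D weakly
    [0.0, 0.0, 0.5, 0.0],  # B regulates C
    [0.3, 0.0, 0.0, 0.0],  # C regulates A
    [0.0, 0.0, 0.2, 0.0],  # D regulates C
])

W_ko = virtual_knockout(W, genes.index("A"))
scores = perturbation_scores(W, W_ko)

# Rank genes from most to least perturbed; a stable sort keeps ties in input order.
ranked = [genes[i] for i in np.argsort(-scores, kind="stable")]
```

In this toy case only the direct targets of the knocked-out gene receive a non-zero score, because the sketch compares edge weights directly. The workflow described in the abstract compares the two networks as a whole, which is what lets it report differentially regulated genes beyond such a trivial readout.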

altmetric score

  • 10.1

author list (cited authors)

  • Osorio, D., Zhong, Y., Li, G., Xu, Q., Hillhouse, A., Chen, J., ... Cai, J. J.

citation count

  • 2

complete list of authors

  • Osorio, Daniel||Zhong, Yan||Li, Guanxun||Xu, Qian||Yang, Yongjian||Tian, Yanan||Chapkin, Robert S||Huang, Jianhua Z||Cai, James J

publication date

  • March 2021