SparseNCA: Sparse Network Component Analysis for Recovering Transcription Factor Activities with Incomplete Prior Information.
Network component analysis (NCA) is an important method for inferring transcriptional regulatory networks (TRNs) and recovering transcription factor activities (TFAs) from gene expression data and prior information about the connectivity matrix. Currently available algorithms depend crucially on the completeness of this prior information. However, inaccuracies in the measurement process may leave the available knowledge about the connectivity matrix incomplete. Hence, computationally efficient algorithms are needed that can cope with possible incompleteness in the available data. We present a sparse network component analysis algorithm (sparseNCA), which accounts for this incompleteness in the estimation of TRNs by imposing an additional norm-based sparsity constraint, resulting in greater estimation accuracy. To improve computational efficiency, an iterative re-weighted method is proposed for the NCA problem, which not only promotes sparsity but is also hundreds of times faster than the norm-based solution. The performance of sparseNCA is rigorously compared with that of FastNCA and NINCA on both synthetic and real data. SparseNCA outperforms these state-of-the-art algorithms in both estimation accuracy and consistency, with the added advantage of low computational complexity. Its advantage over its predecessors is particularly pronounced when prior information about the sparsity of the network is incomplete. Subnetwork analysis performed on E. coli data confirms the superior consistency of the proposed algorithm.
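To make the idea of a sparsity-promoting re-weighted estimate concrete, the following is a minimal sketch and not the sparseNCA algorithm itself. It assumes the usual NCA-style linear model in which a gene's expression profile `x` is a sparse combination `a @ S` of TF activities `S`, and it uses a generic iteratively re-weighted ridge update. The function name `reweighted_sparse_row`, the penalty form, and the parameters `lam`, `eps`, and `n_iter` are all illustrative assumptions, as is the synthetic data.

```python
import numpy as np

def reweighted_sparse_row(S, x, lam=0.1, eps=1e-6, n_iter=50):
    """Estimate one sparse row `a` in x ~ a @ S by re-weighted ridge steps.

    S : (n_tf, n_samples) array of TF activities (assumed known here).
    x : (n_samples,) expression profile of a single gene.
    Generic sketch only; not the update rule of any specific paper.
    """
    n_tf = S.shape[0]
    G = S @ S.T                      # Gram matrix of the TF activities
    b = S @ x                        # correlation of each TF with the gene
    # Plain ridge solution as a starting point.
    a = np.linalg.solve(G + lam * np.eye(n_tf), b)
    for _ in range(n_iter):
        # Small coefficients get large weights and are shrunk harder,
        # which drives them toward zero over the iterations.
        w = 1.0 / (np.abs(a) + eps)
        a = np.linalg.solve(G + lam * np.diag(w), b)
    return a

# Synthetic example: 4 hypothetical TFs, 200 samples, 2 true regulators.
rng = np.random.default_rng(0)
S = rng.standard_normal((4, 200))
a_true = np.array([1.5, 0.0, -2.0, 0.0])
x = a_true @ S + 0.01 * rng.standard_normal(200)
a_hat = reweighted_sparse_row(S, x)
```

Each iteration only solves a small linear system, which is the reason re-weighted schemes of this kind tend to be cheap compared with solving a non-smooth norm-penalized problem directly. In this sketch the entries of `a_hat` for the two non-regulating TFs are shrunk close to zero, while the two true regulators remain near their generating values.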