Zhao, Wentao (2008-08). Genomic applications of statistical signal processing. Doctoral Dissertation.

abstract

  • Biological phenomena in cells can be explained in terms of the interactions among biological macromolecules, e.g., DNA, RNA, and proteins. These interactions can be modeled by genetic regulatory networks (GRNs). This dissertation proposes to reverse-engineer GRNs from heterogeneous biological data sets, including time-series and time-independent gene expression data, chromatin immunoprecipitation (ChIP) data, gene sequences and motifs, and other possible sources of knowledge. The objective of this research is to propose novel computational methods that keep pace with fast-evolving biological databases.
    Signal processing techniques are exploited to develop computationally efficient, accurate, and robust algorithms that handle the various data sets individually or collectively. Methods of power spectral density estimation are discussed for identifying genes that participate in various biological processes. Information-theoretic methods are applied for non-parametric inference. Bayesian methods are adopted to combine several data sources with prior knowledge. This work aims to construct an inference system that takes different sources of information into account in such a way that the absence of some components does not interfere with the rest of the system.
    It has been verified that the proposed algorithms achieve better inference accuracy and higher computational efficiency than other state-of-the-art schemes, e.g., REVEAL, ARACNE, Bayesian networks, and relevance networks, on artificial time-series and steady-state microarray measurements. The proposed algorithms are especially appealing when the sample size is small. In addition, they can integrate multiple heterogeneous data sources, e.g., ChIP and sequence data, so that a unified GRN can be inferred. Analysis of the biological literature and in silico experiments on real data sets for fruit fly, yeast, and human have corroborated part of the inferred GRN. The research has also produced a set of potential control targets for designing gene therapy strategies.
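    As an illustration of the power spectral density idea mentioned in the abstract, the following is a minimal sketch, not the dissertation's own algorithm. It assumes evenly sampled expression time series and uses a plain FFT periodogram; the function name `peak_power_ratio` and the synthetic profiles are hypothetical. The score is the share of non-DC spectral power held by the single strongest frequency, in the spirit of Fisher's g statistic, so a strongly periodic profile scores near 1.

```python
import numpy as np

def peak_power_ratio(x):
    """Return (ratio, k): the share of non-DC periodogram power held by the
    strongest frequency bin, and that bin's index (cycles per record).

    Assumes an evenly sampled time series (an assumption of this sketch).
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                       # remove the constant (DC) level
    spectrum = np.abs(np.fft.rfft(x)) ** 2 # unnormalized periodogram
    power = spectrum[1:]                   # drop the DC bin
    total = power.sum()
    if total == 0:                         # flat profile: no periodicity
        return 0.0, 0
    k = int(np.argmax(power)) + 1          # +1 restores the dropped DC offset
    return float(power[k - 1] / total), k

# Hypothetical example: 48 time points, one gene cycling 4 times,
# one gene with pure Gaussian noise.
t = np.arange(48)
periodic_gene = np.sin(2 * np.pi * 4 * t / 48)
noisy_gene = np.random.default_rng(0).normal(size=48)

print(peak_power_ratio(periodic_gene))
print(peak_power_ratio(noisy_gene))
```

    Ranking genes by this ratio gives a rough screen for periodic expression; real microarray series are short and sometimes unevenly sampled, which this sketch does not address.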

publication date

  • August 2008