Knight, Jason Matthew (2015-05). Optimal Model-Based Approaches for Predictive Inference in Biology. Doctoral Dissertation.

abstract

  • Predictive modeling of the dynamic, multivariate, non-linear, stochastic systems of biology is a difficult enterprise. High-throughput measurement techniques are enabling new approaches to computational biology, but the small number of samples typically available relative to the number of features measured makes additional sources of information critical for accurate predictions. In this dissertation, we offer an approach to incorporate biological pathway knowledge into a predictive stochastic model for genetic regulatory networks. In addition, we propose a statistical model for shotgun sequencing and use computational approximation strategies to derive optimal estimators for classification. We compare classifiers trained using this framework with existing classification rules, including non-linear support vector machines. On both synthetic and real sequencing data, our classifiers delivered lower classification error rates than existing classification techniques. We also show how prior knowledge can be built into the classifier through properly constructed prior distributions, and we present several scenarios in which doing so improves classification performance. This research establishes a flexible framework for generating optimal estimators with respect to statistical biological models. By demonstrating the role and power of computation in unlocking these estimators, we point future research in computational biology towards this computationally intensive approach.

publication date

  • May 2015