Shin, Minsuk (2017-08). Priors for Bayesian Shrinkage and High-Dimensional Model Selection. Doctoral Dissertation.

abstract

  • This dissertation focuses on the choice of priors in Bayesian model selection and on their
    applied, theoretical, and computational aspects. As George Box famously said, "all models are
    wrong, but some are useful," and many statisticians and scientists are aware of the importance
    of model selection. From a Bayesian perspective, however, it is challenging to choose the prior
    on the parameters involved in model selection, or to evaluate selection criteria based on that
    prior, especially when the number of models to be compared is massive or when a nonparametric
    model is considered.

    For high-dimensional Bayesian model selection in linear models, my dissertation studies
    theoretical aspects of the choice of the prior on the regression coefficients. In particular,
    I consider nonlocal prior densities, which assign zero density at the null value, typically 0
    in model selection settings. Under certain regularity conditions, I demonstrate that the model
    selection procedure based on nonlocal priors is consistent for linear models even when the
    number of covariates p increases sub-exponentially with the sample size n. I investigate the
    asymptotic form of the marginal likelihood based on nonlocal priors and show that it attains a
    unique penalty term that adapts to the signal strength of the corresponding variable in the
    model, and I remark that this term cannot be attained from local priors such as Gaussian prior
    densities.
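
    As a concrete illustration, a standard example of such a density is the first-order moment
    (MOM) prior of Johnson and Rossell; it is shown here as a representative nonlocal prior on a
    single coefficient, not necessarily the exact form adopted in the dissertation:

    \[
      \pi_{\mathrm{MOM}}(\beta_j \mid \tau, \sigma^2)
        \;=\;
        \frac{\beta_j^{2}}{\tau \sigma^{2}}\,
        \mathcal{N}\!\left(\beta_j \,\middle|\, 0,\ \tau \sigma^{2}\right).
    \]

    The density vanishes at \(\beta_j = 0\), whereas a local prior such as a Gaussian places
    positive density at the null; this vanishing behavior at the null is what produces the
    adaptive penalty term in the marginal likelihood that a Gaussian prior cannot provide.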

    Another topic of my dissertation concerns computational aspects of Bayesian model selection
    in high-dimensional settings. Full posterior sampling with existing Markov chain Monte Carlo
    (MCMC) algorithms to explore a high-dimensional model space is highly inefficient and often
    infeasible in practice. To overcome this issue, I propose a scalable stochastic search
    algorithm called Simplified Shotgun Stochastic Search with Screening (S5), which efficiently
    explores the model space. The S5 algorithm dramatically reduces the computational burden of
    searching the neighborhood of a model by incorporating a screening step. Its empirical
    performance is examined in several examples, and it outperforms existing algorithms in the
    sense that it is computationally fast while still searching the model space efficiently. S5 is
    used to implement the model selection procedures introduced in this dissertation, including
    linear and nonparametric model selection. The computing functions are provided in the R
    package BayesS5 on CRAN (https://cran.r-project.org).
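
    The sketch below is a minimal, illustrative Python implementation of the
    shotgun-stochastic-search-with-screening idea described above. It is neither the BayesS5
    package nor the exact S5 algorithm of the dissertation: the model score is a BIC surrogate
    rather than a nonlocal-prior marginal likelihood, and the screening rule (ranking candidate
    additions by correlation with the current residual) is an assumption made for illustration.

      import numpy as np

      def score(X, y, model):
          # Stand-in model score: -BIC/2 of the least-squares fit. (The dissertation's
          # S5 scores models with nonlocal-prior marginal likelihoods instead.)
          n = len(y)
          if model:
              Xm = X[:, sorted(model)]
              beta = np.linalg.lstsq(Xm, y, rcond=None)[0]
              rss = np.sum((y - Xm @ beta) ** 2)
          else:
              rss = np.sum((y - y.mean()) ** 2)
          return -0.5 * (n * np.log(rss / n) + len(model) * np.log(n))

      def sss_screen(X, y, n_iter=200, screen_size=10, seed=0):
          # Simplified shotgun stochastic search with screening: each iteration
          # restricts candidate additions to the covariates most correlated with
          # the current residual, shrinking the neighborhood that must be scored.
          rng = np.random.default_rng(seed)
          model, best, best_score = set(), set(), score(X, y, set())
          for _ in range(n_iter):
              if model:
                  Xm = X[:, sorted(model)]
                  beta = np.linalg.lstsq(Xm, y, rcond=None)[0]
                  resid = y - Xm @ beta
              else:
                  resid = y - y.mean()
              corr = np.abs(X.T @ resid)                       # screening statistic
              adds = [j for j in np.argsort(-corr)[:screen_size] if j not in model]
              neighbors = [model | {j} for j in adds] + [model - {j} for j in model]
              if not neighbors:
                  continue
              scores = np.array([score(X, y, m) for m in neighbors])
              probs = np.exp(scores - scores.max())            # stochastic "shotgun" move
              probs /= probs.sum()
              model = neighbors[rng.choice(len(neighbors), p=probs)]
              s = score(X, y, model)
              if s > best_score:
                  best, best_score = set(model), s
          return sorted(best), best_score

    Because of the screening step, each iteration scores at most screen_size candidate additions
    plus the current deletions, rather than all p single-variable additions.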

    For nonparametric regression models, I introduce a new shrinkage prior on function spaces,
    the functional horseshoe prior, that encourages shrinkage towards parametric classes of
    functions. When the true underlying function lies in the parametric class, improved estimation
    performance is obtained relative to classical nonparametric procedures. The proposed prior
    also admits a natural penalization interpretation and sheds light on a new class of penalized
    likelihood methods for function estimation. I theoretically demonstrate the efficacy of the
    proposed approach by establishing an adaptive posterior concentration property.
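
    Schematically, and only as a heuristic rather than the dissertation's exact construction, the
    subspace-shrinkage idea can be pictured through a posterior mean of the form

    \[
      \hat f \;=\; \omega\, Q_0 y \;+\; (1 - \omega)\, Q y,
    \]

    where \(Q\) projects onto a flexible nonparametric basis (e.g., splines), \(Q_0\) projects
    onto the parametric class (e.g., linear functions), and the weight \(\omega \in (0, 1)\) is
    pushed toward 1 by a horseshoe-type prior when the parametric class already fits well. The
    notation here is assumed purely for illustration.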

    The last topic of the dissertation is a novel extension of the nonlocal idea to function
    spaces, called the nonlocal functional prior, which is suitable for nonparametric Bayesian
    hypothesis testing (model selection) problems. I characterize the asymptotic rate of the Bayes
    factor defined by the proposed prior for nonparametric hypothesis testing problems. I apply
    the proposed prior densities to high-dimensional model selection for nonparametric additive
    models, and investigate the model selection consistency of the resulting procedure. I provide
    simulation studies and real-data examples showing that the proposed model selection procedure
    outperforms state-of-the-art methods in finite samples.
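
    For reference, the Bayes factor in such a test is the ratio of marginal likelihoods under the
    two hypotheses,

    \[
      \mathrm{BF}_{10}(y)
        \;=\;
        \frac{\int f(y \mid g)\,\pi_1(\mathrm{d}g)}{\int f(y \mid \theta)\,\pi_0(\mathrm{d}\theta)},
    \]

    where \(\pi_1\) denotes the nonlocal functional prior over the nonparametric alternative and
    \(\pi_0\) the prior under the (typically parametric) null; the asymptotic results quantify how
    fast \(\mathrm{BF}_{10}\) grows or decays as \(n \to \infty\). The notation is generic and
    assumed for illustration.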

publication date

  • August 2017