A Bayesian approach to nonlinear probit gene selection and classification
We consider the problem of gene selection and classification based on expression data. Specifically, we propose a bootstrap Bayesian gene selection method for nonlinear probit regression. A binomial probit regression model with data augmentation is used to transform the binomial problem into a sequence of smoothing problems. The probit regressor is approximated as a nonlinear combination of the genes. A Gibbs sampler is employed to find the strongest genes. Some numerical techniques to speed up the computation are discussed. We then develop a nonlinear probit Bayesian classifier consisting of a linear term plus a nonlinear term, the parameters of which are estimated using the sequential Monte Carlo technique. These new methods are applied to several data sets, including the hereditary breast cancer data, the small round blue-cell tumor data, and the acute leukemia tumor data. The experimental results show that the proposed methods can effectively find important genes that are consistent with existing biological beliefs, and that the classification accuracies are very high. Some robustness and sensitivity properties of the proposed methods are also discussed in connection with noisy microarray data. © 2004 The Franklin Institute. Published by Elsevier Ltd. All rights reserved.
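The data-augmentation idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it assumes a single covariate (one gene), a linear probit link, and a zero-mean Gaussian prior on the coefficient, and it omits the nonlinear term, the gene selection step, and the bootstrap. The names `gibbs_probit`, `sample_truncated_noise`, `simulate`, and `prior_var` are hypothetical. Each sweep draws a latent Gaussian variable for every sample, truncated to agree with the observed class label, and then draws the coefficient from its Gaussian conditional given those latent values.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for cdf and inverse cdf


def sample_truncated_noise(mu, positive, rng):
    """Draw e ~ N(0, 1) conditioned on mu + e > 0 (positive) or mu + e <= 0."""
    cut = STD_NORMAL.cdf(-mu)  # P(e <= -mu)
    u = rng.uniform(cut, 1.0) if positive else rng.uniform(0.0, cut)
    # Numerical guard: inv_cdf is undefined at exactly 0 or 1.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return STD_NORMAL.inv_cdf(u)


def gibbs_probit(x, y, n_iter=800, burn_in=200, prior_var=10.0, seed=0):
    """Gibbs sampler for y_i = 1[beta * x_i + e_i > 0], beta ~ N(0, prior_var)."""
    rng = random.Random(seed)
    beta = 0.0
    # The conditional variance of beta does not depend on the latent values.
    post_var = 1.0 / (sum(xi * xi for xi in x) + 1.0 / prior_var)
    draws = []
    for it in range(n_iter):
        # Step 1: latent z_i given beta and the label y_i (truncated normal).
        z = [beta * xi + sample_truncated_noise(beta * xi, yi == 1, rng)
             for xi, yi in zip(x, y)]
        # Step 2: beta given z (conjugate Gaussian update).
        post_mean = post_var * sum(xi * zi for xi, zi in zip(x, z))
        beta = rng.gauss(post_mean, post_var ** 0.5)
        if it >= burn_in:
            draws.append(beta)
    return draws


def simulate(n, beta, seed=1):
    """Synthetic data from the same single-covariate probit model."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [1 if beta * xi + rng.gauss(0.0, 1.0) > 0 else 0 for xi in x]
    return x, y


if __name__ == "__main__":
    x, y = simulate(300, beta=1.5)
    draws = gibbs_probit(x, y)
    print("posterior mean of beta:", sum(draws) / len(draws))
```

On synthetic data generated from the same model, the average of the retained draws should land near the coefficient used in the simulation. Extending the sketch to many genes would replace the scalar update with a multivariate Gaussian draw.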
author list (cited authors)
Zhou, X., Wang, X., & Dougherty, E. R.