Missing-value estimation using linear and non-linear regression with Bayesian gene selection.

abstract

MOTIVATION: Data from microarray experiments are usually in the form of large matrices of expression levels of genes under different experimental conditions. Owing to various reasons, there are frequently missing values. Estimating these missing values is important because they affect downstream analysis, such as clustering, classification and network design. Several methods of missing-value estimation are in use. The problem has two parts: (1) selection of genes for estimation and (2) design of an estimation rule. RESULTS: We propose Bayesian variable selection to obtain genes to be used for estimation, and employ both linear and nonlinear regression for the estimation rule itself. Fast implementation issues for these methods are discussed, including the use of QR decomposition for parameter estimation. The proposed methods are tested on data sets arising from hereditary breast cancer and small round blue-cell tumors. The results compare very favorably with currently used methods based on the normalized root-mean-square error. AVAILABILITY: The appendix is available from http://gspsnap.tamu.edu/gspweb/zxb/missing_zxb/ (user: gspweb; passwd: gsplab).
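As a rough illustration of the two-part problem the abstract describes (choose predictor genes, then apply an estimation rule), the following minimal sketch fits a linear regression by QR decomposition and scores estimates with a normalized root-mean-square error. It is not the authors' implementation. Predictor genes are chosen here by absolute correlation, a simple stand-in for the Bayesian variable selection named in the abstract. The function names, the parameter `k`, the assumption that only the target entry is missing, and the particular NRMSE normalization are all assumptions made for illustration.

```python
import numpy as np

def estimate_missing(X, target, cond, k=3):
    """Estimate X[target, cond] by least squares on k predictor genes.

    X is a genes x conditions matrix; this sketch assumes the only
    missing entry is X[target, cond] (hypothetical simplification).
    """
    obs = [j for j in range(X.shape[1]) if j != cond]  # conditions where the target is observed
    y = X[target, obs]
    others = [g for g in range(X.shape[0]) if g != target]
    # Stand-in selection rule: rank genes by |correlation| with the target gene.
    corr = np.nan_to_num([abs(np.corrcoef(X[g, obs], y)[0, 1]) for g in others])
    chosen = [others[i] for i in np.argsort(corr)[::-1][:k]]
    # Design matrix: intercept column plus the chosen genes' expression levels.
    A = np.column_stack([np.ones(len(obs)), X[chosen][:, obs].T])
    # QR decomposition A = QR; the least-squares coefficients solve R b = Q^T y.
    Q, R = np.linalg.qr(A)
    beta = np.linalg.solve(R, Q.T @ y)
    x_new = np.concatenate([[1.0], X[chosen, cond]])
    return float(x_new @ beta)

def nrmse(estimates, truths):
    """RMS error divided by the RMS of the true values (one common normalization)."""
    e = np.asarray(estimates, dtype=float)
    t = np.asarray(truths, dtype=float)
    return float(np.sqrt(np.mean((e - t) ** 2)) / np.sqrt(np.mean(t ** 2)))
```

The sketch needs more observed conditions than regression coefficients (`k + 1`) and linearly independent predictors for the triangular solve to succeed. A nonlinear rule would replace the design matrix and solve step while keeping the same selection and scoring scaffolding.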

authors

Dougherty, Edward

published proceedings

Bioinformatics

author list (cited authors)

Zhou, X., Wang, X., & Dougherty, E. R.

citation count

56

complete list of authors

Zhou, Xiaobo||Wang, Xiaodong||Dougherty, Edward R

publication date

November 2003

publisher

Oxford University Press (OUP)

published in

Bioinformatics

keywords

Algorithms
Artifacts
Bayes Theorem
Brca2 Protein
Breast Neoplasms
Carrier Proteins
Gene Deletion
Gene Expression Profiling
Genetic Variation
Humans
Linear Models
Models, Genetic
Models, Statistical
Nonlinear Dynamics
Oligonucleotide Array Sequence Analysis
Regression Analysis
Ubiquitin-Protein Ligases

PubMed ID

14630659

Digital Object Identifier (DOI)

10.1093/bioinformatics/btg323

start page

2302

end page

2307

volume

19

issue

17

URL

http://dx.doi.org/10.1093/bioinformatics/btg323

