Model-based study of the Effectiveness of Reporting Lists of Small Feature Sets using RNA-Seq Data (Conference Paper)

abstract

  • Ranking feature sets for phenotype classification based on gene expression is one of the most challenging issues in bioinformatics. When the number of samples is small, all feature-selection algorithms are known to be unreliable, and error estimators suffer from differing degrees of imprecision. The problem is compounded by the fact that classification accuracy depends on how the measurement technology transforms the underlying phenomena into data. Because Next Generation Sequencing (NGS) technologies amount to a nonlinear transformation of the actual gene or RNA concentrations, they could produce less discriminative data than the actual gene expression levels. Here, we focus on how the nonlinear transformation of gene concentrations by the sequencing machine, together with the choice of error estimator, affects feature set ranking.

name of conference

  • Proceedings of the 7th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics

published proceedings

  • Proceedings of the 7th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics

author list (cited authors)

  • Kim, E., Ivanov, I., Hua, J., Chapkin, R. S., & Dougherty, E. R.

citation count

  • 0

complete list of authors

  • Kim, Eunji; Ivanov, Ivan; Hua, Jianping; Chapkin, Robert S.; Dougherty, Edward R.

publication date

  • October 2016