Fads and fallacies in the name of small-sample microarray classification - A highlight of misunderstanding and erroneous usage in the applications of genomic signal processing Academic Article

abstract

  • The growth of genomic signal processing (GSP) driven by DNA microarray technology has generated a large number of publications and an equally large number of unsound and unsubstantiated scientific hypotheses. The areas of focus include classifier design, feature selection, and error estimation. Some of the most common fads and fallacies regarding classification methods are the following: that complex classification rules are better than simple ones; that adding more variables leads to better classifier performance; that consistent classification rules are better than inconsistent ones; that all features of interest can be found by univariate statistical tests; that the optimal feature set can be found by a nonexhaustive search; that using an independent test set is always the best approach; and that resubstitution is useless except in large-sample and asymptotic studies.

author list (cited authors)

  • Braga-Neto, U.

citation count

  • 44

publication date

  • January 2007