Finding robust linear expression-based classifiers
Overview
abstract
A key goal of gene-expression microarray studies is classification based on differing expression patterns. The small samples typically obtained, together with the large number of variables, make finding good classifiers extremely difficult from the perspectives of both design and error estimation. This paper addresses estimation variability, which can result in large numbers of gene sets with highly optimistic error estimates. It proposes performing classification on probability distributions derived from the original sample points by spreading the mass of those points, which makes classification more difficult while retaining the basic geometry of the point locations. The spreading is parameterized by the degree to which the mass is spread. The method is applied to linear classifiers.
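
To illustrate the idea of spreading sample mass, the following is a minimal sketch and not the paper's implementation. It assumes each sample point is replaced by an isotropic Gaussian centered on that point, with standard deviation `sigma` as the spread parameter; the abstract does not say which distribution is used, so the Gaussian is an assumption chosen because the error of a linear classifier then has a closed form. The function names `normal_cdf` and `spread_error` and the toy data are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def spread_error(points, labels, w, b, sigma):
    """Error of the linear classifier sign(w.x + b) on the spread sample.

    Assumption: each point carries mass 1/n spread as an isotropic
    Gaussian with standard deviation sigma. Labels are +1 or -1.
    With sigma == 0 this reduces to the ordinary resubstitution error.
    """
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    total = 0.0
    for x, y in zip(points, labels):
        # Signed distance from the point to the hyperplane,
        # positive when the point lies on its correct side.
        margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm_w
        if sigma == 0:
            total += 1.0 if margin <= 0 else 0.0
        else:
            # Gaussian mass that falls on the wrong side of the hyperplane.
            total += normal_cdf(-margin / sigma)
    return total / len(points)


# Toy two-gene sample (made up for illustration): two points per class.
points = [(1.0, 0.0), (2.0, 0.0), (-1.0, 0.0), (-2.0, 0.0)]
labels = [1, 1, -1, -1]
w, b = (1.0, 0.0), 0.0

for sigma in (0.0, 0.5, 1.0, 2.0):
    print(sigma, spread_error(points, labels, w, b, sigma))
```

Under this assumption, a classifier that separates the original points perfectly still incurs error once the mass is spread, and the error grows with `sigma`. Classifiers whose decision boundary passes close to sample points are penalized most, which is one way to read "more difficult while retaining the basic geometry."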