A classification procedure for highly imbalanced class sizes
This article develops an effective procedure for two-class classification problems with highly imbalanced class sizes. In many such problems, the majority class represents normal cases and the minority class represents abnormal cases, whose detection is critical to decision making. When the class sizes are highly imbalanced, conventional classification methods tend to strongly favor the majority class, resulting in very low or even no detection of the minority class. The research objective is to devise a systematic procedure that substantially improves the power of detecting the minority class, so that the procedure can be used to screen the original data set and select a much smaller subset for further investigation. The proposed procedure is based on ensemble classifiers, where each classifier is constructed from a resized training set with a reduced dimension space. The article also specifies how to find the best values of the decision variables in the proposed procedure. The proposed method is compared with a set of off-the-shelf classification methods on two real data sets. Its prediction results show remarkable improvements over the other methods: it detects about 75% of the minority class units, while the other methods yield much lower detection rates. © 2010 "IIE".
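The abstract does not give the construction details, so the following is only a minimal sketch of the general idea of an ensemble in which each member is trained on a resized training set with a reduced dimension space. Everything specific here is an assumption made for illustration and is not taken from the article: the function names (`fit_ensemble`, `predict`, `detection_rate`), the parameters (`n_major`, `n_feats`, `vote_threshold`), the use of random undersampling of the majority class as the resizing step, random feature subsets as the dimension reduction, a nearest-centroid base learner, and the synthetic data.

```python
import random
from statistics import fmean


def fit_member(X, y, n_major, n_feats, rng):
    """Train one hypothetical ensemble member (label 1 = minority class)."""
    minority = [i for i, label in enumerate(y) if label == 1]
    majority = [i for i, label in enumerate(y) if label == 0]
    # Resize: keep every minority unit, undersample the majority class.
    rows = minority + rng.sample(majority, min(n_major, len(majority)))
    # Reduce dimension: random feature subset (an assumption, not the article's rule).
    feats = sorted(rng.sample(range(len(X[0])), n_feats))
    centroids = {}
    for label in (0, 1):
        members = [X[i] for i in rows if y[i] == label]
        centroids[label] = [fmean(row[f] for row in members) for f in feats]
    return feats, centroids


def member_predict(member, x):
    """Nearest-centroid vote of a single member, using only its own features."""
    feats, centroids = member

    def dist2(centroid):
        return sum((x[f] - c) ** 2 for f, c in zip(feats, centroid))

    return 1 if dist2(centroids[1]) < dist2(centroids[0]) else 0


def fit_ensemble(X, y, n_members=25, n_major=20, n_feats=2, seed=0):
    """Build the ensemble; each member sees a different resized, reduced set."""
    rng = random.Random(seed)
    return [fit_member(X, y, n_major, n_feats, rng) for _ in range(n_members)]


def predict(ensemble, x, vote_threshold=0.5):
    """Flag x as minority when the share of minority votes reaches the threshold."""
    votes = sum(member_predict(m, x) for m in ensemble)
    return 1 if votes / len(ensemble) >= vote_threshold else 0


def detection_rate(ensemble, X, y):
    """Fraction of true minority units that the ensemble flags."""
    minority = [x for x, label in zip(X, y) if label == 1]
    return sum(predict(ensemble, x) for x in minority) / len(minority)


def make_data(n_major=500, n_minor=10, n_dims=4, seed=1):
    """Synthetic imbalanced data, for illustration only."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_dims)] for _ in range(n_major)]
    X += [[rng.gauss(3.0, 1.0) for _ in range(n_dims)] for _ in range(n_minor)]
    return X, [0] * n_major + [1] * n_minor


if __name__ == "__main__":
    X, y = make_data()
    ensemble = fit_ensemble(X, y)
    print("minority detection rate:", detection_rate(ensemble, X, y))
```

In this sketch, `n_major`, `n_feats`, and `vote_threshold` play the role of tunable decision variables; lowering `vote_threshold` trades more false alarms for a higher minority detection rate, which fits the screening use described above. How the article actually chooses its decision variables is not stated in the abstract.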