The Effect of Stemming on Arabic Text Classification: An Empirical Study

abstract

The information world is rich in documents in different formats and applications, such as databases, digital libraries, and the Web. Text classification aids the search functionality offered by search engines and information retrieval systems in dealing with the large number of documents on the Web. Much research in text classification has been applied to English, Dutch, Chinese, and other languages, whereas less has been applied to Arabic. This paper addresses the automatic classification of Arabic text documents. It applies text classification to Arabic text documents using stemming as part of the preprocessing steps. Results show that, without stemming, the support vector machine (SVM) classifier achieved the highest classification accuracy under the two test modes, with 87.79% and 88.54%. Stemming, on the other hand, negatively affected accuracy: the SVM accuracy under the two test modes dropped to 84.49% and 86.35%.

authors

Alsmadi, Izzat

published proceedings

International Journal of Information Retrieval Research

altmetric score

0.25

author list (cited authors)

Wahbeh, A., Al-Kabi, M., Al-Radaideh, Q., Al-Shawakfa, E., & Alsmadi, I.

citation count

3

complete list of authors

Wahbeh, Abdullah||Al-Kabi, Mohammed||Al-Radaideh, Qasem||Al-Shawakfa, Emad||Alsmadi, Izzat

publication date

July 2011

publisher

IGI Global Publisher

published in

International Journal of Information Retrieval Research

keywords

Arabic Text Classification
Decision Tree
Naive Bayes Classifier (NB)
Natural Language Processing
Stemming
Support Vector Machine (SVM)
Text Classification

Digital Object Identifier (DOI)

10.4018/ijirr.2011070104

start page

54

end page

70

volume

1

issue

3

URL

http://dx.doi.org/10.4018/ijirr.2011070104
