Comparative study of sequential pattern mining models (Chapter)

abstract

The process of finding interesting, novel, and useful patterns in data is now commonly known as Knowledge Discovery and Data Mining (KDD). In this paper, we examine closely the problem of mining sequential patterns and propose a general evaluation method to assess the quality of the mined results. We propose four evaluation criteria, namely (1) recoverability, (2) the number of spurious patterns, (3) the number of redundant patterns, and (4) the degree of extraneous items in the patterns, to quantitatively assess the quality of the mined results from a wide variety of synthetic datasets with varying randomness and noise levels. Recoverability, a new metric, measures how much of the underlying trend has been detected. Such an evaluation method provides a basis for comparing different models for sequential pattern mining and is essential for understanding the performance of approximate solutions. We employ the method to conduct a detailed comparison of the traditional frequent sequential pattern model with an alternative approximate pattern model based on sequence alignment. We demonstrate that the alternative approach better recovers the underlying patterns, with little confounding information, under all circumstances we examined, including those where the frequent sequential pattern model fails.

authors

Kum, Hye Chung

author list (cited authors)

Kum, H. C., Paulsen, S., & Wang, W.

complete list of authors

Kum, HC; Paulsen, S; Wang, W

editor list (cited editors)

Lin, T. Y., Ohsuga, S., Liau, C. J., Hu, X., & Tsumoto, S.

Book Title

Foundations of Data Mining and Knowledge Discovery

publication date

September 2005

publisher

Springer Science & Business Media

published in

Studies in Computational Intelligence (Journal)