Hierarchical comments-based clustering Conference Paper

abstract

  • Information resources on the Web, such as videos, images, and documents, are becoming increasingly "social" through user engagement via commenting systems. These commenting systems provide a forum for users to discuss the resources, and as a side effect they provide valuable editorial and contextual information about those resources. In this paper, we explore a comments-driven clustering framework for organizing Web resources according to this user-based perspective. Concretely, we propose a hierarchical comment clustering approach that relies on two key features: (i) comment term normalization and key term extraction for distilling noisy comments for effective clustering; and (ii) a real-time insertion component for incrementally updating the comments-based hierarchy, so that resources can be efficiently placed in the hierarchy as comments arrive, without the need to regenerate the (potentially) expensive hierarchy. We study the clustering approach over the popular video sharing site YouTube. YouTube is a challenging environment, notorious for its extremely short, ill-formed, and often unintelligible user-contributed comments. Through extensive experimental study, we find that the proposed approach can lead to effective and efficient comments-based video organization even in a YouTube-like environment. © 2011 ACM.
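
    The two components named in the abstract (comment normalization with key term extraction, and incremental insertion into an existing hierarchy) can be illustrated with a minimal Python sketch. This is not the authors' implementation, which the abstract does not specify. The stopword list, the repeated-letter collapsing rule, frequency-based key terms, cosine similarity, and the names `Node`, `normalize`, `key_terms`, `cosine`, and `insert` are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "a", "an", "is", "this", "so", "i", "it", "and", "of", "to"}


def normalize(comment):
    """Lowercase, collapse runs of 3+ identical characters, tokenize, drop stopwords."""
    text = comment.lower()
    text = re.sub(r"(.)\1{2,}", r"\1", text)  # heuristic: "looool" becomes "lol"
    return [t for t in re.findall(r"[a-z]+", text) if t not in STOPWORDS]


def key_terms(comments, k=5):
    """Keep the k most frequent normalized terms across a resource's comments."""
    counts = Counter(t for c in comments for t in normalize(c))
    return dict(counts.most_common(k))


def cosine(a, b):
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(w * b.get(t, 0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class Node:
    """One cluster in the hierarchy, summarized by a term-count centroid."""

    def __init__(self, label, centroid=None):
        self.label = label
        self.centroid = Counter(centroid or {})
        self.children = []
        self.resources = []


def insert(root, resource_id, terms):
    """Place a resource by descending to the most similar child at each level.

    Only the centroids along the chosen path are updated, so the rest of
    the hierarchy is left untouched rather than rebuilt.
    """
    node = root
    while True:
        node.centroid.update(terms)
        if not node.children:
            break
        node = max(node.children, key=lambda c: cosine(terms, c.centroid))
    node.resources.append(resource_id)
    return node
```

    Under these assumptions, a new video is placed by calling `insert(root, video_id, key_terms(comments))`, and the cost of placement grows with the depth of the hierarchy and the number of children per level rather than with the total number of resources.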

author list (cited authors)

  • Hsu, C., Caverlee, J., & Khabiri, E.

citation count

  • 5

editor list (cited editors)

  • Chu, W. C., Wong, W. E., Palakal, M. J., & Hung, C.

publication date

  • January 2011