Transient crowd discovery on the real-time social web Conference Paper uri icon

abstract

  • In this paper, we study the problem of automatically discovering and tracking transient crowds in highly-dynamic social messaging systems like Twitter and Facebook. Unlike the more static and long-lived group-based membership offered on many social networks (e.g., fan of the LA Lakers), a transient crowd is a short-lived ad-hoc collection of users, representing a "hotspot" on the real-time web. Successful detection of these hotspots can positively impact related research directions in online event detection, content personalization, social information discovery, etc. Concretely, we propose to model crowd formation and dispersion through a message-based communication clustering approach over time-evolving graphs that captures the natural conversational nature of social messaging systems. Two of the salient features of the proposed approach are (i) an efficient localitybased clustering approach for identifying crowds of users in near real-time compared to more heavyweight static clustering algorithms; and (ii) a novel crowd tracking and evolution approach for linking crowds across time periods. We find that the locality-based clustering approach results in empirically high-quality clusters relative to static graph clustering techniques at a fraction of the computational cost. Based on a three month snapshot of Twitter consisting of 711,612 users and 61.3 million messages, we show how the proposed approach can successfully identify and track interesting crowds based on the Twitter communication structure and uncover crowd-based topics of interest. Copyright 2011 ACM.

author list (cited authors)

  • Kamath, K. Y., & Caverlee, J.

citation count

  • 11

editor list (cited editors)

  • King, I., Nejdl, W., & Li, H.

publication date

  • January 2011