You are where you tweet Conference Paper

abstract

  • We propose and evaluate a probabilistic framework for estimating a Twitter user's city-level location based purely on the content of the user's tweets, even in the absence of any other geospatial cues. By augmenting the massive human-powered sensing capabilities of Twitter and related microblogging services with content-derived location information, this framework can overcome the sparsity of geo-enabled features in these services and enable new location-based personalized information services, the targeting of regional advertisements, and so on. Three of the key features of the proposed approach are: (i) its reliance purely on tweet content, meaning no need for user IP information, private login information, or external knowledge bases; (ii) a classification component for automatically identifying words in tweets with a strong local geo-scope; and (iii) a lattice-based neighborhood smoothing model for refining a user's location estimate. The system estimates k possible locations for each user in descending order of confidence. On average we find that the location estimates converge quickly (needing just 100s of tweets), placing 51% of Twitter users within 100 miles of their actual location. © 2010 ACM.
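
To make the idea of content-only location estimation concrete, here is a minimal sketch, not the authors' implementation. It swaps in a plain multinomial naive Bayes scorer with add-one smoothing and a uniform prior over cities, and returns the k best-scoring cities in descending order. It omits the local-word classifier and the lattice-based neighborhood smoothing described in the abstract. The function names (`train`, `top_k_cities`), the `alpha` parameter, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def train(tweets_by_city):
    """Count word occurrences per city from labeled tweets (illustrative only)."""
    counts = {
        city: Counter(w for tweet in tweets for w in tweet.lower().split())
        for city, tweets in tweets_by_city.items()
    }
    # Shared vocabulary across all cities.
    vocab = set().union(*counts.values())
    return counts, vocab


def top_k_cities(user_tweets, counts, vocab, k=3, alpha=1.0):
    """Rank cities by smoothed log-likelihood of the user's words.

    Assumes a uniform prior over cities; alpha is an add-alpha smoothing
    constant chosen for this sketch, not a value from the paper.
    """
    words = [w for tweet in user_tweets for w in tweet.lower().split() if w in vocab]
    scores = {}
    for city, city_counts in counts.items():
        total = sum(city_counts.values()) + alpha * len(vocab)
        scores[city] = sum(math.log((city_counts[w] + alpha) / total) for w in words)
    # Highest score first, mirroring "descending order of confidence".
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy data, invented for illustration.
training = {
    "houston": ["rockets game tonight", "howdy from the rodeo"],
    "seattle": ["rain again today", "coffee and rain"],
}
counts, vocab = train(training)
ranked = top_k_cities(["love the rodeo", "rockets win"], counts, vocab, k=2)
# houston ranks first on this toy data
```

A real system along the lines of the abstract would additionally filter the vocabulary to words with strong local geo-scope and smooth each word's city distribution over neighboring grid cells; both steps are left out here.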

altmetric score

  • 21.25

author list (cited authors)

  • Cheng, Z., Caverlee, J., & Lee, K.

citation count

  • 628

editor list (cited editors)

  • Huang, J., Koudas, N., Jones, G., Wu, X., Collins-Thompson, K., & An, A.

publication date

  • January 2010