Application of Kalman filters to identify unexpected change in blogs Conference Paper

abstract

  • Information on the Internet, especially blog content, changes rapidly. Users of information collections, such as the blogs hosted by technorati.com, have little, if any, control over the content or frequency of these changes. However, it is important for users to be able to monitor content for deviations in the expected pattern of change. If a user is interested in political blogs and a blog switches subjects to a literary review blog, the user would want to know of this change in behavior. Since pages may change too frequently for manual inspection for "unwanted" changes, an automated approach is wanted. In this paper, we explore methods for identifying unexpected change by using Kalman filters to model blog behavior over time. Using this model, we examine the history of several blogs and determine methods for flagging the significance of a blog's change from one time step to the next. We are able to predict large deviations in blog content, and allow user-defined sensitivity parameters to tune a statistical threshold of significance for deviation from expectation. Copyright 2008 ACM.
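
The abstract describes the approach only at a high level, so the following is a minimal sketch of the general idea rather than the authors' method. It assumes each blog snapshot has already been reduced to a single numeric score per time step (for example, a similarity to the blog's earlier content), models that score as a random walk with a scalar Kalman filter, and flags a time step when the observation lies more than `sensitivity` standard deviations from the filter's prediction. The function name, the random-walk model, and the parameters `process_var` and `measurement_var` are all illustrative assumptions, not details from the paper.

```python
import math


def flag_unexpected_changes(observations, sensitivity=3.0,
                            process_var=1e-3, measurement_var=1e-2):
    """Return the indices of observations that deviate from the Kalman
    prediction by more than `sensitivity` standard deviations.

    Assumption (not from the paper): the score follows a scalar random walk,
    so the predicted value is simply the previous filtered estimate.
    """
    if not observations:
        return []

    # Initialise the state from the first observation.
    estimate = observations[0]
    estimate_var = measurement_var
    flagged = []

    for t in range(1, len(observations)):
        # Predict step: the mean carries over, the uncertainty grows.
        predicted = estimate
        predicted_var = estimate_var + process_var

        # Innovation: how far the new observation is from the prediction.
        innovation = observations[t] - predicted
        innovation_var = predicted_var + measurement_var

        # Normalised deviation, compared against the user-chosen threshold.
        z = abs(innovation) / math.sqrt(innovation_var)
        if z > sensitivity:
            flagged.append(t)

        # Update step: blend prediction and observation via the Kalman gain.
        gain = predicted_var / innovation_var
        estimate = predicted + gain * innovation
        estimate_var = (1.0 - gain) * predicted_var

    return flagged


if __name__ == "__main__":
    # Hypothetical scores: stable for five steps, then an abrupt drop.
    scores = [0.80, 0.82, 0.79, 0.81, 0.80, 0.20, 0.22, 0.21]
    print(flag_unexpected_changes(scores, sensitivity=3.0))
```

Raising `sensitivity` makes the check more tolerant of change and lowering it makes it stricter, which corresponds to the user-defined sensitivity parameter the abstract mentions. Because the filter adapts gradually, the steps immediately after an abrupt shift may also be flagged until the estimate catches up.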

name of conference

  • Proceedings of the 8th ACM/IEEE-CS joint conference on Digital libraries

published proceedings

  • Proceedings of the 8th ACM/IEEE-CS joint conference on Digital libraries

author list (cited authors)

  • Bogen, P. L., Johnston, J., Karadkar, U. P., Furuta, R., & Shipman, F.

citation count

  • 5

complete list of authors

  • Bogen, Paul Logasa; Johnston, Joshua; Karadkar, Unmil P; Furuta, Richard; Shipman, Frank

publication date

  • June 2008