A Parameterized Approach to Spam-Resilient Link Analysis of the Web
Link-based analysis of the Web provides the basis for many important applications (such as Web search, Web-based data mining, and Web page categorization) that bring order to the massive amount of distributed Web content. Because these applications are so heavily relied upon, efforts to manipulate (or spam) the link structure of the Web are on the rise. In this manuscript, we present a parameterized framework for link analysis of the Web that promotes spam resilience through a source-centric view of the Web. We provide a rigorous study of the critical parameters that can impact source-centric link analysis and propose the novel notion of influence throttling for countering the influence of link-based manipulation. Through formal analysis and a large-scale experimental study, we show how different parameter settings may impact the time complexity, stability, and spam resilience of Web link analysis. Concretely, we find that the source-centric model supports more effective and robust rankings than existing Web algorithms such as PageRank. © 2009 IEEE.
IEEE Transactions on Parallel and Distributed Systems
author list (cited authors)
Caverlee, J., Webb, S., Liu, L., & Rouse, W. B.
complete list of authors
Caverlee, James; Webb, Steve; Liu, Ling; Rouse, William B.