DSphere: A Source-Centric Approach to Crawling, Indexing and Searching the World Wide Web Conference Paper

abstract

  • We describe DSPHERE, a decentralized system for crawling, indexing, searching and ranking documents on the World Wide Web. Unlike most existing search technologies, which depend heavily on a page-centric view of the Web, we advocate a source-centric view and propose a decentralized architecture for crawling, indexing and searching the Web in a distributed, source-specific fashion. A fully decentralized crawler is developed to crawl the World Wide Web, where each peer is assigned the responsibility of crawling a specific set of documents referred to as a source collection. Link analysis techniques are used for ranking documents. Traditional link analysis techniques suffer from problems such as slow refresh rates and vulnerability to Web spam. We propose a source-based link analysis approach, which computes fast and accurate ranking scores for all crawled documents. © 2007 IEEE.
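
The source-centric idea in the abstract can be illustrated with a small sketch. This is not the DSPHERE algorithm, which the abstract does not specify. It assumes, purely for illustration, that a "source" is a URL's host name and that sources are scored with an ordinary PageRank-style power iteration over source-to-source links. The names `source_of`, `source_graph` and `rank_sources`, and the `damping` and `iterations` parameters, are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlparse


def source_of(url):
    """Map a page URL to its source (assumption: the lower-cased host name)."""
    return urlparse(url).netloc.lower()


def source_graph(page_links):
    """Collapse page-level links into weighted source-to-source edges."""
    edges = defaultdict(lambda: defaultdict(float))
    sources = set()
    for src_url, dst_url in page_links:
        s, d = source_of(src_url), source_of(dst_url)
        sources.update((s, d))
        if s != d:  # links that stay inside one source are ignored
            edges[s][d] += 1.0
    return sources, edges


def rank_sources(page_links, damping=0.85, iterations=50):
    """PageRank-style power iteration over the source graph (illustrative)."""
    sources, edges = source_graph(page_links)
    n = len(sources)
    rank = {s: 1.0 / n for s in sources}
    for _ in range(iterations):
        nxt = {s: (1.0 - damping) / n for s in sources}
        for s in sources:
            out = edges.get(s)
            if not out:
                # Dangling source: spread its score uniformly over all sources.
                for t in sources:
                    nxt[t] += damping * rank[s] / n
                continue
            total = sum(out.values())
            for d, w in out.items():
                nxt[d] += damping * rank[s] * w / total
        rank = nxt
    return rank


if __name__ == "__main__":
    links = [
        ("http://a.example/1", "http://b.example/x"),
        ("http://a.example/2", "http://b.example/y"),
        ("http://c.example/", "http://b.example/"),
        ("http://b.example/", "http://a.example/"),
    ]
    for source, score in sorted(rank_sources(links).items()):
        print(source, round(score, 3))
```

Because the graph has one node per source rather than one per page, it is much smaller than a page-level graph, which is one plausible reason a source-level score could be refreshed quickly. How DSPHERE derives page scores from source-level information is not stated in the abstract and is not modelled here.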

name of conference

  • 2007 IEEE 23rd International Conference on Data Engineering

published proceedings

  • 2007 IEEE 23rd International Conference on Data Engineering

author list (cited authors)

  • Bamba, B., Liu, L., Caverlee, J., Padliya, V., Srivatsa, M., Bansal, T., ... Singh, A.

citation count

  • 2

complete list of authors

  • Bamba, B; Liu, Ling; Caverlee, J; Padliya, V; Srivatsa, M; Bansal, T; Palekar, M; Patrao, J; Li, Suiyang; Singh, A

editor list (cited editors)

  • Chirkova, R., Dogac, A., Özsu, M. T., & Sellis, T. K.

publication date

  • January 2007