Modeling Randomized Data Streams in Caching, Data Processing, and Crawling Applications Conference Paper

abstract

  • © 2015 IEEE. Many BigData applications (e.g., MapReduce, web caching, search in large graphs) process streams of random key-value records that follow highly skewed frequency distributions. In this work, we first develop stochastic models for the probability of encountering unique keys during exploration of such streams and for the growth rate of the number of unique keys over time. We then apply these models to the analysis of LRU caching, MapReduce overhead, and various crawl properties (e.g., node-degree bias, frontier size) in random graphs.

name of conference

  • 2015 IEEE Conference on Computer Communications (INFOCOM)

published proceedings

  • 2015 IEEE Conference on Computer Communications (INFOCOM)

author list (cited authors)

  • Ahmed, S. T., & Loguinov, D.

citation count

  • 1

complete list of authors

  • Ahmed, Sarker Tanzir; Loguinov, Dmitri

publication date

  • January 2015