Sampling Techniques for Large, Dynamic Graphs Conference Paper

abstract

  • Peer-to-peer systems are becoming increasingly popular, with millions of simultaneous users and a wide range of applications. Understanding existing systems and devising new peer-to-peer techniques relies on access to representative models derived from empirical observations. Due to the large and dynamic nature of these systems, directly capturing global behavior is often impractical. Sampling is a natural approach for learning about these systems, and most previous studies rely on it to collect data. This paper addresses the common problem of selecting representative samples of peer properties such as peer degree, link bandwidth, or the number of files shared. A good sampling technique will select any of the peers present with equal probability. However, common sampling techniques introduce bias in two ways. First, the dynamic nature of peers can bias results towards short-lived peers, much as naively sampling flows in a router can lead to bias towards short-lived flows. Second, the heterogeneous overlay topology can lead to bias towards high-degree peers. We present preliminary evidence suggesting that applying a degree-correction method to random walk-based peer selection leads to unbiased sampling, at the expense of a loss of efficiency.
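
The degree bias described in the abstract, and the effect of correcting for it, can be illustrated with a small sketch. The abstract does not say which degree-correction method the authors apply; the code below assumes the Metropolis-Hastings acceptance rule, a standard way to make a random walk's stationary distribution uniform. The toy star graph, the function names, and the step count are all hypothetical, and the graph is static, so this does not model peer churn or the bias towards short-lived peers.

```python
import random
from collections import Counter


def simple_walk_step(graph, node, rng):
    """Plain random walk: move to a uniformly chosen neighbor.

    On an undirected graph its stationary distribution is proportional
    to node degree, which is the source of the high-degree bias.
    """
    return rng.choice(graph[node])


def degree_corrected_step(graph, node, rng):
    """Metropolis-Hastings step (assumed correction, not confirmed by the source).

    Propose a uniform neighbor, accept with probability
    min(1, deg(node) / deg(candidate)), otherwise stay put.
    Rejected moves are the efficiency cost of removing the bias.
    """
    candidate = rng.choice(graph[node])
    accept = min(1.0, len(graph[node]) / len(graph[candidate]))
    return candidate if rng.random() < accept else node


def visit_frequencies(graph, step, num_steps, seed=0):
    """Run one walk and return the fraction of steps spent at each node."""
    rng = random.Random(seed)
    node = next(iter(graph))  # arbitrary starting node
    visits = Counter()
    for _ in range(num_steps):
        node = step(graph, node, rng)
        visits[node] += 1
    return {n: visits[n] / num_steps for n in graph}


# Hypothetical toy overlay: one hub (node 0) linked to five leaves.
STAR = {0: [1, 2, 3, 4, 5], 1: [0], 2: [0], 3: [0], 4: [0], 5: [0]}

if __name__ == "__main__":
    for name, step in [("simple", simple_walk_step),
                       ("degree-corrected", degree_corrected_step)]:
        freqs = visit_frequencies(STAR, step, 200_000)
        print(name, {n: round(f, 3) for n, f in freqs.items()})
```

In this toy graph the plain walk spends about half its steps at the hub, while the corrected walk visits all six nodes at close to the same rate (1/6 each). The correction works by rejecting most moves from a leaf to the hub, so the walk often stays in place, which is one way to see the loss of efficiency the abstract mentions.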

name of conference

  • Proceedings IEEE INFOCOM 2006. 25TH IEEE International Conference on Computer Communications

published proceedings

  • Proceedings IEEE INFOCOM 2006. 25TH IEEE International Conference on Computer Communications

author list (cited authors)

  • Stutzbach, D., Rejaie, R., Duffield, N., Sen, S., & Willinger, W.

citation count

  • 27

complete list of authors

  • Stutzbach, Daniel; Rejaie, Reza; Duffield, Nick; Sen, Subhabrata; Willinger, Walter

publication date

  • April 2006