Sampling for Approximate Bipartite Network Projection Conference Paper uri icon

abstract

  • Bipartite graphs manifest as a stream of edges that represent transactions, e.g., purchases by retail customers. Recommender systems employ neighborhood-based measures of node similarity, such as the pairwise number of common neighbors (CN) and related metrics. While the number of node pairs that share neighbors is potentially enormous, only a relatively small proportion of them have many common neighbors. This motivates finding a weighted sampling approach to preferentially sample these node pairs. This paper presents a new sampling algorithm that provides a fixed size unbiased estimate of the similarity matrixresulting from a bipartite edge stream projection. The algorithm has two components. First, it maintains a reservoir of sampled bipartite edges with sampling weights that favor selection of high similarity nodes. Second, arriving edges generate a stream of similarity updates, based on their adjacency with the current sample. These updates are aggregated in a second reservoir sample-based stream aggregator to yield the final unbiased estimate. Experiments on real world graphs show that a 10% sample at each stage yields estimates of high similarity edges with weighted relative errors of about 1%.

name of conference

  • Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence

published proceedings

  • PROCEEDINGS OF THE TWENTY-SEVENTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE

author list (cited authors)

  • Ahmed, N. K., Duffield, N., & Xia, L.

citation count

  • 4

complete list of authors

  • Ahmed, Nesreen K||Duffield, Nick||Xia, Liangzhen

publication date

  • July 2018