Scaling Techniques for Massive Scale-Free Graphs in Distributed (External) Memory Conference Paper uri icon

abstract

  • We present techniques to process large scale-free graphs in distributed memory. Our aim is to scale to trillions of edges, and our research is targeted at leadership class supercomputers and clusters with local non-volatile memory, e.g., NAND Flash. We apply an edge list partitioning technique, designed to accommodate high-degree vertices (hubs) that create scaling challenges when processing scale-free graphs. In addition to partitioning hubs, we use ghost vertices to represent the hubs to reduce communication hotspots. We present a scaling study with three important graph algorithms: Breadth-First Search (BFS), K-Core decomposition, and Triangle Counting. We also demonstrate scalability on BG/P Intrepid by comparing to best known Graph500 results. We show results on two clusters with local NVRAM storage that are capable of traversing trillion-edge scale-free graphs. By leveraging node-local NAND Flash, our approach can process thirty-two times larger datasets with only a 39% performance degradation in Traversed Edges Per Second (TEPS). 2013 IEEE.

name of conference

  • 2013 IEEE 27th International Symposium on Parallel and Distributed Processing

published proceedings

  • IEEE 27TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS 2013)

author list (cited authors)

  • Pearce, R., Gokhale, M., & Amato, N. M.

citation count

  • 45

complete list of authors

  • Pearce, Roger||Gokhale, Maya||Amato, Nancy M

publication date

  • May 2013