Bandwidth-efficient on-chip interconnect designs for GPGPUs (Conference Paper)

abstract

  • © 2015 ACM. Modern computational workloads require abundant thread-level parallelism (TLP), necessitating highly parallel, many-core accelerators such as General-Purpose Graphics Processing Units (GPGPUs). GPGPUs place a heavy demand on the on-chip interconnect between the many cores and the few memory controllers (MCs). The resulting traffic is highly asymmetric, impacting on-chip resource utilization and system performance. Here, we analyze the communication demands of typical GPGPU applications and propose efficient Network-on-Chip (NoC) designs to meet those demands. We show that the proposed schemes improve performance by up to 64.7%. Compared to the best-of-class prior work, our VC monopolizing and partitioning schemes improve performance by 25%.
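
  The abstract mentions virtual-channel (VC) monopolizing and partitioning schemes but does not detail how they work. The sketch below is a minimal, hypothetical illustration of the general VC-partitioning idea in a NoC router input port, where VCs are split between the two asymmetric GPGPU traffic classes (many-to-few requests from cores to MCs, few-to-many replies from MCs to cores). All names, the VC count, and the 2/6 split are illustrative assumptions, not the authors' actual design.

    // Illustrative sketch (not the paper's implementation): a router input port
    // that partitions its virtual channels between request and reply traffic,
    // giving the heavier reply class the larger share of VCs.
    #include <array>
    #include <cstdio>
    #include <optional>

    enum class TrafficClass { Request, Reply };

    constexpr int kNumVCs       = 8;  // hypothetical VC count per input port
    constexpr int kReplyVCStart = 2;  // VCs [0,2) for requests, [2,8) for replies

    struct VirtualChannel {
        bool busy = false;
    };

    class InputPort {
        std::array<VirtualChannel, kNumVCs> vcs_;
    public:
        // Allocate a free VC from the partition reserved for the given class.
        std::optional<int> allocate(TrafficClass cls) {
            const int begin = (cls == TrafficClass::Request) ? 0 : kReplyVCStart;
            const int end   = (cls == TrafficClass::Request) ? kReplyVCStart : kNumVCs;
            for (int v = begin; v < end; ++v) {
                if (!vcs_[v].busy) {
                    vcs_[v].busy = true;
                    return v;
                }
            }
            return std::nullopt;  // no free VC in this partition; the packet stalls
        }
        void release(int v) { vcs_[v].busy = false; }
    };

    int main() {
        InputPort port;
        auto req = port.allocate(TrafficClass::Request);
        auto rep = port.allocate(TrafficClass::Reply);
        std::printf("request VC: %d, reply VC: %d\n",
                    req.value_or(-1), rep.value_or(-1));
        return 0;
    }

  Keeping the two classes in disjoint VC partitions prevents the bursty reply traffic from starving request packets at routers near the memory controllers, which is the kind of asymmetry the abstract describes.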

author list (cited authors)

  • Jang, H., Kim, J., Gratz, P., Yum, K. H., & Kim, E. J.

citation count

  • 46

publication date

  • June 2015