Fast, Ring-Based Design of 3D Stacked DRAM Conference Paper

abstract

  • © 2017 IEEE. As computer memory increases in size and processors continue to get faster, the memory subsystem becomes an increasingly severe bottleneck to system performance. To mitigate relatively slow DRAM chip speeds, a new generation of 3D stacked DRAM is being developed, with lower power consumption and higher bandwidth. This paper proposes the use of 3D ring-based data fabrics for fast data transfer between these chips. The ring-based data fabric uses a fast standing wave oscillator to clock its transactions. With a fast clocking scheme and multiple channels sharing the same bus, more channels can be utilized while the number of through-silicon vias (TSVs) is significantly reduced. Experimental results show that our ring-based data fabric can reduce read latencies by almost 4X compared to traditional stacked memory chips. Variations of our scheme can also reduce power consumption compared to traditional memory stacks. Our Memory Architecture using a Ring-based Scheme (MARS) can effectively trade off power, throughput, and latency to improve system performance for different application spaces. We show that our MARS variants can deliver better latency (up to ~4X), power (up to ~8X), and performance per watt (up to ~4X) than HBM, when averaged over 11 SPEC CPU 2006 benchmarks. Other MARS variants provide higher throughput than Wide I/O memory at similar power consumption.

author list (cited authors)

  • Douglass, A. J., & Khatri, S. P.

citation count

  • 1

publication date

  • November 2017

publisher