A Decoupled KILO-Instruction Processor Conference Paper

abstract

Building processors with large instruction windows has been proposed as a mechanism for overcoming the memory wall, but finding a feasible and implementable design has been an elusive goal. Traditional processors are composed of structures that do not scale to large instruction windows because of timing and power constraints. However, the behavior of programs executed with large instruction windows gives rise to a natural and simple alternative to scaling. We characterize this phenomenon of execution locality and propose a microarchitecture to exploit it to achieve the benefit of a large instruction window processor with low implementation cost. Execution locality is the tendency of instructions to exhibit high or low latency based on their dependence on memory operations. In this paper we propose a decoupled microarchitecture that executes low latency instructions on a Cache Processor and high latency instructions on a Memory Processor. We demonstrate that such a design, using small structures and many in-order components, can achieve the same performance as much more aggressive proposals while minimizing design complexity. © 2006 IEEE.
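
The split described in the abstract can be illustrated with a toy classifier. This is only a sketch under stated assumptions, not the paper's mechanism: the `Instr` record, the `misses_cache` flag, and the `partition` function are hypothetical names introduced here, and placing the missing load itself on the memory side is a simplification of the sketch. An instruction is treated as high latency if it is a load that misses the cache or if it reads a register produced by a high-latency instruction.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction record used only for this illustration."""
    name: str
    dest: str
    srcs: Tuple[str, ...] = ()
    misses_cache: bool = False  # assumed flag: a load served by main memory


def partition(program: List[Instr]) -> Tuple[List[str], List[str]]:
    """Split a program, in order, into low- and high-latency instructions.

    High latency propagates through register dependences: any instruction
    reading a register written by a high-latency instruction is high latency.
    """
    slow_regs: Set[str] = set()
    cache_side: List[str] = []   # low latency: candidates for the Cache Processor
    memory_side: List[str] = []  # high latency: candidates for the Memory Processor
    for ins in program:
        high_latency = ins.misses_cache or any(s in slow_regs for s in ins.srcs)
        if high_latency:
            slow_regs.add(ins.dest)
            memory_side.append(ins.name)
        else:
            # A fast write overwrites any earlier slow value in this register.
            slow_regs.discard(ins.dest)
            cache_side.append(ins.name)
    return cache_side, memory_side


program = [
    Instr("load_a", "r1", ("r0",), misses_cache=True),
    Instr("add_b", "r2", ("r1",)),
    Instr("load_c", "r3", ("r0",)),
    Instr("mul_d", "r4", ("r3", "r3")),
    Instr("sub_e", "r5", ("r2", "r4")),
]
cache_side, memory_side = partition(program)
print("cache side: ", cache_side)
print("memory side:", memory_side)
```

In this example the cache-hitting load and its dependent multiply stay on the low-latency side, while the missing load and everything that transitively depends on it move to the high-latency side.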

name of conference

The Twelfth International Symposium on High-Performance Computer Architecture, 2006.

authors

Jimenez, Daniel

published proceedings

The Twelfth International Symposium on High-Performance Computer Architecture, 2006.

author list (cited authors)

Pericàs, M., Cristal, A., González, R., Jiménez, D. A., & Valero, M.

citation count

21

complete list of authors

Pericàs, Miquel||Cristal, Adrian||González, Ruben||Jiménez, Daniel A||Valero, Mateo

publication date

January 2006

publisher

Institute of Electrical and Electronics Engineers (IEEE)

published in

Proceedings - International Symposium on High-Performance Computer Architecture