The impact of delay on the design of branch predictors Conference Paper

abstract

  • Modern microprocessors employ increasingly complicated branch predictors to achieve instruction fetch bandwidth that is sufficient for wide out-of-order execution cores. While existing predictors can still be accessed in a single clock cycle, recent studies show that slower wires and faster clock rates will require multi-cycle access times to large on-chip structures, such as branch prediction tables. Thus, future branch predictors must consider not only area and accuracy, but also delay. This paper explores these tradeoffs in designing branch predictors and shows that increased accuracy alone cannot overcome the penalties in delay that arise with larger predictor structures. We evaluate three schemes for accommodating delay: a caching approach, an overriding approach, and a cascading lookahead approach. While we use a common branch predictor, gshare, as the prediction component, these schemes can be constructed using most types of predictors.
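For readers unfamiliar with gshare, the prediction component named in the abstract, the following is a minimal sketch of the general technique: a table of 2-bit saturating counters indexed by the branch address XORed with a global history register. The class name `Gshare`, the `index_bits` parameter, the default table size, and the initial counter value are illustrative assumptions, not details taken from the paper, and the sketch models neither access delay nor any of the three schemes the paper evaluates.

```python
class Gshare:
    """Illustrative gshare sketch (hypothetical names and sizes)."""

    def __init__(self, index_bits=12):
        # Assumption: table size and history length are both index_bits wide.
        self.mask = (1 << index_bits) - 1
        # 2-bit saturating counters, 0..3; start at 1 (weakly not-taken).
        self.table = [1] * (1 << index_bits)
        # Global history of recent branch outcomes, newest in the low bit.
        self.history = 0

    def _index(self, pc):
        # The defining gshare step: XOR the branch address with the history.
        return (pc ^ self.history) & self.mask

    def predict(self, pc):
        # Counter values 2 and 3 mean "predict taken".
        return self.table[self._index(pc)] >= 2

    def update(self, pc, taken):
        # Train the counter that produced the prediction, then shift the
        # actual outcome into the global history register.
        i = self._index(pc)
        if taken:
            self.table[i] = min(3, self.table[i] + 1)
        else:
            self.table[i] = max(0, self.table[i] - 1)
        self.history = ((self.history << 1) | int(taken)) & self.mask
```

In this sketch, a larger `index_bits` gives a larger table with less aliasing between branches, which is the accuracy side of the tradeoff the abstract describes; the delay cost of accessing that larger table is what the sketch leaves out.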

name of conference

  • Proceedings 33rd Annual IEEE/ACM International Symposium on Microarchitecture. MICRO-33 2000

published proceedings

  • 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture

author list (cited authors)

  • Jimenez, D. A., Keckler, S. W., & Lin, C.

citation count

  • 10

complete list of authors

  • Jimenez, DA; Keckler, SW; Lin, C

publication date

  • January 2000