Sharma, Prabal (2013-08). A Branch Predictor Directed Data Cache Prefetcher for Out-of-order and Multicore Processors. Master's Thesis. Thesis uri icon

abstract

  • Modern superscalar pipelines have tremendous capacity to consume the instruction stream. This has been possible owing to improvements in process technology, technology scaling and microarchitectural design improvements that allow programs to speculate past control and data dependencies in the superscalar architecture. However, the speed of the memory subsystem lags behind due to physical constraints in bringing in huge amounts of data to the processor core. Cache hierarchies have subdued the impact of this speed gap; however, there is much that can be still done in improving microarchitecture. Data prefetching techniques bring in memory content significantly before the instruction stream actually witnesses demand misses. However, a majority of the techniques proposed so far depend upon an initial demand miss that initiates a stream of previously identified prefetches. In this thesis, we propose a novel prefetching algorithm, which leverages branch prediction to facilitate deep memory system speculation. The branch predictor directed lookahead mechanism builds a speculative control flow path for the instruction stream about to be fetched by the main superscalar pipeline. Prefetches are generated along this speculative path from a condensed representation of the memory instructions, leveraging register index based correlation. The technique integrates eloquently with the main pipeline's branch predictor to filter out prefetches along invalid speculative paths. Impact of the prefetching scheme is analyzed using out- of-order model of the Gem5 cycle accurate simulator. Evaluation shows that on a set of 13 memory intensive SPEC CPU2006 benchmarks, our prefetching technique improves performance by an average of 5.6% over the baseline out-of-order processor.
  • Modern superscalar pipelines have tremendous capacity to consume the instruction stream. This has been possible owing to improvements in process technology, technology scaling and microarchitectural design improvements that allow programs to speculate past control and data dependencies in the superscalar architecture. However, the speed of the memory subsystem lags behind due to physical constraints in bringing in huge amounts of data to the processor core. Cache hierarchies have subdued the impact of this speed gap; however, there is much that can be still done in improving microarchitecture. Data prefetching techniques bring in memory content significantly before the instruction stream actually witnesses demand misses. However, a majority of the techniques proposed so far depend upon an initial demand miss that initiates a stream of previously identified prefetches.

    In this thesis, we propose a novel prefetching algorithm, which leverages branch prediction to facilitate deep memory system speculation. The branch predictor directed lookahead mechanism builds a speculative control flow path for the instruction stream about to be fetched by the main superscalar pipeline. Prefetches are generated along this speculative path from a condensed representation of the memory instructions, leveraging register index based correlation. The technique integrates eloquently with the main pipeline's branch predictor to filter out prefetches along invalid speculative paths. Impact of the prefetching scheme is analyzed using out- of-order model of the Gem5 cycle accurate simulator. Evaluation shows that on a set of 13 memory intensive SPEC CPU2006 benchmarks, our prefetching technique improves performance by an average of 5.6% over the baseline out-of-order processor.

publication date

  • August 2013