Abstract

Hardware data prefetching is a latency-hiding technique that mitigates the memory wall problem by fetching data blocks into caches before the processor demands them. High-performing state-of-the-art data prefetchers, however, increase both dynamic and static energy in the memory hierarchy because they increase the number of memory requests. A straightforward way to improve the energy efficiency of hardware prefetchers is to prefetch only for instructions on the critical path of execution. Because criticality-based data prefetching does not degrade performance significantly, it is an ideal approach to solving the energy-efficiency problem. We discuss the limitations of existing critical-instruction detection techniques and propose a new technique that uses re-order buffer occupancy as a metric to detect critical instructions and performs prefetcher-specific threshold tuning. With our detector, we achieve maximum memory hierarchy energy savings of 12.3% with 1.4% higher performance for PPF, and the following averages: (i) SPEC CPU 2017 benchmarks: 2.04% lower energy and 0.3% lower performance for IPCP at L1D; (ii) client/server benchmarks: 4.7% lower energy and 0.15% lower performance for PPF; (iii) Cloudsuite benchmarks: 2.99% lower energy and 0.36% higher performance for IPCP at L1D. IPCP and PPF are state-of-the-art data prefetchers.
