Hierarchical prefetching: A software-hardware instruction prefetcher for server applications

Tingji Zhang, Boris Grot, Wenjian He, Yashuai Lv, Peng Qu*, Fang Su, Wenxin Wang, Guowei Zhang, Xuefeng Zhang, Youhui Zhang*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

The large working set of instructions in server-side applications causes a significant bottleneck in the front-end, even for high-performance processors equipped with fetch-directed instruction prefetching (FDIP). Prefetchers specifically designed for server scenarios typically rely on a record-and-replay mechanism that exploits the repetitiveness of instruction sequences. However, the efficacy of these techniques is compromised by discrepancies between actual and predicted control flows, resulting in loss of coverage and timeliness. This paper proposes Hierarchical Prefetching, a novel approach that tackles the limitations of existing prefetchers. It identifies common coarse-grained functionality blocks (called Bundles) within the server code and prefetches them as a whole. Bundles are significantly larger than typical prefetch targets, encompassing tens to hundreds of kilobytes of code. The approach combines simple software analysis of code for bundle formation and light-weight hardware for record-and-replay prefetching. The prefetcher requires under 2KB of on-chip storage by keeping most of the metadata in main memory. Experiments with 11 popular server workloads reveal that Hierarchical Prefetching significantly improves miss coverage and timeliness over prior techniques, achieving a 6.6% average performance gain over FDIP.
Original language: English
Title of host publication: Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems
Place of Publication: New York, NY, USA
Publisher: Association for Computing Machinery (ACM)
Pages: 529-544
Number of pages: 16
Volume: 2
ISBN (Electronic): 9798400710797
Publication status: Published - 30 Mar 2025
Event: The 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems - Postillion Hotel & Convention Center WTC Rotterdam, Rotterdam, Netherlands
Duration: 30 Mar 2025 to 3 Apr 2025
Conference number: 30
https://www.asplos-conference.org/asplos2025/

Conference

Conference: The 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems
Abbreviated title: ASPLOS '25
Country/Territory: Netherlands
City: Rotterdam
Period: 30/03/25 to 3/04/25

Keywords / Materials (for Non-textual outputs)

  • front-end bottleneck
  • instruction prefetching
  • microarchitecture
