Abstract / Description of output
The effort to reduce address translation overheads has typically targeted data accesses since they constitute the overwhelming portion of the second-level TLB (STLB) misses in desktop and HPC applications. The address translation cost of instruction accesses has been relatively neglected due to historically small instruction footprints. However, state-of-the-art datacenter and server applications feature massive instruction footprints owing to deep software stacks, resulting in high STLB miss rates for instruction accesses.
This paper demonstrates that instruction address translation is a performance bottleneck in server workloads. In response, we propose Morrigan, a microarchitectural instruction STLB prefetcher whose design is based on new insights regarding instruction STLB misses. At the core of Morrigan is an ensemble of table-based Markov prefetchers that build and store variable-length Markov chains out of the instruction STLB miss stream. Morrigan further employs a sequential prefetcher and a scheme that exploits page table locality to maximize miss coverage. An important contribution of the work is showing that access frequency is more important than access recency when choosing replacement candidates. Based on this insight, Morrigan introduces a new replacement policy that identifies victims in the Markov prefetchers using a frequency stack while adapting to phase-change behavior. On a set of 45 industrial server workloads, Morrigan eliminates 69% of the memory references in demand page walks triggered by instruction STLB misses and improves geometric mean performance by 7.6%.
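The idea of a table-based Markov prefetcher with frequency-based replacement can be illustrated with a small software model. The sketch below is not Morrigan's design: it assumes a single first-order table rather than an ensemble of variable-length chains, omits the sequential prefetcher, the page table locality scheme, and phase-change adaptation, and uses a plain per-entry miss counter in place of the frequency stack. The class name `MarkovTLBPrefetcher`, the `capacity` and `successors` parameters, and the page numbers are hypothetical and chosen only for illustration.

```python
from collections import Counter


class MarkovTLBPrefetcher:
    """Toy first-order Markov prefetcher over a stream of missing page numbers.

    Illustrative only: table organization and sizes are assumptions,
    not the structure described in the paper.
    """

    def __init__(self, capacity=8, successors=2):
        if capacity < 2:
            raise ValueError("capacity must be at least 2")
        self.capacity = capacity      # max tracked pages (hypothetical size)
        self.successors = successors  # max prefetch candidates per miss
        self.table = {}               # page -> Counter of pages that missed next
        self.freq = Counter()         # page -> number of misses seen while tracked
        self.prev = None              # page of the previous miss

    def _evict(self, protect):
        # Frequency-based victim selection: drop the tracked page with the
        # fewest recorded misses, never the entry that is about to be trained.
        candidates = (p for p in self.table if p != protect)
        victim = min(candidates, key=lambda p: self.freq[p])
        del self.table[victim]
        del self.freq[victim]

    def on_miss(self, page):
        """Record a miss on `page` and return the pages to prefetch."""
        if page not in self.table:
            if len(self.table) >= self.capacity:
                self._evict(protect=self.prev)
            self.table[page] = Counter()
        self.freq[page] += 1
        if self.prev is not None:
            # Train: `page` followed `prev` in the miss stream.
            self.table[self.prev][page] += 1
        self.prev = page
        # Predict: the most frequently observed successors of `page`.
        return [p for p, _ in self.table[page].most_common(self.successors)]


# A repeating miss pattern is learned after one pass.
learner = MarkovTLBPrefetcher()
predictions = [learner.on_miss(p) for p in [1, 2, 3, 1, 2, 3, 1]]

# With three entries, the frequently missed page 1 survives even though
# it is the least recently touched entry when page 4 arrives.
small = MarkovTLBPrefetcher(capacity=3)
for p in [1, 1, 1, 2, 3, 4]:
    small.on_miss(p)
```

In the second example a recency-based policy would evict page 1, the least recently touched entry, while the frequency-based choice keeps it and evicts page 2 instead. This is the kind of difference the abstract's frequency-versus-recency observation refers to, shown here only in a simplified form.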
Original language | English |
---|---|
Title of host publication | MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture |
Place of Publication | New York, NY, United States |
Publisher | Association for Computing Machinery (ACM) |
Pages | 1138-1153 |
Number of pages | 16 |
ISBN (Electronic) | 9781450385572 |
DOIs | |
Publication status | Published - 17 Oct 2021 |
Event | 54th IEEE/ACM International Symposium on Microarchitecture - Online, Athens, Greece Duration: 18 Oct 2021 → 22 Oct 2021 https://www.microarch.org/micro54/index.php |
Conference
Conference | 54th IEEE/ACM International Symposium on Microarchitecture |
---|---|
Abbreviated title | MICRO 2021 |
Country/Territory | Greece |
City | Athens |
Period | 18/10/21 → 22/10/21 |
Internet address |
Keywords / Materials (for Non-textual outputs)
- virtual memory
- address translation
- translation lookaside buffer
- TLB prefetching
- TLB management
- Markov prefetching