Abstract
The rising complexity of large-scale heterogeneous architectures, such as those composed of off-the-shelf processors coupled with fixed-function logic, poses challenges for traditional simulation methodologies. While prior work has explored trace-based simulation techniques that offer good trade-offs between simulation accuracy and speed, most such proposals are limited to simulating chip multiprocessors (CMPs) with up to hundreds of threads. There is a gap for a framework that can flexibly and accurately model different heterogeneous systems and that scales to a larger number of cores.
We implement a solution called HETSIM, a trace-driven, synchronization- and dependency-aware framework for fast and accurate pre-silicon performance and power estimation for heterogeneous systems with up to thousands of cores. HETSIM operates in four stages: compilation, emulation, trace generation, and trace replay. Given (i) a specification file, (ii) a multi-threaded implementation of the target application, and (iii) an architectural and power model of the target hardware, HETSIM generates performance and power estimates with no further user intervention. HETSIM distinguishes itself from existing approaches by emulating target hardware functionality as software primitives. HETSIM is packaged with primitives that are commonplace across many accelerator designs, and the framework can easily be extended to support custom primitives.
We demonstrate the utility of HETSIM through design-space exploration on two recent target architectures: (i) a reconfigurable many-core accelerator, and (ii) a heterogeneous, domain-specific accelerator. Overall, HETSIM demonstrates simulation time speedups of 3.2× to 10.4× (average 5.0×) over gem5 in syscall emulation mode, with average deviations in simulated time and power consumption of 15.1% and 10.9%, respectively. HETSIM is validated against silicon for the second target and estimates performance within a deviation of 25.5%, on average.
Index Terms: architectural simulation, trace-driven simulation, binary instrumentation, heterogeneous architectures
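To make the idea of synchronization-aware trace replay concrete, the following is a minimal Python sketch. It is not HETSIM's implementation: the trace format (`("compute", cycles)` and `("barrier", id)` entries), the function name `replay`, and the rule that every thread takes part in every barrier exactly once are all assumptions made for illustration. It shows only how per-thread traces can be replayed so that a thread arriving early at a barrier is held until the slowest thread arrives.

```python
from collections import defaultdict


def replay(traces):
    """Replay hypothetical per-thread traces and return each thread's finish time.

    Each trace is a list of ("compute", cycles) or ("barrier", barrier_id)
    entries. Assumption: every thread takes part in every barrier, and each
    barrier id is used once.
    """
    n = len(traces)
    clocks = [0] * n              # local simulated time of each thread
    cursors = [0] * n             # index of the next entry in each trace
    arrivals = defaultdict(dict)  # barrier id -> {thread id: arrival time}

    progressed = True
    while progressed:
        progressed = False
        for tid, trace in enumerate(traces):
            while cursors[tid] < len(trace):
                kind, arg = trace[cursors[tid]]
                if kind == "compute":
                    clocks[tid] += arg
                elif kind == "barrier":
                    arrivals[arg][tid] = clocks[tid]
                    if len(arrivals[arg]) < n:
                        break  # stall until all threads have arrived
                    # Release at the arrival time of the slowest thread.
                    clocks[tid] = max(arrivals[arg].values())
                else:
                    raise ValueError(f"unknown trace entry: {kind!r}")
                cursors[tid] += 1
                progressed = True

    if any(cursors[tid] < len(traces[tid]) for tid in range(n)):
        raise RuntimeError("deadlock: some thread never reached a barrier")
    return clocks


example = [
    [("compute", 10), ("barrier", 0), ("compute", 5)],
    [("compute", 30), ("barrier", 0), ("compute", 1)],
]
finish_times = replay(example)
print(finish_times)
```

In this example the first thread reaches the barrier at cycle 10 but is released only at cycle 30, when the second thread arrives, so the replayed finish times reflect the synchronization rather than the sum of each thread's own compute entries.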
Original language | English |
---|---|
Title of host publication | 2020 IEEE International Symposium on Workload Characterization (IISWC) |
Publisher | Institute of Electrical and Electronics Engineers |
Pages | 13-24 |
Number of pages | 12 |
ISBN (Electronic) | 978-1-7281-7645-1 |
ISBN (Print) | 978-1-7281-7646-8 |
Publication status | Published - 19 Nov 2020 |
Event | 2020 IEEE International Symposium on Workload Characterization, Virtual Conference. Duration: 27 Oct 2020 → 28 Oct 2020. http://www.iiswc.org/iiswc2020/index.html |
Symposium
Symposium | 2020 IEEE International Symposium on Workload Characterization |
---|---|
Abbreviated title | IISWC 2020 |
City | Virtual Conference |
Period | 27/10/20 → 28/10/20 |
Internet address | http://www.iiswc.org/iiswc2020/index.html |
Keywords / Materials (for Non-textual outputs)
- architectural simulation
- trace-driven simulation
- binary instrumentation
- heterogeneous architectures