Abstract
Computational memory (CM) is a promising approach for accelerating inference on neural networks (NN) by using enhanced memories that, in addition to storing data, allow computations on it. One of the main challenges of this approach is defining a hardware/software interface that allows a compiler to map NN models for efficient execution on the underlying CM accelerator. This is a non-trivial task because efficiency dictates that the CM accelerator be explicitly programmed as a dataflow engine in which the execution of the different NN layers forms a pipeline. In this paper, we present our work towards a software stack for executing ML models on such a multi-core CM accelerator. We describe an architecture for the hardware and software, and focus on the problem of implementing the appropriate control logic so that data dependencies are respected. We propose a solution to the latter that is based on polyhedral compilation.
Original language | English |
---|---|
Number of pages | 8 |
Publication status | Published - 27 Apr 2020 |
Event | The 10th Workshop on Systems for Post-Moore Architectures - Heraklion, Greece; Duration: 27 Apr 2020 → 27 Apr 2020; Conference number: 10 |
Workshop
Workshop | The 10th Workshop on Systems for Post-Moore Architectures |
---|---|
Abbreviated title | SPMA 2020 |
Country/Territory | Greece |
City | Heraklion |
Period | 27/04/20 → 27/04/20 |