Micro-core architectures combine many low memory, low power computing cores together in a single package. These can be used as a co-processor or standalone but due to limited on-chip memory and esoteric nature of the hardware, writing efficient parallel codes for them is challenging. We previously developed ePython, a low memory (24Kb) implementation of Python supporting the rapid development of parallel Python codes and education for these architectures. In this poster we present our work on an offload abstraction to support the use of micro-cores as an accelerator. Programmers decorate specific functions in their Python code, running under any Python interpreter on the host, with our underlying technology then responsible for the low-level data movement, scheduling and execution of kernels on the micro-cores. Aimed at education and fast prototyping, a machine learning code for detecting lung cancer, where computational kernels are offloaded to micro-cores, is used to illustrate the approach.
- Tools for CSE/HPC/DA Education
- Programming Environments/Languages for CSE/HPC/DA Education
- Tools for parallel program development (e.g., debuggers, and integrated development environments)
- Parallel programming languages, libraries, models and notations
- Runtime systems as they interact with programming systems