Abstract / Description of output
In task-based models, rethinking parallelism in terms of tasks reduces synchronisation and decouples the management of parallelism from computation. However, existing models typically rely on shared memory, where the programmer expresses the input and output dependencies of tasks in terms of variables. Executing these over large-scale distributed-memory machines then requires complex support in the runtime system, over which the programmer has no control. We propose an alternative approach where the programmer still works with the concept of tasks but is explicitly aware of the distributed nature of their code and drives interactions through events. Tasks are scheduled and depend upon a number of events arriving, which may originate from tasks running remotely or locally, before they can execute. Events are explicitly “fired” to a target by the programmer with an associated identifier, which is used to match up dependencies, and optionally contain data which tasks can process. This enables the programmer to write large-scale task-based codes, still abstracted from the mechanism of parallelism but with a general understanding of how their system is interacting, which is useful for optimisations such as locality. Furthermore, as the entire state of the code can be expressed as outstanding events and scheduled tasks, this enables ACID compliance, which provides resilience. Our approach works especially well for parallel codes that contain irregular communication patterns, such as in-situ data analytics, and can be applied incrementally to existing MPI-based codes to break apart the bulk-synchronous nature of the communications.
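The event-driven scheduling described in the abstract can be illustrated with a minimal, single-process sketch. This is not the authors' implementation or API: the class `EventRuntime`, its methods `schedule` and `fire`, and the integer "ranks" used as targets are all hypothetical names chosen for illustration, and a real system would deliver events between distributed processes rather than within one Python object. It also assumes each task lists a given event identifier at most once.

```python
from collections import defaultdict, deque


class EventRuntime:
    """Toy, single-process stand-in for an event-driven task runtime (hypothetical)."""

    def __init__(self):
        # The whole program state: scheduled tasks plus outstanding events.
        self.pending_tasks = []                 # (target, fn, event_ids)
        self.outstanding = defaultdict(deque)   # (target, event_id) -> payloads
        self._running = False

    def schedule(self, target, fn, event_ids):
        """Schedule fn on `target`; it runs once all listed events have arrived."""
        self.pending_tasks.append((target, fn, list(event_ids)))
        self._progress()

    def fire(self, target, event_id, data=None):
        """Fire an event, with optional data, at `target`."""
        self.outstanding[(target, event_id)].append(data)
        self._progress()

    def _is_ready(self, task):
        target, _, ids = task
        return all(self.outstanding[(target, i)] for i in ids)

    def _progress(self):
        """Run tasks whose event dependencies are all satisfied."""
        if self._running:  # a task fired or scheduled from inside a task
            return
        self._running = True
        try:
            while True:
                ready = next((t for t in self.pending_tasks if self._is_ready(t)), None)
                if ready is None:
                    break
                self.pending_tasks.remove(ready)
                target, fn, ids = ready
                # Consume exactly one matching event per dependency.
                payloads = {i: self.outstanding[(target, i)].popleft() for i in ids}
                fn(self, target, payloads)
        finally:
            self._running = False


log = []
rt = EventRuntime()


def summarise(rt, rank, events):
    total = events["a"] + events["b"]
    log.append((rank, total))
    rt.fire(1, "sum", total)  # send the result to "rank" 1 as a new event


def report(rt, rank, events):
    log.append((rank, events["sum"] * 2))


rt.schedule(0, summarise, ["a", "b"])  # waits for events "a" and "b"
rt.schedule(1, report, ["sum"])        # waits for event "sum"
rt.fire(0, "a", 3)
rt.fire(0, "b", 4)
print(log)
```

In this sketch the only state is `pending_tasks` and `outstanding`, which mirrors the abstract's observation that the state of the code can be expressed as outstanding events and scheduled tasks; how that state would be persisted for resilience is outside what the sketch shows.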
Original language | English |
---|---|
Number of pages | 1 |
Publication status | Published - 26 Jun 2018 |
Event | International Supercomputing Conference, Frankfurt, Germany. Duration: 25 Jun 2018 → 28 Jun 2018. https://www.isc-hpc.com/ |
Conference
Conference | International Supercomputing Conference |
---|---|
Abbreviated title | ISC 2018 |
Country/Territory | Germany |
City | Frankfurt |
Period | 25/06/18 → 28/06/18 |
Internet address | https://www.isc-hpc.com/ |
Keywords / Materials (for Non-textual outputs)
- Communication Optimization
- Programming Models & Languages
- Resiliency
- Scientific Software Development