Abstract / Description of output
System performance variability is a significant challenge to the scalability of tightly-coupled iterative applications. Asynchronous variants perform better, but an imbalance in progress can result in slower convergence or even failure to converge, as old data is used for updates. In shared memory, this can be countered using progressive load balancing (PLB). We present a distributed memory extension to PLB (DPLB) that runs PLB on each node and adds a balancing layer between nodes. We demonstrate that this method mitigates system performance variation, reducing global progress imbalance by 1.08x–4.05x and time-to-solution variability by 1.11x–2.89x. In addition, the method scales to 100 nodes without significant overhead.
Original language | English |
---|---|
Pages (from-to) | 127-136 |
Number of pages | 10 |
Journal | Advances in Parallel Computing |
Volume | 36 |
DOIs | |
Publication status | Published - 1 Apr 2020 |
Event | 2019 International Conference on Parallel Computing, Prague, Czech Republic. Duration: 10 Sept 2019 → 13 Sept 2019. https://www.parco.org/ |
Title | Progressive Load Balancing in Distributed Memory |