In this paper, we highlight our progress in implementing a hybrid OpenMP-MPI version of the unstructured finite element application Fluidity-ICOM. We demonstrate that utilising non-blocking algorithms and libraries is critical for a mixed-mode application to achieve better parallel performance than the pure MPI version. In the matrix assembly kernels, the OpenMP parallel algorithm uses graph colouring to identify independent sets of elements that can be assembled simultaneously with no race conditions. The TCMalloc library is used to tackle performance issues arising from the memory allocation of automatic arrays. The sparse linear systems defined by the various equations are solved using threaded PETSc, and HYPRE is used as a threaded preconditioner through the PETSc interface. Since unstructured finite element codes are well known to be memory bound, particular attention has to be paid to ccNUMA architectures, where data locality is particularly important for achieving good intra-node scaling characteristics. With mixed-mode MPI/OpenMP, Fluidity-ICOM can now run jobs on well over 32K cores, which gives Fluidity-ICOM the capability to solve "grand-challenge" problems.
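The graph-colouring idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes a simple greedy colouring in which two elements conflict when they share a mesh node; the function names `colour_elements` and `independent_sets` and the four-triangle mesh are hypothetical, and this is not the Fluidity-ICOM implementation, which works on the real mesh connectivity inside OpenMP-threaded assembly kernels.

```python
from collections import defaultdict


def colour_elements(elements):
    """Greedy colouring: elements that share a node get different colours.

    `elements` is a list of node-index tuples (a hypothetical toy mesh).
    """
    # Map each node to the elements that touch it.
    node_to_elems = defaultdict(list)
    for e, nodes in enumerate(elements):
        for n in nodes:
            node_to_elems[n].append(e)

    colours = [-1] * len(elements)  # -1 means "not yet coloured"
    for e, nodes in enumerate(elements):
        # Colours already taken by neighbouring (node-sharing) elements.
        used = {
            colours[other]
            for n in nodes
            for other in node_to_elems[n]
            if colours[other] >= 0
        }
        c = 0
        while c in used:
            c += 1
        colours[e] = c
    return colours


def independent_sets(colours):
    """Group element indices by colour, ordered by colour number."""
    groups = defaultdict(list)
    for e, c in enumerate(colours):
        groups[c].append(e)
    return [groups[c] for c in sorted(groups)]


# Toy mesh: a strip of four triangles over nodes 0..5 (assumed example data).
elements = [(0, 1, 2), (1, 2, 3), (2, 3, 4), (3, 4, 5)]
colours = colour_elements(elements)
groups = independent_sets(colours)
```

Within one group no two elements touch the same node, so their contributions to the global matrix go to disjoint rows and could be assembled by different threads without locks or atomics; the groups themselves would be processed one after another.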
Related research output for 'Exploring the thread-level parallelism for the next generation geophysical fluid modelling framework Fluidity-ICOM':
Guo, X., Lange, M., Gorman, G., Mitchell, L. & Weiland, M., 30 Mar 2015, In: Computers and Fluids. 110, p. 227–234 8 p.
Research output: Contribution to journal › Article › peer-review