In this paper, we report progress in implementing a hybrid OpenMP–MPI version of the unstructured finite element code Fluidity. For matrix assembly kernels, the OpenMP parallel algorithm uses graph colouring to identify independent sets of elements that can be assembled concurrently with no race conditions. In this phase there are no MPI overheads, as each MPI process only assembles its own local part of the global matrix. We use an OpenMP threaded fork of PETSc to solve the resulting sparse linear systems of equations. We experiment with a range of preconditioners, including HYPRE, which provides the algebraic multigrid preconditioner BoomerAMG, where the smoother is also threaded. Since unstructured finite element codes are well known to be memory latency bound, particular attention is paid to ccNUMA architectures, where data locality is essential to achieving good intra-node scaling characteristics. We also demonstrate that utilising non-blocking algorithms and libraries is critical for the mixed-mode application to achieve better parallel performance than the pure MPI version.
- Matrix assembly
- Sparse linear solver
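The colouring step described in the abstract can be illustrated with a minimal sketch. It is not Fluidity's implementation: it assumes that two elements conflict whenever they share a mesh node (the actual dependency depends on the discretisation), it uses a simple greedy colouring, and the names `colour_elements`, `independent_sets` and the four-triangle mesh are hypothetical. Elements that receive the same colour touch disjoint sets of nodes, so their contributions could be added to the matrix by different threads without a race.

```python
from collections import defaultdict


def colour_elements(elements):
    """Greedy colouring: elements that share a node get different colours.

    `elements` is a list of node-index tuples, one per element.
    Assumption: sharing a node is what creates a write conflict.
    """
    # Map each node to the elements that touch it.
    node_to_elements = defaultdict(list)
    for e, nodes in enumerate(elements):
        for n in nodes:
            node_to_elements[n].append(e)

    colours = [-1] * len(elements)  # -1 means "not yet coloured"
    for e, nodes in enumerate(elements):
        # Colours already taken by neighbouring (node-sharing) elements.
        used = {
            colours[nb]
            for n in nodes
            for nb in node_to_elements[n]
            if colours[nb] >= 0
        }
        c = 0
        while c in used:
            c += 1
        colours[e] = c
    return colours


def independent_sets(colours):
    """Group element indices by colour, in ascending colour order."""
    groups = defaultdict(list)
    for e, c in enumerate(colours):
        groups[c].append(e)
    return [groups[c] for c in sorted(groups)]


# Hypothetical mesh: a strip of four triangles.
elements = [(0, 1, 2), (1, 2, 3), (3, 4, 5), (4, 5, 6)]
colours = colour_elements(elements)
groups = independent_sets(colours)

# Assembly would then loop over colours serially and over the elements
# of each colour in parallel (for example with an OpenMP parallel loop).
for group in groups:
    print(group)
```

In a threaded assembly loop, the outer loop over colours stays sequential while the inner loop over the elements of one colour is shared among threads, which is the property the colouring is meant to guarantee.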
Fingerprint
Dive into the research topics of 'Developing a scalable hybrid MPI/OpenMP unstructured finite element model'. Together they form a unique fingerprint.
Developing the multi-level parallelisms for Fluidity-ICOM -- Paving the way to exascale for the next generation geophysical fluid modelling technology
Guo, X., Gorman, G., Lange, M., Mitchell, L. & Weiland, M., 2015, (Accepted/In press).
Research output: Contribution to conference › Paper › peer-review
Exploring the thread-level parallelism for the next generation geophysical fluid modelling framework Fluidity-ICOM
Guo, X., Gorman, G., Lange, M., Mitchell, L. & Weiland, M., 2013, In: Procedia Engineering. 61, p. 251-257
Research output: Contribution to journal › Article › peer-review (Open Access)