Investigation in to the optimal 3D FFT strategy for Quantum Mechanical Codes

Adrian Jackson, Antonia Collis

Research output: Contribution to conference › Abstract › peer-review

Abstract / Description of output

We will present an analysis of 3D FFT methods scaling from matrices of 128³ to 1024³. We will show how performance is limited by memory, so that larger systems can only be run on multiple nodes, necessitating the use of the network interconnect. We will also discuss the implications of using small FFTs limited to single nodes, and the associated profitability of performing the same calculation multiple times to avoid network communication. This will be of particular importance to GPU implementations of software packages, in order to reduce the requirement of moving data on and off GPU accelerators.
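As a rough illustration of the memory limit over the grid sizes mentioned above, the following sketch estimates the storage for a single cubic grid. It assumes double-precision complex values (16 bytes per element); the abstract does not state the precision used, and the function name `fft_grid_bytes` is purely illustrative.

```python
def fft_grid_bytes(n, bytes_per_element=16):
    """Bytes needed to hold one n x n x n grid.

    Assumption: 16 bytes per element (double-precision complex);
    this is not stated in the abstract.
    """
    return n ** 3 * bytes_per_element


# Print the footprint of one grid for each size in the range discussed.
for n in (128, 256, 512, 1024):
    mib = fft_grid_bytes(n) / 2 ** 20
    print(f"{n}^3 grid: {mib:,.0f} MiB")
```

Under that assumption a single 1024³ grid occupies 16 GiB before any work or communication buffers are counted, which gives a sense of why the largest cases may not fit on one node.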

We will discuss our investigation into the optimal construction of 3D FFTs, using either a single 3D FFT library call or multiple 1D FFTs. We will also consider the GPU equivalent, using cuFFT library calls. Critically, the performance of distributed 3D FFTs relies heavily on the performance of all-to-all communications on a particular architecture.
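The two constructions can be compared in a minimal single-process sketch. It uses NumPy rather than the libraries studied in the work, and the helper name `fft3d_from_1d` and the 16³ test grid are illustrative assumptions; it shows only that the two routes give the same transform, not how either performs.

```python
import numpy as np


def fft3d_from_1d(a):
    """3D FFT built from batches of 1D FFTs, one axis at a time."""
    out = np.fft.fft(a, axis=0)    # 1D transforms along the first axis
    out = np.fft.fft(out, axis=1)  # then along the second axis
    out = np.fft.fft(out, axis=2)  # then along the third axis
    return out


# Small random complex grid (illustrative size only).
rng = np.random.default_rng(0)
grid = rng.standard_normal((16, 16, 16)) + 1j * rng.standard_normal((16, 16, 16))

library_result = np.fft.fftn(grid)       # single 3D library call
composed_result = fft3d_from_1d(grid)    # three passes of 1D FFTs

# Largest element-wise difference; expected to be at rounding-error level.
max_diff = np.max(np.abs(library_result - composed_result))
print(max_diff)
```

In a distributed setting each process typically holds only part of the grid, so at least one of the three passes requires the data to be redistributed first. That redistribution is where all-to-all communication enters.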

Finally, we will discuss a case study of FFT performance in CASTEP, both in its current form and when using data replication to reduce the communication of FFT routines across nodes. Performance can be boosted by distributing small FFT calculations first across processes (16 cores on HECToR), then across nodes (two processes, with a total of 32 cores) and finally across blades (4 nodes, with 128 cores). We discuss the benefits, for various simulations in CASTEP, of using fewer processes and minimising the use of the HECToR interconnect.
Original language: English
Publication status: Published - 9 Apr 2013
Event: Exascale Applications and Software Conference - Edinburgh, United Kingdom
Duration: 9 Apr 2013 → 11 Apr 2013


Conference: Exascale Applications and Software Conference
Country/Territory: United Kingdom

