Walk Parallel Algorithm for the Barnes-Hut Treecode on GPUs – Towards Cost Effective, High Performance N-Body Simulation

Tsuyoshi Hamada, Khaled Benkrid

Research output: Contribution to journalArticlepeer-review

Abstract / Description of output

Recently, general-purpose computation on graphics processing units (GPGPU) has become an increasingly
popular field of study as graphics processing units (GPUs)
continue to be proposed as high performance and relatively
low cost implementation platforms for scientific computing
applications. Among these applications figure astrophysical
N-body simulations, which form one of the most challenging problems in computational science. However, in most
reported studies, a simple O(N2) algorithm was used for
GPGPUs, and the resulting performances were not observed to be better than those of conventional CPUs that were
based on more optimized O(N log N) algorithms such as
the tree algorithm or the particle-particle particle-mesh algorithm. Because of the difficulty in getting efficient implementations of such algorithms on GPUs, a GPU cluster had no practical advantage over general-purpose PC
clusters for N-body simulations. In this paper, we report
a new method for efficient parallel implementation of the
tree algorithm on GPUs. Our novel tree code allows the
realization of an N-body simulation on a GPU cluster at
a much higher performance than that on general PC clusters.
We practically performed a cosmological simulation with
562 million particles on a GPU cluster using 128 NVIDIA
GeForce 8800GTS GPUs at an overall cost of 168 172 $.
We obtained a sustained performance of 20.1 Tflops, which
when normalized against a general-purpose CPU implementation leads to a performance of 8.50 Tflops. The achieved
cost/performance was hence a mere $19.8 /Gflops which
shows the high competitiveness of GPGPUs.
Original languageEnglish
Article numberVolume 24, Numbers 1-2
Pages (from-to)21
Number of pages31
JournalComputer Science - Research and Development
Volume24
Issue number12
Early online date20 May 2009
Publication statusPublished - 20 May 2009

Keywords / Materials (for Non-textual outputs)

  • GPU · N-body simulation · Tree algorithm

Fingerprint

Dive into the research topics of 'Walk Parallel Algorithm for the Barnes-Hut Treecode on GPUs – Towards Cost Effective, High Performance N-Body Simulation'. Together they form a unique fingerprint.

Cite this