Gromacs has been
the fastest molecular dynamics code in serial runs and in parallel runs
on some tens of processors, thanks to highly optimised code and in
particular to inner force loops that are coded in assembly and use
SSE instructions. It has now been shown that on a modern
supercomputer equipped with a very fast interconnect (the Cray
Seastar2) Gromacs also scales to hundreds of processors. During the
course Gromacs achieved a sustained performance of 1.1 Tflops on
384 cores, which amounts to a simulation throughput of 48 ns/day.
The benchmark system was a box of 108000 SPC water molecules,
and the long-range interactions were handled using reaction-field
electrostatics with a cut-off of 1.2 nm.
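
To put these figures in per-core terms, the arithmetic can be sketched
as below. This is only an illustration: the input numbers are the ones
quoted above, the function and variable names are hypothetical, and the
atom count assumes three interaction sites per SPC water molecule.

```python
# Figures quoted in the text for the SPC water benchmark.
WATER_MOLECULES = 108_000
SITES_PER_SPC_WATER = 3   # assumption: SPC is a three-site water model
CORES = 384
SUSTAINED_TFLOPS = 1.1
NS_PER_DAY = 48.0


def water_benchmark_summary():
    """Derive per-core quantities from the quoted benchmark figures."""
    atoms = WATER_MOLECULES * SITES_PER_SPC_WATER
    return {
        "atoms": atoms,
        "atoms_per_core": atoms / CORES,
        "gflops_per_core": SUSTAINED_TFLOPS * 1000.0 / CORES,
        "ns_per_day": NS_PER_DAY,
    }


if __name__ == "__main__":
    for name, value in water_benchmark_summary().items():
        print(f"{name}: {value:.2f}")
```

With these inputs each core handles fewer than a thousand atoms, which
is why a fast interconnect matters at this scale.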
In some cases using cut-offs for the electrostatics is an unsuitable
approximation. However, the Particle Mesh Ewald (PME) scheme, which
accounts for the electrostatics accurately, now also scales to
hundreds of processors on Louhi. This was demonstrated with a lipid
bilayer system of 4096 lipids, which together with the water molecules
totals 487424 atoms (the benchmark DPPC system times 4).
Electrostatics were treated with PME using a cut-off of 1.8 nm, and
van der Waals (vdW) interactions with a cut-off of 1.0 nm. Simulating
this system on 1056 cores extracted 1.15 Tflops from Louhi, providing
23 ns/day of simulation.
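
The lipid benchmark figures can be broken down in the same spirit. The
sketch below only rearranges the numbers quoted above; the function and
variable names are hypothetical, and the "base system" values simply
divide the totals by the stated replication factor of 4.

```python
# Figures quoted in the text for the PME lipid bilayer benchmark.
LIPIDS = 4096
TOTAL_ATOMS = 487_424     # lipids plus water
REPLICATION = 4           # "the benchmark DPPC system times 4"
CORES = 1056
SUSTAINED_TFLOPS = 1.15
NS_PER_DAY = 23.0


def lipid_benchmark_summary():
    """Derive system-size and per-core quantities from the quoted figures."""
    return {
        "base_system_lipids": LIPIDS // REPLICATION,
        "base_system_atoms": TOTAL_ATOMS // REPLICATION,
        "atoms_per_lipid_incl_water": TOTAL_ATOMS / LIPIDS,
        "atoms_per_core": TOTAL_ATOMS / CORES,
        "gflops_per_core": SUSTAINED_TFLOPS * 1000.0 / CORES,
        "ns_per_day": NS_PER_DAY,
    }


if __name__ == "__main__":
    for name, value in lipid_benchmark_summary().items():
        print(f"{name}: {value:.2f}")
```

The per-core flop rate here comes out at roughly 1.1 Gflops.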
For more information:
- GROMACS