Date: Tue Feb 12 2019 - 09:47:57 CST

Multi-GPU simulations are still bottlenecked by the CPU integrator.

I have a hunch that faster DDR4 RAM would give a better price/performance return than additional GPUs (beyond 2), but I have yet to find relevant benchmarks.

If there is enough interest I may just do the benchmarking myself and publish the results online.

Hi Stefano,

I did a quick test on some nodes I have access to (36 CPU cores (2x Xeon Gold 6154) and 2 Tesla V100s), and here are some results I can share.

Small, 36,000-atom system:
1 GPU: 115 ns/day
2 GPUs: 130 ns/day

STMV benchmark (1M atoms):
1 GPU: 3.4 ns/day
2 GPUs: 4.2 ns/day
1 GPU, only half the CPU cores: 3.0 ns/day
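
To put the scaling in perspective, the speedup and per-GPU efficiency implied by these numbers can be worked out with a short script. This is only a rough sketch: the helper names (speedup, efficiency) are made up for illustration, and the ns/day values are just the ones listed above, not new measurements.

```python
# Throughput in ns/day, copied from the results listed above.
# Inner keys are the number of GPUs used.
results = {
    "36k atoms": {1: 115.0, 2: 130.0},
    "STMV (1M atoms)": {1: 3.4, 2: 4.2},
}

# STMV on 1 GPU with only half the CPU cores (also from the list above).
stmv_half_cpu = 3.0


def speedup(ns_day_1gpu, ns_day_ngpu):
    """Throughput ratio relative to the single-GPU run."""
    return ns_day_ngpu / ns_day_1gpu


def efficiency(ns_day_1gpu, ns_day_ngpu, n_gpus):
    """Speedup divided by GPU count (1.0 would be perfect scaling)."""
    return speedup(ns_day_1gpu, ns_day_ngpu) / n_gpus


for name, r in results.items():
    s = speedup(r[1], r[2])
    e = efficiency(r[1], r[2], 2)
    print(f"{name}: 2-GPU speedup {s:.2f}x, efficiency {e:.0%}")

# Fraction of single-GPU STMV throughput lost when half the cores are removed.
cpu_loss = 1.0 - stmv_half_cpu / results["STMV (1M atoms)"][1]
print(f"STMV, 1 GPU, half the cores: {cpu_loss:.0%} slower")
```

In other words, the second GPU buys roughly 1.13x on the small system and 1.24x on STMV, while halving the CPU cores costs about 12% on STMV with one GPU.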

Someone who has done more thorough benchmarking might know more, but I see two possible reasons for the poor scaling:
1) Contention for PCIe bus resources between the GPUs. There is only so much communication bandwidth available, and architecturally I think there is still work done on the CPU each step that requires the coordinates to be shuttled across the bus.
2) Not enough CPU cores.

I *think* it is more option 1 than option 2, based on how little STMV slows down when you starve it of CPU cores. Once PCIe 4.0 and its associated parts come out, this might get better. If I had to build myself a new machine today, it would probably have only one RTX 2070 or 2080, not very much RAM (16GB is more than enough for a NAMD compute node that isn't doing anything crazy), and a multicore processor with a reasonably high clock speed. The exact balance depends on your budget.


On 2019-01-31 10:21:05-07:00 wrote:

Dear all,
I am trying to set up a new workstation, and I would like to know whether two GPUs (GTX 1080 Ti or RTX 2080) give a significant performance improvement over just one and, if so, what the CPU/RAM requirements would be.
Thanks in advance for any advice and suggestions.

