[3.0alpha5] Low global CUDA exclusion count. System is stable without cudaSOAIntegrate or without rigidBonds.

From: Michael Von Domaros (mvondoma_at_uci.edu)
Date: Thu Jul 30 2020 - 18:43:15 CDT

Hi everyone,

I'm simulating a single TIP3P water molecule in some lipids (PME, NPT,
ambient conditions). The system has been running stably on CPUs for
over 20 ns with NAMD 2.14b2. It also runs with NAMD 3.0alpha5 on a GPU
if I use the old offloading scheme. Both setups continue to run
without problems.

Once I turn on cudaSOAIntegrate, set margin to 4, and remove the pair
list updating settings, as recommended, I immediately get:
FATAL ERROR: Low global CUDA exclusion count! (136304 vs 136307)
System unstable or pairlistdist or cutoff too small.

If I turn off rigidBonds, everything is fine again.
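
For orientation, here is a minimal sketch of the kind of change described above. The keyword names are standard NAMD options, but the rigidBonds value and the choice of which pair list keywords were removed are assumptions made for illustration; the actual settings are in the uploaded config files.

```tcl
# Sketch only: illustrative fragment, not the actual config.
CUDASOAintegrate   on      ;# GPU-resident integration path in NAMD 3.0 alpha
margin             4       ;# value set as recommended for this mode
rigidBonds         all     ;# assumed value; the error disappears with rigidBonds off

# Assumption: "pair list updating settings" refers to keywords such as
# stepspercycle and pairlistsPerCycle, which are left out here so that
# NAMD falls back to its own defaults.
```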

Exactly the same system, but with small organic molecules instead of
water, runs stably with 3.0alpha5 and with cudaSOAIntegrate +
rigidBonds on.

So I thought something might be broken with water, but I can simulate
both the apoa1 benchmark and an isolated water molecule (with NAMD
3.0alpha5, cudaSOAIntegrate + rigidBonds on).

I tried increasing pairlistdist to 16, reducing the timestep to 0.1 fs
(and at the same time using equal timesteps for all forces), and
increasing margin to 8, all without success. Visual inspection of the
starting structure did not reveal anything odd to me.
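
Roughly, the variations listed above correspond to overrides like the following. This is a sketch under assumptions: it takes "equal timesteps for all forces" to mean setting both nonbondedFreq and fullElectFrequency to 1, and it shows all changes together even though they may have been tested separately.

```tcl
# Sketch only: troubleshooting overrides, none of which removed the error.
pairlistdist        16     ;# larger pair list distance (in Angstrom)
timestep            0.1    ;# much smaller integration step (in fs)
nonbondedFreq       1      ;# assumed: evaluate nonbonded forces every step
fullElectFrequency  1      ;# assumed: evaluate full electrostatics (PME) every step
margin              8      ;# larger patch margin than the recommended 4
```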

Does anyone have suggestions on what else I could try?

Thanks a bunch for your help,
Michael

I uploaded the config and output files to:
https://gist.github.com/mvondomaros/0430f2ed7a8409522e3f1f49e897766e
