RE: Decreasing performance of cluster running FEP

From: Vermaas, Joshua (Joshua.Vermaas_at_nrel.gov)
Date: Wed Jul 11 2018 - 12:21:07 CDT

What is the hardware on your cluster? FEP is not GPU-accelerated, and neither are colvars, which I think is where the problem may actually be. How many atoms are in your colvar definitions?

-Josh

On 2018-07-10 23:36:51-06:00 owner-namd-l_at_ks.uiuc.edu wrote:

Hello:
I am observing a marked decrease in the performance of a NextScale cluster running FEP on a protein-ligand system previously equilibrated for over 100 ns. There are no such problems when running MD equilibration on the same system. The code is NAMD 2.12, compiled in house with Intel 2016 (the NAMD 2.12 build compiled with a more recent Intel compiler, available as a module on the cluster, proved unable to run FEP).

The system consists of ca. 460 residues in water. FEP covers lambda 0.2-1.0 in steps of 0.025 (32 windows), with preeq 175,000 / numSteps 750,000 and ts = 1.0 fs.
FEP on 4 nodes/144 cores (optimal for scaling) starts at 0.0078/step until window 3, then runs at 0.014/step until window 5, and then at 0.021/step until the present window 9. The ligand, under modest r/angle/dih colvars, remains in place with no detectable rotation or distortion. The slowdown makes carrying out a FEP extremely expensive, even when divided into two sectors as it is now. The same problem occurs for FEP 0.0-0.2.
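
To put the figures above in perspective, here is a minimal Python sketch of the arithmetic. It rests on two assumptions that the message does not state: that the quoted timings are wall-clock seconds per step, and that numSteps is the total number of steps per window (pre-equilibration included). The function names are illustrative only and are not part of NAMD.

```python
# Back-of-the-envelope cost of the FEP run described above.
# ASSUMPTION: timings are wall-clock seconds per step.
# ASSUMPTION: numSteps (750,000) is the total step count per window.

LAMBDA_START, LAMBDA_END, D_LAMBDA = 0.2, 1.0, 0.025
STEPS_PER_WINDOW = 750_000

# Timings as quoted; the exact window at which each change occurs is approximate.
TIMINGS = [
    ("up to window 3", 0.0078),
    ("up to window 5", 0.014),
    ("up to window 9", 0.021),
]


def n_windows(start, end, step):
    """Number of lambda windows between start and end at a fixed spacing."""
    return round((end - start) / step)


def hours_per_window(sec_per_step, steps=STEPS_PER_WINDOW):
    """Wall-clock hours needed for one window at a given per-step cost."""
    return sec_per_step * steps / 3600.0


def slowdown(sec_per_step, baseline):
    """Per-step cost relative to the baseline timing."""
    return sec_per_step / baseline


if __name__ == "__main__":
    print("windows:", n_windows(LAMBDA_START, LAMBDA_END, D_LAMBDA))
    baseline = TIMINGS[0][1]
    for label, t in TIMINGS:
        print(f"{label}: {hours_per_window(t):.2f} h/window, "
              f"{slowdown(t, baseline):.2f}x baseline")
```

Under those assumptions, the per-window cost grows from about 1.6 hours to about 4.4 hours, roughly a 2.7-fold slowdown by window 9.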

I observed the same problem when running FEP on the ligand alone in water on one node/36 cores.

In all cases, having the code write to disk less frequently did not help.

This archive was generated by hypermail 2.1.6 : Mon Dec 31 2018 - 23:21:16 CST