Re: Please ignore previous poorly formatted: difficulty converting namd2 channel system to namd3, atom velocity too fast, box too small errors.

From: Vermaas, Josh (vermaasj_at_msu.edu)
Date: Wed Jul 21 2021 - 09:56:47 CDT

Hi Ryan,

In my experience, NAMD3.0a9 can restart from 2.14 inputs, and I’ve done it for membrane systems. However, I’ll admit that I don’t use CHARMM-GUI’s recommended scheme for equilibrating the membrane, which I think is too conservative. There isn’t anything obviously wrong with your input deck. One thing I’d check is the intermediate structures, particularly near the indices that NAMD is barfing on. One thing I *have* noticed is that minimization with 3.0a9 is broken, with or without CUDASOAIntegrate. In some cases, it is possible to end up with very unphysical structures, and the structures rattle themselves apart once dynamics starts. Otherwise, you may have an inadvertent ring piercing, which will also cause dynamics to blow up.
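Concretely, a quick VMD check along these lines is what I have in mind (the file names are placeholders for whatever your intermediate outputs are called, and 12345 stands in for whichever atom index shows up in the error output):

    # load the topology plus a binary restart written shortly before the crash
    mol new system.psf
    mol addfile step7.1.restart.coor type namdbin
    # list everything within 5 A of the suspect atom; a lipid tail threaded
    # through an aromatic ring shows up as improbably close neighbors
    set sel [atomselect top "within 5 of index 12345"]
    puts [$sel get {segname resid resname name}]

Rendering that same selection in the VMD GUI usually makes a ring piercing obvious by eye.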

-Josh

From: <owner-namd-l_at_ks.uiuc.edu> on behalf of Ryan Woltz <rlwoltz_at_ucdavis.edu>
Reply-To: "namd-l_at_ks.uiuc.edu" <namd-l_at_ks.uiuc.edu>, Ryan Woltz <rlwoltz_at_ucdavis.edu>
Date: Tuesday, July 20, 2021 at 7:18 PM
To: "namd-l_at_ks.uiuc.edu" <namd-l_at_ks.uiuc.edu>
Subject: namd-l: Please ignore previous poorly formatted: difficulty converting namd2 channel system to namd3, atom velocity too fast, box too small errors.

Dear community,

      I have a 300k-atom membrane-embedded channel system that is stable under NAMD 2.14, but I wanted to upgrade to take advantage of the increased speed, so I downloaded NAMD 3.0 alpha 9. I'm also using the V100 GPUs on EXPANSE, if that matters. I've hit several errors; I fixed a few of them, but I'm not sure how those fixes affect my system (I don't expect they did, but I'll note them as I go, just in case). Almost all of the errors relate to CUDASOAintegrate.

        My system is set up as follows: minimize; equilibrate (steps 6.1-6.6); slowly release the restraints on the protein to prevent large RMSD jumps (steps 7.1-7.13); production (step 7.14).

A. My first attempt was to use NAMD3 to continue a NAMD 2.14 run that had been in production for 30 ns. This failed immediately with the error:
OPENING EXTENDED SYSTEM TRAJECTORY FILE
FATAL ERROR: CUDA cuRAND error curandGenerateNormal(gen, gaussrand_x, n, 0, 1) in file src/SequencerCUDAKernel.cu, function langevinVelocitiesBBK2, line 4263
 on Pe 0 (exp-12-57 device 0 pci 0:af:0): status value 202
From what I gathered from previous posts, you cannot continue a NAMD2 simulation with NAMD3; more specifically, one post said I cannot continue a simulation that was not previously run with CUDASOAintegrate turned on.
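For completeness, attempt A was essentially my step 7.14 production input pointed at the NAMD 2.14 restart files, with the new integrator switched on (file names simplified here; the full inputs are in the attached folder):

    structure        system.psf
    coordinates      system.pdb
    # restart files written by the 30 ns NAMD 2.14 production run
    binCoordinates   production.restart.coor
    binVelocities    production.restart.vel
    extendedSystem   production.restart.xsc
    CUDASOAintegrate on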

B. Working with NAMD3 from the beginning:

  1. I then tried starting from scratch, separating the minimization step from the equilibration steps (6.1-6.6), but CUDASOAintegrate turned on is not compatible with reassignTemp or reassignFreq.
  2. I then turned CUDASOAintegrate off for steps 6.1-6.6 and on for steps 7.1 and beyond, since the steps with restraints on the protein CA atoms, which are slowly released over steps 7.1-7.13, do not require temperature reassignment. I used suggested options (sketched below): 1) margin 8, 2) outputEnergies/outputTiming 400, 3) pairlistsPerCycle 4, 4) stepsPerCycle 40. The simulation fails quickly with atoms moving too fast.
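The relevant lines of those step 7 inputs looked roughly like this (everything else was left as CHARMM-GUI generated it):

    CUDASOAintegrate  on     # GPU-resident integration (NAMD3 alpha)
    margin            8
    outputEnergies    400
    outputTiming      400
    pairlistsPerCycle 4
    stepsPerCycle     40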

C. I then readjusted outputEnergies/outputTiming to 5000 (the CHARMM-GUI default). However, step 7.1 fails after 105000 steps with the fatal error: Periodic cell has become too small for original patch grid! Possible solutions are to restart from a recent checkpoint, increase margin, or disable useFlexibleCell for liquid simulation.

D. I played with margin values from 0 to 20, and once I no longer got the atoms-moving-too-fast error, I instead got an error about allocated memory being exceeded (too many atoms in a patch).

E. Finally, I also took out pairlistsPerCycle and stepsPerCycle, since the NAMD3 website notes that these are obsolete. Now I get:
ERROR: Atoms moving too fast at timestep 135902; simulation has become unstable (0 atoms on pe 0).
FATAL ERROR: SequencerCUDA: Atoms moving too fast

I was able to collect bits and pieces from the forums, but most of what I found concerns NAMD2, or NAMD3 errors that were similar but not identical. Based on error E, I don't think I'm running out of RAM, since I have 93 GB allocated and the same system runs fine under NAMD2.

I've gotten most of my information from NVIDIA's website and the NAMD3 website, and adjusted my .inp files based on those pages:

https://www.ks.uiuc.edu/Research/namd/alpha/3.0alpha/

https://developer.nvidia.com/blog/delivering-up-to-9x-throughput-with-namd-v3-and-a100-gpu/

I have a suspicion that the reason things fail after equilibration is that I'm turning on CUDASOAintegrate after dynamics has already started. However, I don't know how to equilibrate with CUDASOAintegrate on while still using reassignTemp/reassignFreq. I've worked for a year to get this system stable, so I don't want to play around too much with options I'm unfamiliar with. Again, the system is stable with NAMD2, and most of the errors I get are failures tied to the CUDASOAintegrate option. If any of these steps or errors could be fixed, even if only by a way to run steps 6.1-7.13 without CUDASOAintegrate and then turn it on for production, I'd be happy. I'm also wondering whether CUDASOAintegrate dislikes restraints on the protein, as I've been told by others who used very early versions.
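In other words, what I'm hoping is workable is something like this for step 7.14, after running steps 6.1-7.13 under NAMD3 with CUDASOAintegrate off (again, file names simplified):

    structure        system.psf
    coordinates      system.pdb
    # restart files from step 7.13, which ran with CUDASOAintegrate off
    binCoordinates   step7.13.coor
    binVelocities    step7.13.vel
    extendedSystem   step7.13.xsc
    # GPU-resident integration for production only
    CUDASOAintegrate on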

Any suggestions on how to fix any of these errors? Do I just need to keep playing with the margin, outputEnergies, outputTiming, pairlistsPerCycle, and stepsPerCycle parameters until it works?

Attachments

https://drive.google.com/drive/folders/1NgNWLrDFQLdcB77I0U9U_sGD2KVEoJ_X?usp=sharing

Thank you,

Ryan
