From: John Stone (johns_at_ks.uiuc.edu)
Date: Fri May 06 2022 - 00:36:13 CDT

Chris,
  I've just re-tested the latest VMD build on a Cray XC50 using Slurm,
and here's what a launch looks like on that system with MPI (Cray mpich):

The first launch enables off-screen OpenGL rendering via an EGL context:

stonej_at_daint103:~/hivmovie/vrmovie2016> srun -A g33 -C gpu -p debug -n 2 --ntasks-per-node=1 /users/stonej/local/bin/vmd194a58 -dispdev openglpbuffer -e `pwd`/rendermovie.tcl
srun: job 38239616 queued and waiting for resources
srun: job 38239616 has been allocated resources
Info) VMD for CRAY_XC, version 1.9.4a58 (May 5, 2022)
Info) http://www.ks.uiuc.edu/Research/vmd/
Info) Email questions and bug reports to vmd_at_ks.uiuc.edu
Info) Please include this reference in published work using VMD:
Info) Humphrey, W., Dalke, A. and Schulten, K., `VMD - Visual
Info) Molecular Dynamics', J. Molec. Graphics 1996, 14.1, 33-38.
Info) -------------------------------------------------------------
Info) Creating CUDA device pool and initializing hardware...
Info) Initializing parallel VMD instances via MPI...
Info) Found 2 VMD MPI nodes containing a total of 48 CPUs and 2 GPUs:
Info) 0: 24 CPUs, 61.0GB (97%) free mem, 1 GPUs, Name: nid02358
Info) 1: 24 CPUs, 61.0GB (97%) free mem, 1 GPUs, Name: nid02359
Info) EGL: node[0] bound to display[0], 2 displays total
Info) EGL version 1.5
Info) OpenGL Pbuffer size: 4096x2400
Info) EGL: node[1] bound to display[1], 2 displays total
Info) OpenGL renderer: Tesla P100-PCIE-16GB/PCIe/SSE2
Info) Features: STENCIL MSAA(4) MDE CVA MTX NPOT PP PS GLSL(OVFGS)
Info) Full GLSL rendering mode is available.
Info) Textures: 2-D (32768x32768), 3-D (16384x16384x16384), Multitexture (4)
Info) Created EGL OpenGL Pbuffer for off-screen rendering
Info) Using plugin js for structure file cone-protein.js
Info) Using plugin js for coordinates from file cone-protein.js
Info) Finished with coordinate file cone-protein.js.

[...]
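For non-interactive use, the same launch can be wrapped in a minimal Slurm batch script. This is just a sketch reusing the account, partition, and paths from the run above, all of which will differ on other systems:

```shell
#!/bin/bash
#SBATCH -A g33                    # account, as in the interactive run above
#SBATCH -C gpu -p debug           # GPU nodes, debug partition
#SBATCH -n 2 --ntasks-per-node=1  # one VMD rank per node

# srun starts one VMD process per node; VMD's MPI initialization then
# discovers the other ranks and the per-node CPU/GPU resources itself.
srun /users/stonej/local/bin/vmd194a58 \
     -dispdev openglpbuffer -e "$PWD/rendermovie.tcl"
```

The `-dispdev openglpbuffer` flag selects the EGL Pbuffer path shown above; substitute `-dispdev text` for the pure ray-tracing mode.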

And this is a pure text-mode run (no OpenGL/EGL, just the built-in ray tracing engines):

stonej_at_daint103:~/hivmovie/vrmovie2016> srun -C gpu -p debug -n 2 --ntasks-per-node=1 /users/stonej/local/bin/vmd194a58 -dispdev text -e `pwd`/rendermovie.tcl
srun: job 38239555 queued and waiting for resources
srun: job 38239555 has been allocated resources
Info) VMD for CRAY_XC, version 1.9.4a58 (May 5, 2022)
Info) http://www.ks.uiuc.edu/Research/vmd/
Info) Email questions and bug reports to vmd_at_ks.uiuc.edu
Info) Please include this reference in published work using VMD:
Info) Humphrey, W., Dalke, A. and Schulten, K., `VMD - Visual
Info) Molecular Dynamics', J. Molec. Graphics 1996, 14.1, 33-38.
Info) -------------------------------------------------------------
Info) Creating CUDA device pool and initializing hardware...
Info) Initializing parallel VMD instances via MPI...
Info) Found 2 VMD MPI nodes containing a total of 48 CPUs and 2 GPUs:
Info) 0: 24 CPUs, 61.0GB (97%) free mem, 1 GPUs, Name: nid02358
Info) 1: 24 CPUs, 61.0GB (97%) free mem, 1 GPUs, Name: nid02359
Info) Using plugin js for structure file cone-protein.js
Info) Using plugin js for coordinates from file cone-protein.js
Info) Finished with coordinate file cone-protein.js.

[...]
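Regarding the --export failure in your message below: I believe what happens is that when sbatch's --export is given an explicit variable list, Slurm propagates only those variables (plus its own SLURM_* ones) into the job, replacing the rest of the environment. PATH is then unset, so execve() cannot resolve a bare 'vmd' name, which matches your error output. Untested on your cluster, but prepending ALL should keep the full submitting environment while still exporting the flag:

```shell
# --export=VMDOSPRAYMPI alone strips the environment (PATH included),
# matching the "execve(): vmd: No such file or directory" errors.
# ALL,VAR keeps the submitting environment and adds the variable on top:
sbatch --export=ALL,VMDOSPRAYMPI tmp.sbatch
```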

Best,
  John Stone

On Wed, May 04, 2022 at 10:37:31PM -0700, Chris Taylor wrote:
> Also, as further proof that I don't know what I'm doing (hah!) I wonder why I get this. If I leave the --export option out it runs fine, but as three distinct VMDs on three nodes. With this option it fails in an intriguing but unclear fashion.
>
> $ sbatch --export=VMDOSPRAYMPI tmp.sbatch
>
> Result:
>
> $ cat slurm-107.out
> srun: error: node002: task 1: Exited with exit code 2
> srun: error: node001: task 0: Exited with exit code 2
> srun: error: node003: task 2: Exited with exit code 2
> slurmstepd: error: execve(): vmd: No such file or directory
> slurmstepd: error: execve(): vmd: No such file or directory
> slurmstepd: error: execve(): vmd: No such file or directory
>
> > On May 4, 2022 9:16 PM John Stone <johns_at_ks.uiuc.edu> wrote:
> >
> >
> > Hi Chris,
> > It looks to me like your batch system is launching VMD as a normal
> > executable rather than via the usual 'mpirun' or similar, which would
> > normally be required for the VMD MPI initialization to complete correctly.
> >

-- 
NIH Center for Macromolecular Modeling and Bioinformatics
Beckman Institute for Advanced Science and Technology
University of Illinois, 405 N. Mathews Ave, Urbana, IL 61801
http://www.ks.uiuc.edu/~johns/           Phone: 217-244-3349
http://www.ks.uiuc.edu/Research/vmd/