From: John Stone (johns_at_ks.uiuc.edu)
Date: Wed Apr 05 2023 - 23:38:27 CDT

Hi,
  I notice a couple of things indicating that your SLURM
scheduler isn't correctly launching the MPI-enabled VMD.
VMD is only seeing a single MPI rank, so when it goes through
MPI_Init() and passes in the process-wide argc/argv, it either isn't
getting the right environment from the job scheduler, or the MPI-specific
part of the process launch phase of the job scheduling didn't go right.

One thing you can do to check that the basics of your 'srun' command
are otherwise correct is to compile a very simple "hello world"
MPI program that just prints the number of MPI ranks and the
rank of the current process, then exits, and see whether it works
with the same command you're using to launch VMD.
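Such a test program might look something like the sketch below (the
file name mpi_hello.c is just my choice; build it with the same mpicc
from the MPICH install you compiled VMD against):

```c
/* mpi_hello.c -- minimal MPI launch check.
 * Build:  mpicc mpi_hello.c -o mpi_hello
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, nranks, namelen;
    char nodename[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's rank  */
    MPI_Comm_size(MPI_COMM_WORLD, &nranks); /* total rank count     */
    MPI_Get_processor_name(nodename, &namelen);

    printf("Rank %d of %d on node %s\n", rank, nranks, nodename);

    MPI_Finalize();
    return 0;
}
```

If you then run it with the same flags you used for VMD, e.g.
"srun -v -n2 --ntasks-per-node=1 -c48 ./mpi_hello", a correct launch
should print "Rank 0 of 2" and "Rank 1 of 2" from the two nodes.
If instead each copy reports "Rank 0 of 1", srun isn't wiring up the
MPI environment, which is exactly the symptom your VMD output shows.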

If the simple MPI test works, then my next suggestion is to look
carefully at your VMD startup script. It might be useful to see
how your VMD startup script has been modified vs. the stock version
to validate those details while you're looking at the items above.
Can you diff your startup script vs. the stock script and show us
the output?
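Something along these lines would show the differences (the paths
below are placeholders; substitute wherever your stock and patched
launcher scripts actually live):

```shell
# Compare the modified VMD launcher script against the stock one
# (paths are placeholders -- adjust to your actual install locations)
diff -u /usr/local/lib/vmd/vmd /path/to/vmd194a57patched7MPI
```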

Best,
  John Stone
  vmd_at_ks.uiuc.edu

On Thu, Mar 30, 2023 at 09:28:18AM +0200, René Hafner TUK wrote:
> Hi,
>
>
> I am trying to get the VMD MPI rendering working.
>
> I successfully compiled VMD with MPI (mpich) and am trying to use
> srun to start the parallel render script found on the mailing list.
>
> Single-node execution works, resulting in
>
> Info) Initializing parallel VMD instances via MPI...
> Info) Found 1 VMD MPI node containing a total of 48 CPUs and 0 GPUs:
> Info)    0:  48 CPUs, 374.3GB (99%) free mem, 0 GPUs, Name: node098
> Info) No CUDA accelerator devices available.
> ERROR) invalid command name "lmap"
>
> .. (running as intended)
>
> but using the following srun flags with 2 tasks (i.e. one per node)
>
> srun  -v -n2  --ntasks-per-node=1 -c48 vmd194a57patched7MPI -dispdev
> text -e render_movie_parallel_test.tcl
>
>
> I just get the startup message twice instead of the nodes being
> detected properly as shown below.
>
> Has anyone had this case? Did it require more arguments to be set
> for srun to spawn the tasks correctly?
>
>
>
> srun: defined options
> srun: -------------------- --------------------
> srun: cpus-per-task       : 48
> srun: ntasks              : 2
> srun: ntasks-per-node     : 1
> srun: verbose             : 1
> srun: -------------------- --------------------
> srun: end of defined options
> srun: Waiting for nodes to boot (delay looping 3600 times @ 0.100000
> secs x index)
> srun: Nodes node[097-098] are ready for job
> srun: jobid 443709: nodes(2):`node[097-098]', cpu counts: 48(x2)
> srun: Implicitly setting --exact, because -c/--cpus-per-task given.
> srun: launch/slurm: launch_p_step_launch: CpuBindType=(null type)
> srun: launching StepId=443709.0 on host node097, 1 tasks: 0
> srun: launching StepId=443709.0 on host node098, 1 tasks: 1
> srun: route/default: init: route default plugin loaded
> srun: launch/slurm: _task_start: Node node097, 1 tasks started
> srun: launch/slurm: _task_start: Node node098, 1 tasks started
> Info) VMD for LINUXAMD64, version 1.9.4a57 (March 10, 2023)
> Info) http://www.ks.uiuc.edu/Research/vmd/
> Info) Email questions and bug reports to vmd_at_ks.uiuc.edu
> Info) Please include this reference in published work using VMD:
> Info)    Humphrey, W., Dalke, A. and Schulten, K., `VMD - Visual
> Info)    Molecular Dynamics', J. Molec. Graphics 1996, 14.1, 33-38.
> Info) -------------------------------------------------------------
> Info) Initializing parallel VMD instances via MPI...
> Info) Found 1 VMD MPI node containing a total of 48 CPUs and 0 GPUs:
> Info)    0:  48 CPUs, 374.3GB (99%) free mem, 0 GPUs, Name: node097
> Info) No CUDA accelerator devices available.
> ERROR) invalid command name "lmap"
> Info) VMD for LINUXAMD64, version 1.9.4a57 (March 10, 2023)
> Info) http://www.ks.uiuc.edu/Research/vmd/
> Info) Email questions and bug reports to vmd_at_ks.uiuc.edu
> Info) Please include this reference in published work using VMD:
> Info)    Humphrey, W., Dalke, A. and Schulten, K., `VMD - Visual
> Info)    Molecular Dynamics', J. Molec. Graphics 1996, 14.1, 33-38.
> Info) -------------------------------------------------------------
> Info) Initializing parallel VMD instances via MPI...
> Info) Found 1 VMD MPI node containing a total of 48 CPUs and 0 GPUs:
> Info)    0:  48 CPUs, 374.3GB (99%) free mem, 0 GPUs, Name: node098
> Info) No CUDA accelerator devices available.
> ERROR) invalid command name "lmap"
> ERROR) invalid command name "lmap"
> after#0
>
>
>
> Kind regards
>
> René
>
> --
> Dipl.-Phys. René Hafner
> TU Kaiserslautern
> Germany

-- 
Research Affiliate, NIH Center for Macromolecular Modeling and Bioinformatics
Beckman Institute for Advanced Science and Technology
University of Illinois, 405 N. Mathews Ave, Urbana, IL 61801
http://www.ks.uiuc.edu/~johns/           
http://www.ks.uiuc.edu/Research/vmd/