From: John Stone (johns_at_ks.uiuc.edu)
Date: Mon Apr 04 2022 - 10:04:05 CDT

Hi,
  The MPI bindings for VMD are really intended for multi-node runs
rather than for dividing up the CPUs within a single node. The output
you're seeing shows that VMD is counting 48 CPUs (hyperthreading, no doubt)
for each MPI rank, even though they're all being launched on the same node.
The existing VMD startup code doesn't automatically determine when sharing
like this occurs, so it's just behaving the same way it would if you had
launched the job on 8 completely separate cluster nodes. You can set
environment variables to restrict the number of shared-memory threads
VMD and Tachyon use if you really want to run all of your ranks on the
same node; a sketch of what I mean follows below.
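
For example, in a job script it might look something like this (a sketch,
not tested here; VMDFORCECPUCOUNT is the CPU-count override I recall VMD
checking at startup, so double-check the name against your build, and
OMP_NUM_THREADS only affects the OpenMP code paths):

   # 8 ranks sharing one node with 8 allocated CPUs: give each rank a
   # single worker thread instead of letting every rank claim all 48
   # hardware threads it detects.
   export VMDFORCECPUCOUNT=1
   export OMP_NUM_THREADS=1
   mpirun -np 8 --bind-to core vmd -dispdev text -e yourscript.tcl

Here --bind-to core is Open MPI's binding option, which also keeps the
ranks from wandering across each other's cores.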

The warning you're getting from Open MPI about multiple initialization
is interesting. When you compiled VMD, you didn't compile both VMD
and the built-in Tachyon with MPI enabled, did you? If Tachyon is also
trying to call MPI_Init() or MPI_Init_thread(), that might explain
that particular error message. Have a look at that and make sure
that (for now at least) you're not compiling the built-in Tachyon
with MPI turned on, and let's see if we can rid you of the
Open MPI initialization errors and warnings.
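
Roughly, the build I have in mind looks like the following (the exact
Tachyon make target and configure options here are from memory, so adapt
them to your source tree; the key point is a threads-only Tachyon build
rather than an MPI one):

   # Build the Tachyon library with POSIX threads only (no MPI):
   cd tachyon/unix
   make linux-64-thr        # threads-only target, NOT linux-mpi

   # Then configure VMD with MPI plus the serial Tachyon library:
   cd your-vmd-source-tree
   ./configure LINUXAMD64 OPENGL TK MPI LIBTACHYON
   cd src && make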

Best,
  John Stone
  vmd_at_ks.uiuc.edu

On Mon, Apr 04, 2022 at 04:39:17PM +0200, Lenz Fiedler wrote:
> Dear VMD users and developers,
>
>
> I am facing a problem in running VMD using MPI.
>
> I compiled VMD from source (alongside Tachyon, which I would like to
> use for rendering). I first checked everything in serial, and there
> it worked. Now, after compiling in parallel mode, I am struggling to
> run VMD.
>
> For example, I allocate 8 CPUs on a cluster node that has 24 CPUs in
> total. Then I try to run:
>
> mpirun -np 8 vmd
>
> and I get this output:
>
> Info) VMD for LINUXAMD64, version 1.9.3 (April 4, 2022)
> Info) http://www.ks.uiuc.edu/Research/vmd/
> Info) Email questions and bug reports to vmd_at_ks.uiuc.edu
> Info) Please include this reference in published work using VMD:
> Info)    Humphrey, W., Dalke, A. and Schulten, K., `VMD - Visual
> Info)    Molecular Dynamics', J. Molec. Graphics 1996, 14.1, 33-38.
> Info) -------------------------------------------------------------
> Info) Initializing parallel VMD instances via MPI...
> Info) Found 8 VMD MPI nodes containing a total of 384 CPUs and 0 GPUs:
> Info)    0:  48 CPUs, 324.9GB (86%) free mem, 0 GPUs, Name: gv002.cluster
> Info)    1:  48 CPUs, 324.9GB (86%) free mem, 0 GPUs, Name: gv002.cluster
> Info)    2:  48 CPUs, 324.9GB (86%) free mem, 0 GPUs, Name: gv002.cluster
> Info)    3:  48 CPUs, 324.9GB (86%) free mem, 0 GPUs, Name: gv002.cluster
> Info)    4:  48 CPUs, 324.9GB (86%) free mem, 0 GPUs, Name: gv002.cluster
> Info)    5:  48 CPUs, 324.9GB (86%) free mem, 0 GPUs, Name: gv002.cluster
> Info)    6:  48 CPUs, 324.9GB (86%) free mem, 0 GPUs, Name: gv002.cluster
> Info)    7:  48 CPUs, 324.9GB (86%) free mem, 0 GPUs, Name: gv002.cluster
> --------------------------------------------------------------------------
> Open MPI has detected that this process has attempted to initialize
> MPI (via MPI_INIT or MPI_INIT_THREAD) more than once.  This is
> erroneous.
> --------------------------------------------------------------------------
> [gv002:139339] *** An error occurred in MPI_Init
> [gv002:139339] *** reported by process [530644993,2]
> [gv002:139339] *** on a NULL communicator
> [gv002:139339] *** Unknown error
> [gv002:139339] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> [gv002:139339] ***    and potentially your MPI job)
>
>
> From the output, it seems to me that each of the 8 MPI ranks assumes
> it is rank zero? At least the fact that each rank reports 48 CPUs
> (24 cores * 2 hyperthreads, I assume?) makes me believe that.
>
> Could anyone give me a hint on what I might be doing wrong? The
> OpenMPI installation I am using has been used for many other
> programs on this cluster, so I would assume it is working correctly.
>
>
> Kind regards,
>
> Lenz
>
> --
> Lenz Fiedler, M. Sc.
> PhD Candidate | Matter under Extreme Conditions
>
> Tel.: +49 3581 37523 55
> E-Mail: l.fiedler_at_hzdr.de
> https://www.casus.science
>
> CASUS - Center for Advanced Systems Understanding
> Helmholtz-Zentrum Dresden-Rossendorf e.V. (HZDR)
> Untermarkt 20
> 02826 Görlitz
>
> Board of Directors: Prof. Dr. Sebastian M. Schmidt, Dr. Diana Stiller
> Register of Associations: VR 1693, Amtsgericht Dresden
>
>

-- 
NIH Center for Macromolecular Modeling and Bioinformatics
Beckman Institute for Advanced Science and Technology
University of Illinois, 405 N. Mathews Ave, Urbana, IL 61801
http://www.ks.uiuc.edu/~johns/           Phone: 217-244-3349
http://www.ks.uiuc.edu/Research/vmd/