From: Giacomo Fiorin (giacomo.fiorin_at_gmail.com)
Date: Wed Apr 06 2022 - 06:56:04 CDT

Hello Lenz, it did show up:
https://www.ks.uiuc.edu/Research/vmd/mailing_list/vmd-l/33653.html
There is only a delay of several hours, as the mailing list program works
its way through the many subscribers.

Giacomo

On Wed, Apr 6, 2022 at 4:00 AM Lenz Fiedler <l.fiedler_at_hzdr.de> wrote:

> Hi John,
>
> Thanks for the clarification, that makes sense.
>
> I have tried multiple setups, but in all cases I use the full
> memory of a node and run only one rank (with 1 CPU) per node. So VMD
> gets 1, 2, or 4 CPUs, each with the full 360GB of memory per node.
>
> I have two representations in my file: one VDW for the ~130000 atoms,
> and a second one, an Isosurface, for their electronic density.
> This is the Tcl script I am using (I deleted some rotation commands in
> between), which I created by plotting a smaller file locally and piping
> the Tcl commands into a file:
>
> menu files off
> menu files on
> display resetview
> display resetview
> mol addrep 0
> display resetview
> mol new {Be131072_density.cube} type {cube} first 0 last -1 step 1 waitfor 1 volsets {0 }
> animate style Loop
> menu files off
> menu graphics off
> menu graphics on
> mol modstyle 0 0 VDW 1.000000 12.000000
> mol modstyle 0 0 VDW 0.900000 12.000000
> mol modstyle 0 0 VDW 0.800000 12.000000
> mol modstyle 0 0 VDW 0.700000 12.000000
> mol modstyle 0 0 VDW 0.600000 12.000000
> mol modstyle 0 0 VDW 0.500000 12.000000
> mol modstyle 0 0 VDW 0.400000 12.000000
> mol modstyle 0 0 VDW 0.300000 12.000000
> mol modstyle 0 0 VDW 0.200000 12.000000
> mol modstyle 0 0 VDW 0.100000 12.000000
> mol modmaterial 0 0 BrushedMetal
> mol modcolor 0 0 ColorID 0
> mol modcolor 0 0 ColorID 12
> mol color ColorID 12
> mol representation VDW 0.100000 12.000000
> mol selection all
> mol material BrushedMetal
> mol addrep 0
> mol modstyle 1 0 Isosurface 0.000000 0 2 2 1 1
> mol modstyle 1 0 Isosurface 0.000000 0 0 2 1 1
> mol modstyle 1 0 Isosurface 0.000000 0 0 0 1 1
> mol modmaterial 1 0 Transparent
> mol modcolor 1 0 ColorID 31
> mol modstyle 1 0 Isosurface 0.038714 0 0 0 1 1
> render TachyonInternal vmdscene.tga display %s
>
>
> If I comment out everything after the second "mol addrep 0" up until
> the rendering command, the file renders fine, showing only the atoms
> without the density.
> The file is a large beryllium cell in a slightly disordered hcp geometry.
> I would be very grateful for ideas on how to render this file!
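>
> (One knob I could also try myself, in case it is the isosurface that
> blows up the memory: the Step parameter of the Isosurface
> representation. A minimal sketch, assuming the usual argument order of
> isovalue, volume ID, show, draw, step, size, so that a larger step
> samples the density grid more coarsely and generates far fewer
> triangles:
>
> # same isovalue as above, but sample only every 4th grid point (step = 4)
> mol modstyle 1 0 Isosurface 0.038714 0 0 0 4 1
>
> The value 4 is only an example; 2 might already be enough.)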
>
> Kind regards,
> Lenz
>
> (I am resending this, because somehow it did not appear on the mailing
> list)
>
> --
> Lenz Fiedler, M. Sc.
> PhD Candidate | Matter under Extreme Conditions
>
> Tel.: +49 3581 37523 55
> E-Mail: l.fiedler_at_hzdr.de
> https://www.casus.science
>
> CASUS - Center for Advanced Systems Understanding
> Helmholtz-Zentrum Dresden-Rossendorf e.V. (HZDR)
> Untermarkt 20
> 02826 Görlitz
>
> Vorstand: Prof. Dr. Sebastian M. Schmidt, Dr. Diana Stiller
> Vereinsregister: VR 1693 beim Amtsgericht Dresden
>
> On 4/4/22 18:37, John Stone wrote:
> > Hi,
> > Right, using the non-MPI Tachyon within VMD is correct.
> > It will result in the individual MPI ranks doing their own Tachyon
> > renderings, which is the right thing for most typical VMD+MPI
> > workloads like movie renderings.
> >
> > If you're running out of node memory, there are a few ways
> > we might "tame" the memory use in VMD/Tachyon for your cube file
> > scenario. The 9GB cube file doesn't sound like it should result
> > in a scene that would create a huge memory footprint. Are you still
> > running multiple VMD MPI ranks on the same machine? If so, I would
> > begin by avoiding that, so that each MPI process gets the full node
> > memory.
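> >
> > For example, a launch along these lines (just a sketch; the Slurm and
> > Open MPI option names are the standard ones, and "render_density.tcl"
> > merely stands in for your existing script) gives every rank a whole
> > node to itself:
> >
> >   # one VMD MPI rank per node, so each rank can use the full node memory
> >   #SBATCH --nodes=4
> >   #SBATCH --ntasks-per-node=1
> >   srun vmd -dispdev text -e render_density.tcl
> >
> > or, launching directly with Open MPI:
> >
> >   mpirun --map-by ppr:1:node vmd -dispdev text -e render_density.tcl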
> >
> > Regarding the rendering of the cube file, what representations are you
> > using? Just isosurface, or do you have lots of other representations
> > as well? Is there any other molecular geometry?
> >
> > I might have suggestions for you to try to reduce that memory footprint,
> > assuming you've already switched to running only one MPI rank per node.
> >
> > Best,
> > John Stone
> >
> >
> > On Mon, Apr 04, 2022 at 05:45:52PM +0200, Lenz Fiedler wrote:
> >> Hi John,
> >>
> >>
> >> Thank you so much - the error was indeed from the Tachyon MPI
> >> version! It was just as you described: I had compiled the MPI
> >> version of both VMD and Tachyon. After using the serial version for
> >> the latter, I don't get the crash anymore! :)
> >>
> >> Does this mean that the rendering will be done serially, on rank 0
> >> only? I am trying to render an image based on a very large (9GB)
> >> .cube file (with an isosurface), and so far using 1, 2, or 4 nodes
> >> with 360GB of shared memory each has resulted in a segmentation
> >> fault. I assume it is memory related, because I can render smaller
> >> files just fine.
> >>
> >>
> >> Also, thanks for the info regarding the threading; I will keep that in mind!
> >>
> >>
> >> Kind regards,
> >>
> >> Lenz
> >>
> >>
> >> --
> >> Lenz Fiedler, M. Sc.
> >> PhD Candidate | Matter under Extreme Conditions
> >>
> >> Tel.: +49 3581 37523 55
> >> E-Mail: l.fiedler_at_hzdr.de
> >> https://www.casus.science
> >>
> >> CASUS - Center for Advanced Systems Understanding
> >> Helmholtz-Zentrum Dresden-Rossendorf e.V. (HZDR)
> >> Untermarkt 20
> >> 02826 Görlitz
> >>
> >> Vorstand: Prof. Dr. Sebastian M. Schmidt, Dr. Diana Stiller
> >> Vereinsregister: VR 1693 beim Amtsgericht Dresden
> >>
> >> On 4/4/22 17:04, John Stone wrote:
> >>> Hi,
> >>> The MPI bindings for VMD are really intended for multi-node runs
> >>> rather than for dividing up the CPUs within a single node. The output
> >>> you're seeing shows that VMD is counting 48 CPUs (hyperthreading, no
> >>> doubt) for each MPI rank, even though they're all being launched on
> >>> the same node. The existing VMD startup code doesn't automatically
> >>> determine when sharing like this occurs, so it's just behaving the
> >>> same way it would if you had launched the job on 8 completely
> >>> separate cluster nodes. You can set some environment variables to
> >>> restrict the number of shared memory threads VMD/Tachyon use if you
> >>> really want to run all of your ranks on the same node.
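> >>>
> >>> As a rough sketch (I am quoting the variable name from memory, so
> >>> please double-check it against the environment variables section of
> >>> the VMD User's Guide for your build), something like the following
> >>> before launching would keep 8 ranks on one 24-core node from each
> >>> starting a full set of worker threads:
> >>>
> >>>   # assumption: VMDFORCECPUCOUNT overrides the CPU count VMD detects
> >>>   export VMDFORCECPUCOUNT=3
> >>>   mpirun -np 8 vmd -dispdev text -e yourscript.tcl
> >>>
> >>> so that 8 ranks x 3 threads roughly matches the 24 physical cores
> >>> ("yourscript.tcl" is just a placeholder for your own script).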
> >>>
> >>> The warning you're getting from OpenMPI about multiple initialization
> >>> is interesting. When you compiled VMD, you didn't compile both VMD
> >>> and the built-in Tachyon with MPI enabled, did you? If Tachyon is also
> >>> trying to call MPI_Init() or MPI_Init_thread(), that might explain
> >>> that particular error message. Have a look at that and make sure
> >>> that (for now at least) you're not compiling the built-in Tachyon
> >>> with MPI turned on, and let's see if we can rid you of the
> >>> OpenMPI initialization errors+warnings.
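> >>>
> >>> For reference, a minimal sketch of the Tachyon part of that (the
> >>> threads-only Make-arch target name below is the one commonly used on
> >>> 64-bit Linux, but check the targets listed in your Tachyon tree):
> >>>
> >>>   # build the threaded, non-MPI Tachyon library for VMD to link against
> >>>   cd tachyon/unix
> >>>   make linux-64-thr
> >>>
> >>> i.e. for now only the VMD binary itself gets built against MPI, while
> >>> the bundled Tachyon library stays MPI-free.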
> >>>
> >>> Best,
> >>> John Stone
> >>> vmd_at_ks.uiuc.edu
> >>>
> >>> On Mon, Apr 04, 2022 at 04:39:17PM +0200, Lenz Fiedler wrote:
> >>>> Dear VMD users and developers,
> >>>>
> >>>>
> >>>> I am facing a problem in running VMD using MPI.
> >>>>
> >>>> I compiled VMD from source (alongside Tachyon, which I would like to
> >>>> use for rendering). I first checked everything in serial, where it
> >>>> worked. Now, after compiling the parallel version, I struggle to run VMD.
> >>>>
> >>>> For example, I am allocating 8 CPUs on a cluster node that has 24
> >>>> CPUs in total. Afterwards, I run:
> >>>>
> >>>> mpirun -np 8 vmd
> >>>>
> >>>> and I get this output:
> >>>>
> >>>> Info) VMD for LINUXAMD64, version 1.9.3 (April 4, 2022)
> >>>> Info) http://www.ks.uiuc.edu/Research/vmd/
> >>>> Info) Email questions and bug reports to vmd_at_ks.uiuc.edu
> >>>> Info) Please include this reference in published work using VMD:
> >>>> Info) Humphrey, W., Dalke, A. and Schulten, K., `VMD - Visual
> >>>> Info) Molecular Dynamics', J. Molec. Graphics 1996, 14.1, 33-38.
> >>>> Info) -------------------------------------------------------------
> >>>> Info) Initializing parallel VMD instances via MPI...
> >>>> Info) Found 8 VMD MPI nodes containing a total of 384 CPUs and 0 GPUs:
> >>>> Info) 0: 48 CPUs, 324.9GB (86%) free mem, 0 GPUs, Name: gv002.cluster
> >>>> Info) 1: 48 CPUs, 324.9GB (86%) free mem, 0 GPUs, Name: gv002.cluster
> >>>> Info) 2: 48 CPUs, 324.9GB (86%) free mem, 0 GPUs, Name: gv002.cluster
> >>>> Info) 3: 48 CPUs, 324.9GB (86%) free mem, 0 GPUs, Name: gv002.cluster
> >>>> Info) 4: 48 CPUs, 324.9GB (86%) free mem, 0 GPUs, Name: gv002.cluster
> >>>> Info) 5: 48 CPUs, 324.9GB (86%) free mem, 0 GPUs, Name: gv002.cluster
> >>>> Info) 6: 48 CPUs, 324.9GB (86%) free mem, 0 GPUs, Name: gv002.cluster
> >>>> Info) 7: 48 CPUs, 324.9GB (86%) free mem, 0 GPUs, Name: gv002.cluster
> >>>>
> >>>> --------------------------------------------------------------------------
> >>>> Open MPI has detected that this process has attempted to initialize
> >>>> MPI (via MPI_INIT or MPI_INIT_THREAD) more than once. This is
> >>>> erroneous.
> >>>> --------------------------------------------------------------------------
> >>>> [gv002:139339] *** An error occurred in MPI_Init
> >>>> [gv002:139339] *** reported by process [530644993,2]
> >>>> [gv002:139339] *** on a NULL communicator
> >>>> [gv002:139339] *** Unknown error
> >>>> [gv002:139339] *** MPI_ERRORS_ARE_FATAL (processes in this
> >>>> communicator will now abort,
> >>>> [gv002:139339] *** and potentially your MPI job)
> >>>>
> >>>>
> >>>> From the output, it seems to me that each of the 8 MPI ranks assumes
> >>>> it is rank zero? At least the fact that each rank reports 48 CPUs
> >>>> (24*2, I assume?) makes me believe that.
> >>>>
> >>>> Could anyone give me a hint on what I might be doing wrong? The
> >>>> OpenMPI installation I am using has been used for many other
> >>>> programs on this cluster, so I would assume it is working correctly.
> >>>>
> >>>>
> >>>> Kind regards,
> >>>>
> >>>> Lenz
> >>>>
> >>>> --
> >>>> Lenz Fiedler, M. Sc.
> >>>> PhD Candidate | Matter under Extreme Conditions
> >>>>
> >>>> Tel.: +49 3581 37523 55
> >>>> E-Mail: l.fiedler_at_hzdr.de
> >>>> https://www.casus.science
> >>>>
> >>>> CASUS - Center for Advanced Systems Understanding
> >>>> Helmholtz-Zentrum Dresden-Rossendorf e.V. (HZDR)
> >>>> Untermarkt 20
> >>>> 02826 Görlitz
> >>>>
> >>>> Vorstand: Prof. Dr. Sebastian M. Schmidt, Dr. Diana Stiller
> >>>> Vereinsregister: VR 1693 beim Amtsgericht Dresden
> >>>>
> >>>>
> >>>
> >
> >
>
>