Re: pre compiled charmm-6.8.2 for namd2.13 nightly version compilation for multiple GPU node simulations

From: Aravinda Munasinghe (aravinda1879_at_gmail.com)
Date: Tue Jan 29 2019 - 22:32:06 CST

Hi Josh,
Thank you very much for your reply. There was no specific reason for using
the Intel compilers. As you suggested, I tried building without icc (and
also with iccstatic), but charmrun still fails. The compilation does
complete with

charm++ built successfully.
Next, try out a sample program like
verbs-linux-x86_64-smp/tests/charm++/simplearrayhello
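
(For completeness, the hello test was built and run the usual way, roughly:

  cd verbs-linux-x86_64-smp/tests/charm++/simplearrayhello
  make
  ./charmrun ./hello +p2

with the process count being arbitrary.)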

But when I try to run the hello executable with charmrun, I get the
following error:

Charmrun> remote shell (localhost:0) started
Charmrun> node programs all started
Charmrun remote shell(localhost.0)> remote responding...
Charmrun remote shell(localhost.0)> starting node-program...
Charmrun remote shell(localhost.0)> remote shell phase successful.
Charmrun> Waiting for 0-th client to connect.
Charmrun> error attaching to node 'localhost':
Timeout waiting for node-program to connect
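
(My understanding is that the verbs charmrun launches the node program over
a remote shell to each host in its nodelist - localhost by default, as in
the output above - and then waits for it to connect back, and it is that
connection which times out here. A purely local sanity check can skip the
remote-shell step with the ++local flag, e.g.

  ./charmrun ./hello +p2 ++local

though I am only guessing whether that changes anything.)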

This is the same error I have been getting all along whenever I compile it
myself. The one thing I cannot figure out is why the precompiled version
works perfectly while my own build from scratch never does.
Any thoughts on this?
Best,
AM

On Tue, Jan 29, 2019 at 12:42 PM Vermaas, Joshua <Joshua.Vermaas_at_nrel.gov>
wrote:

> Hi Aravinda,
>
> Any particular reason you want to use the intel compilers? Since your goal
> is to use CUDA anyway, and the integration between the CUDA toolkit and the
> intel compilers tends to be hit or miss depending on the machine, I'd try
> the GNU compilers first (just drop the icc from the build line). If you can
> get that working, then you can spend a bit more time debugging exactly what
> your error messages mean. It could just be as simple as using iccstatic
> instead of icc, so that the libraries are bundled into the executable at
> compile time, which would solve your LD_LIBRARY_PATH issues.
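>
> Concretely, assuming the same build line as before, the two variants would
> be something like:
>
> ./build charm++ verbs-linux-x86_64 smp --with-production
> ./build charm++ verbs-linux-x86_64 iccstatic smp --with-production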
>
> -Josh
>
>
>
> On 2019-01-29 09:42:41-07:00 owner-namd-l_at_ks.uiuc.edu wrote:
>
> Dear NAMD users and developers,
> I have recently attempted to compile the NAMD 2.13 nightly build to run
> multi-node GPU replica-exchange simulations using the REST2 methodology.
> First, I was able to run the released NAMD 2.13
> Linux-x86_64-verbs-smp-CUDA (multi-copy algorithms on InfiniBand) binaries
> with charmrun on our university cluster in a multi-node/GPU setup (under
> Slurm).
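> (For context, a multi-copy launch with those binaries looks roughly like
> the following, where the node list, process count, replica count, and
> config name are placeholders:
>
> charmrun +p16 ++nodelist nodefile namd2 +replicas 8 job0.conf +stdout output/%d/job0.%d.log
> )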
> Then I tried compiling the NAMD 2.13 nightly version to use REST2 (since
> the released version has a bug with selecting solute atom IDs, as described
> here -
> https://www.ks.uiuc.edu/Research/namd/mailing_list/namd-l.2018-2019/1424.html
> ), following the instructions on the NVIDIA site as well as those in the
> release notes. But I failed miserably, as several others have (as I can see
> from the mailing list thread). Since the precompiled binaries of the
> released version work perfectly, I cannot think of a reason why my attempts
> failed other than some issue with the library files and compilers I load
> when building Charm++ for the multi-node GPU setup. I used the following
> command to build Charm++:
> ./build charm++ verbs-linux-x86_64 icc smp --with-production
> I used ifort and the Intel/2018 compiler modules.
> One thing I noticed is that with the precompiled NAMD 2.13 I did not have
> to set LD_LIBRARY_PATH, but I had to do so with my own build (otherwise I
> kept getting missing-library errors).
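> (The missing libraries from an icc build are presumably the Intel runtime
> .so files, so setting the path amounts to an export along these lines,
> where the exact directory is cluster-specific and only illustrative:
>
> export LD_LIBRARY_PATH=/opt/intel/2018/lib/intel64:$LD_LIBRARY_PATH
> )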
> It would be a great help if anyone who has successfully compiled
> multi-node GPU NAMD 2.13 could share their charm-6.8.2 build files along
> with information on the compilers they used, so I could compile NAMD
> myself. Any other advice on how to solve this, or precompiled binaries of
> the NAMD 2.13 nightly version itself, would be highly appreciated.
> Thank you,
> Best,
> --
> Aravinda Munasinghe,
>
>

-- 
Aravinda Munasinghe,
