RE: Running NAMD on Linux Cluster

From: Vermaas, Joshua (Joshua.Vermaas_at_nrel.gov)
Date: Wed Feb 27 2019 - 12:26:52 CST

Ok, if you are only going to be running on 1 node at a time, I'd recommend starting with the multicore-CUDA version. It's the simplest GPU-supporting version to set up and run:

/path/to/namd/namd2 +p8 configfile.namd > logfile.log
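
If you want to wrap that in a job script, something along these lines might be a starting point. It is only a sketch: the function names are made up for illustration, and the binary path, core count, and file names are placeholders to replace with your own.

```shell
#!/bin/sh
# Sketch of a single-node launcher for a multicore-CUDA build.
# namd_cmd and run_namd are illustrative helper names, not part of NAMD.

# Print the command line that would be run: binary, +p<cores>, config file.
namd_cmd() {
    namd_bin=$1
    cores=$2
    config=$3
    printf '%s +p%s %s\n' "$namd_bin" "$cores" "$config"
}

# Run NAMD only when the binary is actually present and executable,
# sending its standard output to the given log file.
run_namd() {
    namd_bin=$1
    cores=$2
    config=$3
    log=$4
    if [ ! -x "$namd_bin" ]; then
        echo "namd2 not found or not executable: $namd_bin" >&2
        return 1
    fi
    "$namd_bin" "+p$cores" "$config" > "$log"
}
```

Calling run_namd /path/to/namd/namd2 8 configfile.namd logfile.log is then equivalent to the one-liner above, with an early error if the path is wrong.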

The ibverbs smp builds are essential for running across multiple nodes, but from the low processor counts in your email, I don't think that is the situation you are setting up.
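
If you do end up going multi-node through charmrun ++mpiexec later, note that the "ibrun: Command not found" lines in your log mean the launcher your mympiexec wrapper calls is not on the PATH of the node where it ran. A quick check like the sketch below can show which launcher, if any, is available. The function name and the candidate launcher names are assumptions for illustration; your cluster's documentation is the authority on what to use.

```shell
#!/bin/sh
# Report the first launcher from a candidate list that is on PATH.
# find_launcher is an illustrative helper, not part of NAMD or charmrun.
find_launcher() {
    for name in "$@"; do
        if command -v "$name" >/dev/null 2>&1; then
            echo "$name"
            return 0
        fi
    done
    echo "none of: $* found on PATH" >&2
    return 1
}
```

For example, find_launcher ibrun mpiexec mpirun srun prints the first of those names that resolves, or fails with a message if none do.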

-Josh


On 2019-02-27 06:15:39-07:00 owner-namd-l_at_ks.uiuc.edu wrote:


---------- Forwarded message ----------
From: 김민재 <kjh950429_at_gmail.com>
Date: Wed, Feb 27, 2019 at 10:18 AM
Subject: Re: namd-l: Running NAMD on Linux Cluster
To: Aravinda Munasinghe <aravinda1879_at_gmail.com>

Hi
I'm new to namd, so I am using a precompiled namd binary that I downloaded from the namd website. To be more specific, I used the "Linux-x86_64-ibverbs-smp-CUDA" package (I'm trying to make use of NVIDIA GPU acceleration).
Thanks

On Wed, Feb 27, 2019 at 5:06 AM, Aravinda Munasinghe <aravinda1879_at_gmail.com> wrote:
Hi,
If you compiled charm from scratch as well, I recommend checking whether it was compiled properly (try the charm hello world example). If that does not work then, judging from the error I see, your charm most probably did not build properly. What flags did you use when compiling charm?
Best,
Aravinda Munasinghe
On Tue, Feb 26, 2019 at 10:24 AM 김민재 <kjh950429_at_gmail.com> wrote:
Hi
I have been facing several problems while trying to run an MD simulation on a supercomputer. I have read through the NAMD user guide and followed its instructions, so I have been writing shell scripts to run namd. Following the user guide, I wrote a script called 'mympiexec':
#!/bin/csh
shift; shift; exec ibrun $*
Then, I wrote a script called 'runme': (NAMD is the directory that holds namd2 and mympiexec)
cd NAMD
./charmrun +p8 ++mpiexec ++remote-shell ./mympiexec ./namd2 ./1ca2_eq.conf
However, I got the following error message (testy):
Warning: Permanently added '[c12]:22554,[192.168.0.112]:22554' (ECDSA) to the list of known hosts.
ibrun: Command not found.
ibrun: Command not found.
Charmrun> error attaching to node '127.0.0.1':
Timeout waiting for node-program to connect
(I attached the other message to this email)
I also tried another method suggested in an HTML guide to NAMD. I wrote 'runscript':
#!/bin/csh
setenv LD_LIBRARY_PATH "${1:h}:$LD_LIBRARY_PATH"
$*
Then I wrote and ran the following script, 'runCUDA':
#!/bin/csh
cd NAMD
./charmrun ++runscript ./runscript +p8 ./namd2 +idlepoll ++ppn 1 ./1ca2_eq
In this case I got the following error message (testx):
Warning: Permanently added '[c38]:22554,[192.168.0.138]:22554' (ECDSA) to the list of known hosts.
ssh_exchange_identification: read: Connection reset by peer
Charmrun> Error 255 returned from remote shell (localhost:0)
Charmrun> Reconnection attempt 1 of 3
ssh_exchange_identification: read: Connection reset by peer
Charmrun> Error 255 returned from remote shell (localhost:0)
Charmrun> Reconnection attempt 1 of 3
ssh_exchange_identification: read: Connection reset by peer
Charmrun> Error 255 returned from remote shell (localhost:0)
Charmrun> Reconnection attempt 2 of 3
ssh_exchange_identification: read: Connection reset by peer
Charmrun> Error 255 returned from remote shell (localhost:0)
Charmrun> Reconnection attempt 2 of 3
ssh_exchange_identification: read: Connection reset by peer
Charmrun> Error 255 returned from remote shell (localhost:0)
Charmrun> Reconnection attempt 3 of 3
ssh_exchange_identification: read: Connection reset by peer
Charmrun> Error 255 returned from remote shell (localhost:0)
Charmrun> Reconnection attempt 3 of 3
ssh_exchange_identification: read: Connection reset by peer
Charmrun> Error 255 returned from remote shell (localhost:0)
Charmrun> Too many reconnection attempts; bailing out
ssh_exchange_identification: read: Connection reset by peer
Charmrun> Error 255 returned from remote shell (localhost:0)
Charmrun> Too many reconnection attempts; bailing out
I am new to namd and would really appreciate any help. Thanks

--
Aravinda Munasinghe,
