RE: output files are not stored when running namd-2.10 ibverbs

From: Thanassis Silis (djnass_18_at_hotmail.com)
Date: Fri May 08 2015 - 05:07:50 CDT

OK, for anyone interested: the problem was not that the files were not being saved.
They are saved, but on the first host listed in the "nodelist" file.
For me this file was

group main
host 10.200.200.21
host 10.200.200.22
host 10.200.200.23
host 10.200.200.24
host 10.200.200.25
host 10.200.200.26

Even though the controlling node (i.e. where I started the simulation) is 10.200.200.23, the files were saved on host 10.200.200.21.

So just keep that in mind.

HOST ORDER in "nodelist" file MATTERS!
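
For anyone who wants the output to land on a particular machine, here is a minimal sketch (not something from the NAMD distribution) that rewrites a nodelist so a chosen host comes first. The function name reorder_nodelist and the output file name nodelist.local are made up for illustration, and it assumes a nodelist like mine: a single "group" line at the top followed by "host" lines.

```shell
# Hypothetical helper: print a nodelist with the given host moved to the top.
# Assumes one "group" line at the top of the file, followed by "host" lines.
# Usage: reorder_nodelist <host> <nodelist-file>
reorder_nodelist() {
    first="$1"
    file="$2"
    awk -v h="$first" '
        # Keep the "group" line where it is (first line of the output).
        $1 == "group"           { print; next }
        # Remember the chosen host so it can be printed before the others.
        $1 == "host" && $2 == h { chosen = $0; next }
        # Collect every other line in its original order.
                                { rest = rest $0 "\n" }
        END {
            if (chosen != "") print chosen
            printf "%s", rest
        }
    ' "$file"
}
```

With my nodelist, running "reorder_nodelist 10.200.200.23 ./nodelist > ./nodelist.local" and then passing "++nodelist ./nodelist.local" to charmrun should put 10.200.200.23 first. Going by the behaviour described above, I would expect the output files to end up on that host, but I have not checked this beyond my own setup.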

From: djnass_18_at_hotmail.com
To: namd-l_at_ks.uiuc.edu
Subject: namd-l: output files are not stored when running namd-2.10 ibverbs
Date: Thu, 7 May 2015 15:29:26 +0000

I am attempting to run a simulation on a Dell Blade system across up to 6 identical systems each with 64 cores.
The simulations seem to run fine - and benefit as more blade servers are used for the simulations.
The problem is that I do not see any output files being generated during the simulation on the controlling node, i.e. the node from which I started the simulation and on which the log file is saved.

I use the following command to run each simulation

/usr/local/namd/charmrun /usr/local/namd/namd2 ++nodelist ./nodelist +setcpuaffinity +p128 ./test.conf > test.log &

I do not have permission to the path /usr/local/namd, where the ibverbs version of the NAMD executables resides, but the output files should normally be saved in the same folder where the test.conf file resides (and where I run the simulation from).

I have used the multicore version of namd to run a few simulations locally with command

./../namd-multicore/namd2 +setcpuaffinity +p32 test.conf > test.log &

and this saves/generates the output files fine.

What could be the problem in the case of the ibverbs executable? Is it simply not possible, with the threads spread across machines, to write data out before the end of the simulation?

Also, I cannot connect to the running simulation through IMD. But, I presume that while there are no files saved, there is nothing to audit, right?
So no IMD listening thread is spawned. I have verified that by running netstat -paet; it does not show anything listening on the high port (12345) I have set with the IMDport directive in the config file, even though the log states "IMD INTERACTIVE LISTENING PORT 12345".

Thank you in advance for your help!