Re: Ib version and nodelist

From: Norman Geist (norman.geist_at_uni-greifswald.de)
Date: Mon Aug 22 2011 - 00:50:56 CDT

Hi,
 
++remote-shell requires an argument, so every argument after it in your call
is shifted by one and misinterpreted. If you need to set the remote shell, try
 
[...] ++remote-shell ssh [...]
 
for example.
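
To illustrate the shift, here is a toy parser. It is only a sketch and not charmrun's actual code: the function name parse_args and its output format are made up for this example, and it simply assumes that ++ppn, ++p and ++remote-shell each take the next word as their value.

```shell
# Toy illustration only (hypothetical, NOT charmrun's real parser):
# every option listed below swallows the word that follows it.
parse_args() {
    local remote_shell="(unset)" program="(none)" program_args=""
    while [ "$#" -gt 0 ]; do
        case "$1" in
            ++remote-shell)
                remote_shell="${2-}"   # the next word becomes the remote shell
                shift
                ;;
            ++ppn|++p)
                shift                  # skip the numeric value
                ;;
            *)
                if [ "$program" = "(none)" ]; then
                    program="$1"       # first free word is taken as the program
                else
                    program_args="$program_args $1"
                fi
                ;;
        esac
        [ "$#" -gt 0 ] && shift
    done
    echo "remote_shell=$remote_shell program=$program args=${program_args# }"
}

# Your call: the namd2 path is swallowed as the remote shell.
parse_args ++ppn 12 ++p 12 ++remote-shell \
    /soft/NAMD_2.8_Linux-x86_64-ibverbs/namd2 job.inp

# With an explicit value, namd2 and job.inp land where they belong.
parse_args ++ppn 12 ++p 12 ++remote-shell ssh \
    /soft/NAMD_2.8_Linux-x86_64-ibverbs/namd2 job.inp
```

In the first call the namd2 path ends up as the remote shell and job.inp as the program. That would be consistent with your log, where namd2 starts in standalone mode and is handed a host name (cn054) as its config file, although I have not checked how charmrun actually builds the remote command.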
 
Best wishes
 
Norman Geist.
 
From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] On Behalf
Of Neelanjana Sengupta
Sent: Monday, 22 August 2011 07:36
To: NAMD
Subject: namd-l: Ib version and nodelist
 
Dear NAMD experts,

We are attempting to run the Infiniband version of NAMD 2.8
(Linux-x86_64-ibverbs,
<http://www.ks.uiuc.edu/Development/Download/download.cgi?UserID=&AccessCode=&ArchiveID=1159>)
on an Infiniband cluster in which each node contains 12 processors. The
compute nodes are named sequentially, cn001 through cn056 (as listed by the
command pbsnodes -a).
Our .nodelist file looks like this:

group main
host cn001
host cn002
..
..
host cn054
host cn055
host cn056

We tried running jobs with this command in the submit script:
/soft/NAMD_2.8_Linux-x86_64-ibverbs/charmrun ++ppn 12 ++p 12 ++remote-shell /soft/NAMD_2.8_Linux-x86_64-ibverbs/namd2 job.inp

However, we get errors like this:

Info: in all publications reporting results obtained with NAMD.
Info:
Info: Based on Charm++/Converse 60303 for net-linux-x86_64-ibverbs-iccstatic
Info: Built Sat May 28 11:31:19 CDT 2011 by jim on dakar.ks.uiuc.edu
Charm++: standalone mode (not using charmrun)
Warning> RandomizCharm++: standalone mode (not using charmrun)
Warning> Randomization of stack pointer is turned on in kernel, thread
migration may not work! Run 'echo 0 > /proc/sys/kernel/Info: 50.0469 MB of
memory in use based on /proc/self/stat
Info: Configuration file is cn054
FATAL ERROR: Unable to access config file cn054
[0] Stack Traceback:
  [0:0] CmiAbort+0x5c [0xbf56fa]
  [0:1] _Z8NAMD_diePKc+0x62 [0x535482]
  [0:2] _Z18after_backend_initiPPc+0x3d0 [0x539b90]
  [0:3] main+0x3a [0x53978a]
  [0:4] __libc_start_main+0xf4 [0x393301d994]
  [0:5] _ZNSt8ios_base4InitD1Ev+0x52 [0x534d7a]
[0] Stack Traceback:
  [0:0] /soft/nsengupta/NAMD_2.8_Linux-x86_64-ibverbs/namd2 [0xbf55b6]
  [0:1] CmiAbort+0x8e [0xbf572c]
  [0:2] _Z8NAMD_diePKc+0x62 [0x535482]
  [0:3] _Z18after_backend_initiPPc+0x3d0 [0x539b90]
  [0:4] main+0x3a [0x53978a]
  [0:5] __libc_start_main+0xf4 Info: Running on 1 processors, 1 nodes, 1
physical nodes.

etc.

charmrun is apparently misinterpreting the nodefile. Could we please get some
ideas on how to solve this problem?

Thanks and regards,
Neelanjana Sengupta
