Re: [SOLVED] Charmrun> error x attaching to node

From: Norman Geist (
Date: Tue Apr 30 2013 - 07:09:22 CDT

What I forgot to mention:


This solution should solve the problem with


Charmrun> error 0 attaching to node


followed by


Timeout waiting for node-program to connect


not followed by


Socket closed before recv.


That last message usually indicates an ibverbs problem with charmrun, which
can be solved by using an MPI implementation such as OpenMPI.


Norman Geist.


From: [] On Behalf Of
Norman Geist
Sent: Tuesday, 30 April 2013 12:09
To: Namd Mailing List
Subject: namd-l: [SOLVED] Charmrun> error x attaching to node


Hello NAMD users,


Here is a hint for everyone who hits the following problem while running
NAMD in parallel across multiple nodes:


Charmrun> error 0 attaching to node


The error number may be the same or a different one. Because no solution can
be found out there so far, and the problem drives one nuts, I decided to
tell you what the most likely problem with your network configuration is.
Very likely, the local DNS configuration in "/etc/hosts" on the compute
nodes contains an entry that resolves the hostname of the compute node to a
loopback interface. This often looks like: hostname

or hostname
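
As a rough illustration of what to look for, the following Python sketch
scans hosts-file text for lines that map a given hostname to a loopback
address. The function name loopback_entries, the hostname node01 and all
addresses in the sample are made up for this example; they are not taken
from a real cluster.

```python
import ipaddress


def loopback_entries(hosts_text, hostname):
    """Return the lines of hosts_text that map hostname to a loopback address."""
    hits = []
    for raw in hosts_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and outer whitespace
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], fields[1:]
        try:
            ip = ipaddress.ip_address(address)
        except ValueError:
            continue  # first field is not a valid IP address, skip the line
        if ip.is_loopback and hostname in names:
            hits.append(line)
    return hits


# Hypothetical sample; on a real node you would read /etc/hosts instead.
sample = """\
127.0.0.1   localhost
127.0.1.1   node01
10.0.0.11   node01-ib
"""
print(loopback_entries(sample, "node01"))
```

Any line this reports for the node's own hostname is a candidate for the
problem described here.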


You can check this by pinging the hostname while you are logged in on a
compute node: "ping hostname". If this returns a 127.x.x.x address, your
local DNS configuration is not suitable for charmrun. For charmrun it is
important that the hostname resolves to an outgoing IP address, ideally one
on the network you want to use for the computation's communication.
Otherwise, the node will not be able to connect to the other nodes, as it is
caught within the internal loopback network. This is also important when
using ibverbs, as charmrun needs to resolve the IPoIB IP address to the real
InfiniBand HCA.
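
The same check can be scripted. The sketch below is only an approximation of
the ping test, not charmrun's own logic: it resolves the machine's hostname
through the system resolver and reports whether the result is a loopback
address. The function names are hypothetical and chosen for this example.

```python
import ipaddress
import socket


def is_loopback_address(address):
    """True if address (a string such as '127.0.1.1') is a loopback address."""
    return ipaddress.ip_address(address).is_loopback


def hostname_resolves_to_loopback(hostname=None):
    """Resolve hostname (default: this machine's own name) and test the result."""
    if hostname is None:
        hostname = socket.gethostname()
    address = socket.gethostbyname(hostname)  # IPv4 lookup via the system resolver
    return is_loopback_address(address)


if __name__ == "__main__":
    name = socket.gethostname()
    try:
        if hostname_resolves_to_loopback(name):
            print(f"{name} resolves to a loopback address")
        else:
            print(f"{name} resolves to a non-loopback address")
    except OSError as exc:  # resolution failures are reported as OSError subclasses
        print(f"could not resolve {name}: {exc}")
```

Run it on each compute node; a node that reports a loopback address is
likely affected by the problem described above.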

I hope this saves you spending a lot of time googling around without finding
a solution.


Good luck


Norman Geist


PS: Another possible cause is that NAMD is not installed on a shared drive
and has a different path on the compute nodes. Running charmrun with
++verbose should point that out.
