Re: timeout on running namd

From: Gengbin Zheng (gzheng_at_ks.uiuc.edu)
Date: Tue Mar 30 2004 - 13:23:15 CST

With only the partial debug output, we cannot determine the exact
cause.
The most likely explanation is that the charmrun machine (your frontend
machine) failed to send its correct IP address to the compute nodes
(node01 and node05).
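
One way to test that guess is to check, on the frontend, which address its
own hostname resolves to. The sketch below is only a diagnostic aid and says
nothing definite about how charmrun picks the address it advertises; it
assumes (the thread does not confirm this) that a hostname resolving to a
loopback address such as 127.0.0.1 is the kind of "incorrect IP" that would
leave the node programs unable to connect back. The function names
resolve_ipv4 and is_reachable_candidate are made up for this example.

```python
import ipaddress
import socket


def resolve_ipv4(hostname):
    """Return the sorted, de-duplicated IPv4 addresses the local resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def is_reachable_candidate(address):
    """Return False for loopback addresses, which another machine cannot use to reach this host."""
    return not ipaddress.ip_address(address).is_loopback


if __name__ == "__main__":
    # Assumption: run this on the frontend machine that launches charmrun.
    name = socket.gethostname()
    try:
        addresses = resolve_ipv4(name)
    except socket.gaierror as exc:
        print(f"{name}: does not resolve ({exc})")
    else:
        for addr in addresses:
            if is_reachable_candidate(addr):
                verdict = "looks routable"
            else:
                verdict = "loopback, compute nodes cannot connect back to this"
            print(f"{name} -> {addr}: {verdict}")
```

If the frontend's name turns out to resolve only to a loopback address, the
hosts file or resolver configuration on the frontend would be the first
place to look.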

Gengbin

On Tue, 30 Mar 2004, Amarda Shehu wrote:

>
> Dear all,
>
> I have recently been having problems starting namd on 2 nodes (4
> processors). This has never happened before. Here is what charmrun
> reports in debug mode. The nodes 01 and 05 that I am using are up - I am
> using a cluster at my university - and there is nothing wrong with them.
>
> Charmrun> charmrun started...
> Charmrun> using /home/shehua/.nodelist as nodesfile
> Charmrun> rsh (node01:0d) started
> Charmrun> rsh (node05:1d) started
> Charmrun> rsh (node01:2d) started
> Charmrun> rsh (node05:3d) started
> Charmrun> node programs all started
> Charmrun> error 0 attaching to node:
> Timeout waiting for node-program to connect
>
>
> I would appreciate it if someone could tell me why this is happening. It
> just started yesterday evening - I was able to run namd all day before
> this started to happen. Nothing changed except my input files to the
> configuration file. The cluster is up and running.
>
> -Amarda
>
