Waiting for 0-th client to connect.

From: Leandro Martínez (leandromartinez98_at_gmail.com)
Date: Tue Oct 24 2006 - 12:50:07 CDT

Hi all,
I'm trying to run namd2 on a recently configured Linux cluster of 8 nodes.
I can run the program on any node independently, and all nodes share the
same home directory. However, if I try to run the simulation remotely, it
does not start. Running with ++verbose prints the following output, and
the program gets stuck at that point.

The output below is from a test in which the nodelist contains a single
node that is not the one where the simulation is started. If the node in
the nodelist is the one where the simulation is started, the simulation
runs fine; the problem occurs only in the remote run.

Charmrun> charmrun started...
Charmrun> using ./nodelist2 as nodesfile
Charmrun> adding client 0: "192.168.0.101", IP:192.168.0.101
Charmrun> adding client 1: "192.168.0.101", IP:192.168.0.101
Charmrun> Charmrun = alehpo.iqm.unicamp.br, port = 42645
Charmrun> Sending "0 alehpo.iqm.unicamp.br 42645 17029 0" to client 0.
Charmrun> find the node program "/home/lmartinez/./NAMD_2.6b2_Linux-amd64/namd2" at "/home/lmartinez" for 0.
Charmrun> Starting rsh 192.168.0.101 -l lmartinez /bin/sh -f
Charmrun> rsh (192.168.0.101:0) started
Charmrun> Sending "1 alehpo.iqm.unicamp.br 42645 17029 0" to client 1.
Charmrun> find the node program "/home/lmartinez/./NAMD_2.6b2_Linux-amd64/namd2" at "/home/lmartinez" for 1.
Charmrun> Starting rsh 192.168.0.101 -l lmartinez /bin/sh -f
Charmrun> rsh (192.168.0.101:1) started
Charmrun> node programs all started
Charmrun> waiting for rsh (192.168.0.101:0), pid 17030
Charmrun rsh(192.168.0.101.0)> remote responding...
Charmrun rsh(192.168.0.101.1)> remote responding...
Charmrun rsh(192.168.0.101.0)> starting node-program...
Charmrun rsh(192.168.0.101.0)> rsh phase successful.
Charmrun rsh(192.168.0.101.1)> starting node-program...
Charmrun rsh(192.168.0.101.1)> rsh phase successful.
Charmrun> waiting for rsh (192.168.0.101:1), pid 17031
Charmrun> Waiting for 0-th client to connect.
Timeout waiting for node-program to connect
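
A possible way to narrow this down, assuming the timeout means that the node program on 192.168.0.101 cannot reach back to the charmrun host: the log shows charmrun advertising "alehpo.iqm.unicamp.br" and port 42645 to each client, so the remote node has to resolve that name and open a TCP connection to it. The Python sketch below is not part of NAMD or Charm++; the helper names resolves and can_connect are hypothetical. It only checks name resolution and TCP reachability when run on the remote node. Note that the port in the log belongs to that particular run, so the connection test is only meaningful while something is listening on the port being tested.

```python
import socket


def resolves(host):
    """Return the IPv4 address that `host` resolves to, or None on failure."""
    try:
        return socket.gethostbyname(host)
    except OSError:
        # socket.gaierror (name lookup failure) is a subclass of OSError.
        return None


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, timeouts and lookup errors.
        return False


# Example use on the remote node (host and port copied from the log above;
# the port is an assumption, since it may differ from run to run):
#   print(resolves("alehpo.iqm.unicamp.br"))
#   print(can_connect("alehpo.iqm.unicamp.br", 42645))
```

If the name does not resolve on the remote node, or resolves to an address that is not reachable from the cluster's internal network, that would be consistent with the timeout above, though other causes such as a firewall on the charmrun host cannot be ruled out from the log alone.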
