From: Rene Salmon (rsalmon_at_tulane.edu)
Date: Fri Dec 31 2004 - 10:08:21 CST
Hi,
Thanks for the reply. So this morning I downloaded the latest charm from
the website and compiled it with
> ./build charm++ net-linux-amd64 clustermatic
Everything compiled fine with no errors, and I got a "build successfully"
message at the end.
I am very new to charm and clustermatic, so I am not sure I am testing it
correctly.
> Make sure charm/tests/charm++/megatest works before you proceed to NAMD.
All the test programs compile without problems. I am just not sure how to
run them to test charm, so I tried a small one first, in
examples/charm++/queens.
I am running this on a small test cluster with one master node and one slave
node. Here is the bpstat output on the cluster:
>bpstat
Node(s) Status Mode User
1 down ---------- root
0 up ---x--x--x root
As you can see, only one slave node is up. Here is what happens when I
try to run the test:
queens# ./charmrun ++skipmaster ++verbose ++startpe 0 +p2 ./pgm 12 6
Charmrun> charmrun started...
Charmrun> node -1 status: up
Charmrun> node 0 status: up
Charmrun> adding client 0: "0", IP:10.0.0.3
Charmrun> node 1 status: down
Charmrun> node 2 status: down
Charmrun> There are 1 slave nodes available.
Charmrun> adding client 1: "0", IP:10.0.0.3
Charmrun> Charmrun = 10.0.0.2, port = 33089
Charmrun> start node program on slave node: 0.
Charmrun> start node program on slave node: 0.
Charmrun> node programs all started
Charmrun> Waiting for 0-th client to connect.
Charmrun> error 0 attaching to node:
Timeout waiting for node-program to connect
I also tried with "++singlemaster" and got this:
queens# ./charmrun ++singlemaster ++verbose ++startpe 0 +p2 ./pgm 12 6
Charmrun> charmrun started...
Charmrun> node -1 status: up
Charmrun> adding client 0: "-1", IP:10.0.0.2
Charmrun> node 0 status: up
Charmrun> adding client 1: "0", IP:10.0.0.3
Charmrun> There are 1 slave nodes available.
Charmrun> Charmrun = 10.0.0.2, port = 33092
Charmrun> start node program on slave node: -1.
Charmrun> start node program on slave node: 0.
Charmrun> node programs all started
Charmrun> Waiting for 0-th client to connect.
Charmrun> client 0 connected (IP=10.0.0.2 data_port=32817)
Charmrun> Waiting for 1-th client to connect.
Charmrun> error 1 attaching to node:
Timeout waiting for node-program to connect
Any clues as to what I am doing wrong?
Thank you in advance for any help.
Rene