Is clock skew a problem for charm++?

From: Jan Saam (saam_at_charite.de)
Date: Tue Jun 20 2006 - 17:37:19 CDT

Hi all,

I'm experiencing some weird performance problems with NAMD or the
charm++ library on a Linux cluster:
When I'm using NAMD or a simple charm++ demo program on one node
everything is fine, but when I use more than one node each step takes
_very_ much longer!

Example:
2s for the program queens on 1 node, 445s on 2 nodes!!!

running
/home/jan/NAMD_2.6b1_Source/charm-5.9/mpi-linux-gcc/examples/charm++/queens/./pgm
on 1 LINUX ch_p4 processors
Created
/home/jan/NAMD_2.6b1_Source/charm-5.9/mpi-linux-gcc/examples/charm++/queens/PI28357
There are 14200 Solutions to 12 queens. Finish time=1.947209
End of program
[jan@BPU1 queens]$ mpirun -v -np 2 -machinefile ~/machines ./pgm 12 6
running
/home/jan/NAMD_2.6b1_Source/charm-5.9/mpi-linux-gcc/examples/charm++/queens/./pgm
on 2 LINUX ch_p4 processors
Created
/home/jan/NAMD_2.6b1_Source/charm-5.9/mpi-linux-gcc/examples/charm++/queens/PI28413
There are 14200 Solutions to 12 queens. Finish time=445.547998
End of program

The same happens when I build the net-linux version instead of
mpi-linux, so the problem is probably independent of MPI.

One thing I noticed is that there is a clock skew of several minutes between
the nodes. Could that be part of my problem (unfortunately I don't have
the rights to simply synchronize the clocks)?
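
The size of such a skew can be measured without administrative rights. A minimal sketch, assuming an NTP-style exchange of four timestamps between two nodes and roughly symmetric network delay; the function name and the sample timestamps are hypothetical and are not taken from the cluster described here:

```python
def estimate_clock_offset(t0, t1, t2, t3):
    """Estimate remote-minus-local clock offset and round-trip delay (seconds).

    t0: local clock when the request is sent
    t1: remote clock when the request arrives
    t2: remote clock when the reply is sent
    t3: local clock when the reply arrives

    Assumes the network delay is about the same in both directions.
    """
    offset = ((t1 - t0) + (t2 - t3)) / 2.0
    delay = (t3 - t0) - (t2 - t1)
    return offset, delay


if __name__ == "__main__":
    # Made-up timestamps for illustration only, not measured on any cluster.
    offset, delay = estimate_clock_offset(10.0, 135.0, 136.0, 13.0)
    print(f"offset = {offset:.1f} s, round trip = {delay:.1f} s")
```

This only quantifies the skew; it does not show whether the skew is related to the slowdown.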

Does anyone have an idea what the problem could be?

Many thanks,
Jan

-- 
---------------------------
Jan Saam
Institute of Biochemistry
Charite Berlin
Monbijoustr. 2
10117 Berlin
Germany
+49 30 450-528-446
saam_at_charite.de
