Re: Is clock skew a problem for charm++

From: Jan Saam (saam_at_charite.de)
Date: Tue Jun 20 2006 - 17:50:46 CDT

I forgot to mention that I have already checked that the problem is not
ssh taking forever to establish a connection.
This simple test at least rules that out:
time ssh BPU5 pwd
/home/jan

real 0m0.236s
user 0m0.050s
sys 0m0.000s
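
A single run might just be a lucky one, so here is a minimal Python sketch that repeats such a measurement and returns the wall-clock time of every run. The function name time_command and the repeat count are only illustrative choices of mine; the command itself would be the same "ssh BPU5 pwd" as in the test above, which assumes passwordless ssh to that node.

```python
import subprocess
import time


def time_command(argv, repeats=5):
    """Run argv `repeats` times; return each run's wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        # Output is discarded; only the elapsed time matters here.
        subprocess.run(argv, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        durations.append(time.perf_counter() - start)
    return durations


# Hypothetical usage on the cluster (host name taken from the test above):
#   times = time_command(["ssh", "BPU5", "pwd"])
#   print("min %.3fs  max %.3fs" % (min(times), max(times)))
```

If the maximum stays close to the minimum, connection setup is unlikely to explain a slowdown of several hundred seconds.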

Jan

Jan Saam wrote:
> Hi all,
>
> I'm experiencing some weird performance problems with NAMD or the
> charm++ library on a Linux cluster:
> When I run NAMD or a simple charm++ demo program on one node
> everything is fine, but when I use more than one node each step takes
> _very_ much longer!
>
> Example:
> 2s for the program queens on 1 node, 445s on 2 nodes!!!
>
> running
> /home/jan/NAMD_2.6b1_Source/charm-5.9/mpi-linux-gcc/examples/charm++/queens/./pgm
> on 1 LINUX ch_p4 processors
> Created
> /home/jan/NAMD_2.6b1_Source/charm-5.9/mpi-linux-gcc/examples/charm++/queens/PI28357
> There are 14200 Solutions to 12 queens. Finish time=1.947209
> End of program
> [jan@BPU1 queens]$ mpirun -v -np 2 -machinefile ~/machines ./pgm 12 6
> running
> /home/jan/NAMD_2.6b1_Source/charm-5.9/mpi-linux-gcc/examples/charm++/queens/./pgm
> on 2 LINUX ch_p4 processors
> Created
> /home/jan/NAMD_2.6b1_Source/charm-5.9/mpi-linux-gcc/examples/charm++/queens/PI28413
> There are 14200 Solutions to 12 queens. Finish time=445.547998
> End of program
>
> The same happens when I build the net-linux version instead of
> mpi-linux, so the problem is probably independent of MPI.
>
> One thing I noticed is that there is a clock skew of several minutes
> between the nodes. Could that be part of my problem (unfortunately I
> don't have the rights to simply synchronize the clocks)?
>
> Does anyone have an idea what the problem could be?
>
> Many thanks,
> Jan
>
>
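
For reference, the clock skew mentioned above can be put into numbers without any special rights. The following is only a rough sketch: the function names are made up for illustration, it assumes passwordless ssh and GNU date on the remote node, and it assumes the remote timestamp is taken about halfway through the ssh round trip, so the result is only accurate to within that round-trip time.

```python
import subprocess
import time


def estimate_offset(local_before, remote_time, local_after):
    """Remote clock minus local clock, in seconds.

    Assumes the remote timestamp was taken halfway through the round trip.
    """
    return remote_time - (local_before + local_after) / 2.0


def measure_offset(host):
    """Read the clock of `host` over ssh and estimate its offset in seconds."""
    local_before = time.time()
    # Assumption: GNU date on the remote side (%s = epoch seconds, %N = nanoseconds).
    out = subprocess.run(["ssh", host, "date", "+%s.%N"],
                         capture_output=True, text=True, check=True).stdout
    local_after = time.time()
    return estimate_offset(local_before, float(out.strip()), local_after)


# Hypothetical usage: print("%.1f s" % measure_offset("BPU5"))
```

A skew of several minutes would show up as a value in the hundreds of seconds, far larger than the fraction of a second that ssh itself takes.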

-- 
---------------------------
Jan Saam
Institute of Biochemistry
Charite Berlin
Monbijoustr. 2
10117 Berlin
Germany
+49 30 450-528-446
saam_at_charite.de
