Re: Re: Is clock skew a problem for charm++

From: Gengbin Zheng (gzheng_at_ks.uiuc.edu)
Date: Wed Jun 21 2006 - 02:08:08 CDT

Hi Jan,

 Clock skew can make timing output misleading, but I doubt that is the
case here (the queens program), because both time readings were taken on
the same processor (0).
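
To illustrate why a fixed clock offset does not affect an interval measured on a single node, here is a minimal Python sketch. It is not Charm++ code; the function names and the numbers (a 300 s offset, a run of about 445 s) are assumptions chosen only for illustration.

```python
def read_clock(true_time, offset):
    """Reading of a node's clock that is off by a constant `offset` seconds.

    Hypothetical model: real clocks can also drift, which is ignored here.
    """
    return true_time + offset


def elapsed(start_reading, end_reading):
    """Elapsed time computed from two clock readings."""
    return end_reading - start_reading


# Illustrative values only: the run starts at t=10 s and ends at t=455.5 s.
TRUE_START, TRUE_END = 10.0, 455.5
SKEW = 300.0  # assumed offset of the second node's clock, in seconds

# Both readings from node 0 (no offset): the interval is correct.
same_node_0 = elapsed(read_clock(TRUE_START, 0.0), read_clock(TRUE_END, 0.0))

# Both readings from the skewed node: the offset cancels out.
same_node_1 = elapsed(read_clock(TRUE_START, SKEW), read_clock(TRUE_END, SKEW))

# Start read on node 0, end read on the skewed node: the skew leaks in.
cross_node = elapsed(read_clock(TRUE_START, 0.0), read_clock(TRUE_END, SKEW))
```

Only the mixed case is distorted by the skew, which is why a finish time printed entirely from processor 0 should reflect real elapsed time on that node.
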
When you ran the program, did it really take 7 minutes of wall-clock
time? Also, have you tried the pingpong test in
charm/tests/charm++/pingpong to check the network latency?
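
If you want a quick check that is independent of Charm++, a round-trip test can be sketched in a few lines of Python. This is not the Charm++ pingpong program, only a generic TCP echo measurement; the function names are made up for this example, and the demo runs over the loopback interface, so to test the real network you would run the echo side on one node and point measure_round_trips at that node's host name and a port of your choice.

```python
import socket
import threading
import time


def echo_server(listener):
    """Accept one connection and echo back every byte received."""
    conn, _ = listener.accept()
    with conn:
        while True:
            data = conn.recv(64)
            if not data:  # peer closed the connection
                break
            conn.sendall(data)


def measure_round_trips(host, port, count=100):
    """Send `count` one-byte pings and return the round-trip times in seconds."""
    rtts = []
    with socket.create_connection((host, port)) as sock:
        # Disable Nagle's algorithm so small messages are sent immediately.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for _ in range(count):
            start = time.perf_counter()
            sock.sendall(b"x")
            if not sock.recv(64):
                raise ConnectionError("echo server closed the connection")
            rtts.append(time.perf_counter() - start)
    return rtts


def run_local_demo(count=100):
    """Run server and client in one process over the loopback interface."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    server = threading.Thread(target=echo_server, args=(listener,), daemon=True)
    server.start()
    rtts = measure_round_trips("127.0.0.1", port, count)
    server.join(timeout=5)
    listener.close()
    return rtts


if __name__ == "__main__":
    times = run_local_demo()
    print("min %.6f s, mean %.6f s, max %.6f s"
          % (min(times), sum(times) / len(times), max(times)))
```

Round trips between two nodes on a healthy local network are normally far below a second, so values in the range of seconds would point at the network rather than at the clocks.
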

Gengbin

Jan Saam wrote:

>I forgot to say that I checked already that the problem is not ssh
>taking forever to make a connection.
>This is at least proven by this simple test:
>time ssh BPU5 pwd
>/home/jan
>
>real 0m0.236s
>user 0m0.050s
>sys 0m0.000s
>
>Jan
>
>
>Jan Saam wrote:
>
>
>>Hi all,
>>
>>I'm experiencing some weird performance problems with NAMD or the
>>charm++ library on a Linux cluster:
>>When I'm using NAMD or a simple charm++ demo program on one node
>>everything is fine, but when I use more than one node each step takes
>>_very_ much longer!
>>
>>Example:
>>2s for the program queens on 1 node, 445s on 2 nodes!!!
>>
>>running
>>/home/jan/NAMD_2.6b1_Source/charm-5.9/mpi-linux-gcc/examples/charm++/queens/./pgm
>>on 1 LINUX ch_p4 processors
>>Created
>>/home/jan/NAMD_2.6b1_Source/charm-5.9/mpi-linux-gcc/examples/charm++/queens/PI28357
>>There are 14200 Solutions to 12 queens. Finish time=1.947209
>>End of program
>>[jan_at_BPU1 queens]$ mpirun -v -np 2 -machinefile ~/machines ./pgm 12 6
>>running
>>/home/jan/NAMD_2.6b1_Source/charm-5.9/mpi-linux-gcc/examples/charm++/queens/./pgm
>>on 2 LINUX ch_p4 processors
>>Created
>>/home/jan/NAMD_2.6b1_Source/charm-5.9/mpi-linux-gcc/examples/charm++/queens/PI28413
>>There are 14200 Solutions to 12 queens. Finish time=445.547998
>>End of program
>>
>>The same is true when I'm building the net-linux versions instead of
>>mpi-linux, thus the problem is probably independent of MPI.
>>
>>One thing I noticed is that there is a clock skew of several minutes
>>between the nodes. Could that be part of my problem (unfortunately I
>>don't have rights to simply synchronize the clocks)?
>>
>>Does anyone have an idea what the problem could be?
>>
>>Many thanks,
>>Jan
