From: Cesar Luis Avila (cavila_at_fbqf.unt.edu.ar)
Date: Thu Mar 08 2007 - 07:32:56 CST
What messages do you get when the run hangs?
paco ty wrote:
> Hi fellows,
> I need some advice from friends with cluster experience.
> I set up a 500000-step simulation of an 18000-atom system
> in order to test a 2-node cluster. Short simulations run
> perfectly with good scaling, but the 500000-step simulation stopped
> prematurely at step 135300. The non-TCP version stops at the first
> or second step.
> Another thing that worries me is that my expensive
> 3COM gigabit switch does not perform better than its small, cheap
> 10/100 sibling. According to the LED color on the switch, it
> seems that my network adapter is running at 1000 Mb/s. I don't know
> any other way to check the link rate.
> I run namd2 with "./charmrun namd2 +p2 configfile.txt > logfile.txt"
> Here is my machine configuration:
> OS: Scientific Linux 4.4 i386 (kernel 2.6.9-42.0.3-ELsmp)
> Software: NAMD2_2.6_Linux-i686-TCP (precompiled)
> rsh as root works with no password
> CPU: Intel(R) Pentium(R) 4 CPU 3.8 GHz
> RAM: 1 GB
> Network adapter: Linksys EG1032 (10/100/1000)
> I attach the first part of the logfile.
> Well, does anybody know the reason for such behaviour?
> Thank you in advance