Re: 50% system CPU usage when parallel running NAMD on Rocks cluster

From: Norman Geist (norman.geist_at_uni-greifswald.de)
Date: Mon Dec 16 2013 - 01:08:54 CST

Additionally, which MPI implementation are you using, or do you use Charm++?

Norman Geist.

From: ??? [mailto:malrot13_at_gmail.com]
Sent: Saturday, 14 December 2013 14:56
To: Norman Geist
Cc: Namd Mailing List
Subject: Re: namd-l: 50% system CPU usage when parallel running NAMD on
Rocks cluster

I have changed my switch from a 3Com Switch 2824 to an IP-Com G1024 (a low-end
gigabit switch borrowed from a reseller). To my surprise, performance neither
improved nor deteriorated. The benchmark results are the same as before,
within a reasonable error range. I guess it's not only an "old-switch"
problem.

Any help would be appreciated!

Neil

2013/12/10 Norman Geist <norman.geist_at_uni-greifswald.de>

Yeah, something like that. I guess a relative comparison is the best choice
in your case. That's why I gave you a reference model. Otherwise, ask
resellers for support.

Norman Geist.

From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] On behalf
of ???
Sent: Tuesday, 10 December 2013 13:52
To: Norman Geist
Cc: Namd Mailing List
Subject: Re: namd-l: 50% system CPU usage when parallel running NAMD on
Rocks cluster

Your suggestion is very helpful. We are looking for a switch that scales
better. But here comes another question: which specification represents the
"switching-latency" or "switching-capacity" of a switch? Is it "packet
forwarding speed" or something else? Thanks again!

Neil

2013/12/9 Norman Geist <norman.geist_at_uni-greifswald.de>

Maybe a little. There's a lot you can try on the software side, but all of
it only circumvents the real problem or lessens its impact. The most
convenient and most likely successful solution is to buy another switch. So
the keywords are switching-latency and switching-capacity. Take the model I
posted as a reference, but note that 16 cores per node is really heavy for
1 Gbit/s Ethernet, and you might want to consider spending some money on an
HPC network like InfiniBand or at least 10 Gbit/s Ethernet.
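
To put the "16 cores per node" remark into numbers, here is a back-of-the-envelope sketch in Python. It assumes all cores on a node communicate at the same time and share the link evenly, and it ignores protocol overhead and switching latency, so it is an upper bound for illustration only. The function name `per_core_share_mbit` is made up for this example.

```python
def per_core_share_mbit(link_gbit, cores_per_node):
    """Even split of one node's link bandwidth across its cores, in Mbit/s."""
    return link_gbit * 1000.0 / cores_per_node


# Compare the two Ethernet speeds mentioned in the thread for 16 cores per node.
for link_gbit in (1, 10):
    share = per_core_share_mbit(link_gbit, 16)
    print(f"{link_gbit:>2} Gbit/s link, 16 cores: {share:.1f} Mbit/s per core")
```

Under these assumptions each core gets 62.5 Mbit/s on 1 Gbit/s Ethernet and 625.0 Mbit/s on 10 Gbit/s Ethernet. Bandwidth is only part of the picture; the latency of the switch is not captured here.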

Norman Geist.

From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] On behalf
of ???
Sent: Sunday, 8 December 2013 16:12
To: Norman Geist
Cc: Namd Mailing List
Subject: Re: namd-l: 50% system CPU usage when parallel running NAMD on
Rocks cluster

Thanks for your reply! The 16 cores per node are physical; HT was disabled
before NAMD was tested. I'll consider buying a new switch.

BTW, will it scale better if I compile a UDP version of NAMD?

Neil Zhou

2013/12/3 Norman Geist <norman.geist_at_uni-greifswald.de>

Your switch is too slow in switching. Try something like the Netgear GS748T,
which is not that expensive and gives "ok" scaling. You can temporarily
improve the situation by trying the TCP congestion control algorithm
"highspeed". Set it via sysctl on all the nodes.
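
As a rough sketch of how one might check this on a node, the Python snippet below reads the Linux procfs entries under `/proc/sys/net/ipv4` for the active and the available TCP congestion control algorithms. The helper names (`read_algorithms`, `highspeed_status`) and the `proc_dir` parameter are illustrative assumptions, not part of NAMD or Rocks. The code only reports state and does not change any setting.

```python
from pathlib import Path

# Standard Linux procfs location for IPv4 TCP settings.
PROC_IPV4 = Path("/proc/sys/net/ipv4")


def read_algorithms(filename, proc_dir=PROC_IPV4):
    """Return the whitespace-separated algorithm names stored in a procfs file."""
    return (Path(proc_dir) / filename).read_text().split()


def highspeed_status(proc_dir=PROC_IPV4):
    """Report whether 'highspeed' is active and whether the kernel offers it."""
    current = read_algorithms("tcp_congestion_control", proc_dir)
    available = read_algorithms("tcp_available_congestion_control", proc_dir)
    return {
        "current": current[0] if current else None,
        "highspeed_available": "highspeed" in available,
        "highspeed_active": current[:1] == ["highspeed"],
    }
```

Calling `print(highspeed_status())` on each node shows the current state. If "highspeed" is not listed as available, the `tcp_highspeed` kernel module probably has to be loaded first. The switch itself is made as root through the `net.ipv4.tcp_congestion_control` sysctl key.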

Additionally, are these 16 cores per node physical or logical (HT)? If they
are HT, leave them out: no speed gain, only more network load.
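
One way to answer the physical-versus-logical question on a Linux node is to compare the number of logical processors with the number of distinct (package, core) pairs in `/proc/cpuinfo`. The following is a minimal sketch that assumes the usual x86 layout of that file, with `processor`, `physical id` and `core id` fields. The function names are made up for this example.

```python
def count_cores(cpuinfo_text):
    """Return (logical_cpus, physical_cores) parsed from /proc/cpuinfo text."""
    logical = 0
    physical = set()
    package = None
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "processor":
            logical += 1
        elif key == "physical id":
            package = value
        elif key == "core id":
            # A physical core is identified by its (package, core id) pair.
            physical.add((package, value))
    return logical, len(physical)


def hyperthreading_active(cpuinfo_text):
    """True if there are more logical CPUs than physical cores."""
    logical, physical = count_cores(cpuinfo_text)
    return physical > 0 and logical > physical
```

On a node this could be used as `hyperthreading_active(open("/proc/cpuinfo").read())`. If it returns True, the number of NAMD processes per node should be limited to the physical core count.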

Norman Geist.


This archive was generated by hypermail 2.1.6 : Tue Dec 31 2013 - 23:24:06 CST