Re: 50% system CPU usage when parallel running NAMD on Rocks cluster

From: Norman Geist (norman.geist_at_uni-greifswald.de)
Date: Tue Dec 10 2013 - 08:19:45 CST

Yeah, something like that. I guess a relative comparison is the best choice
in your case. That's why I gave you a reference model. Otherwise, look for
some support from resellers.

Norman Geist.

From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] On behalf
of ???
Sent: Tuesday, December 10, 2013 13:52
To: Norman Geist
Cc: Namd Mailing List
Subject: Re: namd-l: 50% system CPU usage when parallel running NAMD on
Rocks cluster

Your suggestion is very helpful. We are looking for a switch that scales
better. But that raises another question: which specification represents the
"switching-latency" or "switching-capacity" of a switch? Is it the "packet
forwarding speed" or something else? Thanks again!

Neil

2013/12/9 Norman Geist <norman.geist_at_uni-greifswald.de>

Maybe a little. There's a lot you can try on the software side of the
problem, but all of it will only circumvent the real problem or lessen
its impact. The most convenient and most likely successful solution is buying
another switch. So the keywords are switching-latency and
switching-capacity. Take the model I posted as a reference, but note that
16 cores per node is really heavy for 1 Gbit/s Ethernet, and you might want to
consider spending some money on an HPC network like InfiniBand or at least
10 Gbit/s Ethernet.
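
A rough way to see why 16 cores per node is heavy for 1 Gbit/s Ethernet is to divide the nominal link rate evenly among the cores. The sketch below does only that arithmetic. The function name is illustrative, and the even split is an assumption: it ignores switching latency and protocol overhead, which matter at least as much as raw bandwidth in this thread.

```python
def per_core_share_mbit(link_gbit: float, cores_per_node: int) -> float:
    """Nominal link bandwidth per core in Mbit/s, assuming an even split."""
    if cores_per_node <= 0:
        raise ValueError("cores_per_node must be positive")
    return link_gbit * 1000.0 / cores_per_node


if __name__ == "__main__":
    # Link rates mentioned in the thread; 16 cores per node as reported.
    for label, gbit in (("1 Gbit/s Ethernet", 1.0), ("10 Gbit/s Ethernet", 10.0)):
        share = per_core_share_mbit(gbit, 16)
        print(f"{label}: {share:.1f} Mbit/s per core")
```

With 16 cores sharing one 1 Gbit/s link, each core gets a nominal 62.5 Mbit/s, before any overhead is counted.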

Norman Geist.

From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] On behalf
of ???
Sent: Sunday, December 8, 2013 16:12
To: Norman Geist
Cc: Namd Mailing List
Subject: Re: namd-l: 50% system CPU usage when parallel running NAMD on
Rocks cluster

Thanks for your reply! The 16 cores per node are physical; HT was disabled
before NAMD was tested. I'll consider buying a new switch.

BTW, will it scale better if I compile a UDP version of NAMD?

Neil Zhou

2013/12/3 Norman Geist <norman.geist_at_uni-greifswald.de>

Your switch is too slow in switching. Try something like the Netgear GS748T,
not that expensive and "ok" scaling. You can temporarily improve the
situation by trying the TCP congestion control algorithm "highspeed". Set
it via sysctl on all the nodes.
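
On Linux, the active TCP congestion control algorithm is exposed as the sysctl key net.ipv4.tcp_congestion_control, readable under /proc/sys/net/ipv4. Before changing it on every node, it may help to check what each node currently uses and whether "highspeed" is available there. The following is a minimal read-only sketch; the function names are illustrative and not part of NAMD or Rocks.

```python
from pathlib import Path
from typing import List

# Standard location of the IPv4 sysctl entries on Linux.
PROC_NET = Path("/proc/sys/net/ipv4")


def current_algorithm(base: Path = PROC_NET) -> str:
    """Return the congestion control algorithm currently in use."""
    return (base / "tcp_congestion_control").read_text().strip()


def available_algorithms(base: Path = PROC_NET) -> List[str]:
    """Return the algorithms the running kernel currently offers."""
    return (base / "tcp_available_congestion_control").read_text().split()


def report(wanted: str = "highspeed", base: Path = PROC_NET) -> str:
    """Summarize whether the wanted algorithm is active or available."""
    current = current_algorithm(base)
    if current == wanted:
        return f"{wanted} is already active"
    if wanted in available_algorithms(base):
        return f"current is {current}; {wanted} is available"
    return f"current is {current}; {wanted} is not available (module not loaded?)"


if __name__ == "__main__":
    try:
        print(report())
    except OSError as exc:
        print(f"could not read sysctl entries: {exc}")
```

Switching the algorithm itself needs root, for example `sysctl -w net.ipv4.tcp_congestion_control=highspeed`. If "highspeed" is not listed as available, loading the `tcp_highspeed` kernel module first is usually required.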

Additionally, are these 16 cores per node physical or logical (HT)? If they
are HT, leave them out: no speed gain, only more network load.
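
One way to answer the physical-versus-logical question on a Linux node is to compare the number of "processor" entries in /proc/cpuinfo with the number of distinct ("physical id", "core id") pairs. The sketch below assumes the x86 layout of /proc/cpuinfo, where those fields are present; the function name is illustrative.

```python
from typing import Tuple


def count_cores(cpuinfo_text: str) -> Tuple[int, int]:
    """Return (logical_cpus, physical_cores) parsed from /proc/cpuinfo text."""
    logical = 0
    physical = set()
    # Each logical CPU is described by one block; blocks are separated
    # by blank lines.
    for block in cpuinfo_text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if "processor" not in fields:
            continue
        logical += 1
        if "physical id" in fields and "core id" in fields:
            # Hyper-threads on the same core share this pair.
            physical.add((fields["physical id"], fields["core id"]))
    return logical, len(physical)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            logical, cores = count_cores(handle.read())
        print(f"logical CPUs: {logical}, physical cores: {cores}")
        if cores and logical > cores:
            print("more logical CPUs than cores: HT appears to be enabled")
    except OSError as exc:
        print(f"could not read /proc/cpuinfo: {exc}")
```

If the two numbers match, the cores seen by the scheduler are physical, as was later confirmed for this cluster.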

Norman Geist.

This archive was generated by hypermail 2.1.6 : Tue Dec 31 2013 - 23:24:04 CST