Re: 50% system CPU usage when parallel running NAMD on Rocks cluster

From: Norman Geist (norman.geist_at_uni-greifswald.de)
Date: Mon Dec 09 2013 - 01:19:28 CST

Maybe a little. There's a lot you can try on the software side of the
problem, but all of it only works around the real problem or lessens its
impact. The most convenient and most likely successful solution is to buy
another switch. The keywords are switching latency and switching capacity.
Take the model I posted as a reference, but note that 16 cores per node is
really heavy for 1 Gbit/s Ethernet, and you might want to consider spending
some money on an HPC network like InfiniBand or at least 10 Gbit/s Ethernet.
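
To put a rough number on "really heavy": the sketch below divides the nominal
link rate evenly among the cores of one node. It is only back-of-the-envelope
arithmetic, assuming all 16 cores communicate at the same time and ignoring
protocol overhead and switch latency; the function name is made up for
illustration.

```python
def per_core_share_mbit(link_gbit, cores_per_node):
    """Average nominal bandwidth per core in Mbit/s when every core on the
    node sends at once (idealized: no protocol overhead, no latency)."""
    return link_gbit * 1000.0 / cores_per_node


for label, gbit in [("1 Gbit/s Ethernet", 1.0), ("10 Gbit/s Ethernet", 10.0)]:
    share = per_core_share_mbit(gbit, 16)
    print(f"{label}: {share:.1f} Mbit/s per core")
# 1 Gbit/s Ethernet: 62.5 Mbit/s per core
# 10 Gbit/s Ethernet: 625.0 Mbit/s per core
```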

Norman Geist.

From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] On Behalf
Of ???
Sent: Sunday, 8 December 2013 16:12
To: Norman Geist
Cc: Namd Mailing List
Subject: Re: namd-l: 50% system CPU usage when parallel running NAMD on
Rocks cluster

Thanks for your reply! The 16 cores per node are physical; HT was disabled
before NAMD was tested. I'll consider buying a new switch.

BTW, will it scale better if I compile a UDP version of NAMD?

Neil Zhou

2013/12/3 Norman Geist <norman.geist_at_uni-greifswald.de>

Your switch is too slow at switching. Try something like the Netgear GS748T;
it is not that expensive and gives "ok" scaling. You can temporarily improve
the situation by trying the TCP congestion control algorithm "highspeed". Set
it via sysctl on all the nodes.
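
As a rough sketch of what that involves on a Linux node: the kernel lists the
loaded algorithms in /proc/sys/net/ipv4/tcp_available_congestion_control, and
the active one is the sysctl key net.ipv4.tcp_congestion_control. The helper
names below are made up for illustration, and the script only prints the
command to run as root; it changes nothing itself.

```python
from pathlib import Path

# Standard Linux procfs location of the IPv4 TCP settings.
PROC_IPV4 = Path("/proc/sys/net/ipv4")


def read_available():
    """Return the kernel's list of loaded congestion control algorithms,
    or an empty string when the file does not exist (non-Linux host)."""
    path = PROC_IPV4 / "tcp_available_congestion_control"
    return path.read_text() if path.exists() else ""


def suggest_command(available_text, wanted="highspeed"):
    """Suggest the next command to run as root on a node.

    If the algorithm is already loaded, switch to it with sysctl;
    otherwise load its kernel module (tcp_<name>) first.
    """
    if wanted in available_text.split():
        return f"sysctl -w net.ipv4.tcp_congestion_control={wanted}"
    return f"modprobe tcp_{wanted}"


if __name__ == "__main__":
    # Hypothetical usage: run on each node and execute the printed command.
    print(suggest_command(read_available()))
```

A setting made with sysctl -w is lost on reboot; to keep it, the line
net.ipv4.tcp_congestion_control = highspeed would go into /etc/sysctl.conf on
every node.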

Additionally, are these 16 cores per node physical or logical (HT)? If they
are HT, leave them out: no speed gain, only more network load.

Norman Geist.


This archive was generated by hypermail 2.1.6 : Wed Dec 31 2014 - 23:21:58 CST