Re: namd scale-up

From: Norman Geist
Date: Thu Sep 05 2013 - 01:32:45 CDT

Hi Revthi,


you should also have mentioned whether you use a NAMD build compiled against
charm++ or MPI. If charm++, try adding "+idlepoll" to the namd2 command; it
should additionally improve scaling, sometimes twofold. Furthermore, if you
have hyperthreading or Magny-Cours CPUs, try to use only half of the cores
reported per node and bind the processes to real physical cores only. You can
use "/proc/cpuinfo" to determine that: "processor" entries with the same
"physical id" and "core id" are usually the same physical core. Use only one
entry of each such group, as the others share that core's resources and are
bottlenecked by memory or the FPU. Using "taskset" on the namd2 command, you
can easily control which cores are allowed.




charmrun +p 64 ++nodelist nodelist taskset -c 0,2,4,6 namd2 +idlepoll
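
To find the core numbers to pass to "taskset -c", the "/proc/cpuinfo" check described above could be sketched roughly like this in Python. This is only an illustration: the function names are made up, and it assumes the usual Linux layout where each "processor" entry is separated by a blank line and carries "physical id" and "core id" fields. Entries without those fields are simply treated as separate cores.

```python
import re


def physical_core_cpus(cpuinfo_text):
    """Return one logical CPU number per (physical id, core id) pair."""
    first_cpu = {}
    # Each logical CPU is described by one blank-line-separated block.
    for block in re.split(r"\n\s*\n", cpuinfo_text.strip()):
        fields = {}
        for line in block.splitlines():
            if ":" in line:
                key, value = line.split(":", 1)
                fields[key.strip()] = value.strip()
        if "processor" not in fields:
            continue
        cpu = int(fields["processor"])
        core = (fields.get("physical id"), fields.get("core id"))
        if None in core:
            # Assumption: no topology info means no sibling threads to skip.
            core = ("logical", cpu)
        # Keep only the first logical CPU seen for each physical core.
        first_cpu.setdefault(core, cpu)
    return sorted(first_cpu.values())


def taskset_cpu_list(cpuinfo_text):
    """Format the selected CPUs as a comma-separated list for 'taskset -c'."""
    return ",".join(str(cpu) for cpu in physical_core_cpus(cpuinfo_text))
```

On a compute node, taskset_cpu_list(open("/proc/cpuinfo").read()) would give the list to use in place of "0,2,4,6" in the command above. The numbering differs between machines, so check it on the actual nodes.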


If you do not have virtual cores, forget about the above for now, but keep it
in mind for the future, as it has a large impact.


Additionally, it is easy to judge how good the scaling is if you compare
the speedup to the ideal linear case. Simply divide the time/step on 1 node
by the time/step on n nodes. This number will usually be <= n. The closer it
is to n, the better. Run some benchmarks while increasing the number of
nodes, and keep in mind that there can be a point where adding nodes no
longer helps and the time/step starts rising again. But you do not seem to
have hit that point yet.
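
The comparison to the ideal linear case can be written down in a few lines of Python. This is just a sketch with made-up function names; the input is a dictionary of measured time/step values per node count, which you would take from your own benchmark runs.

```python
def scaling_table(sec_per_step):
    """Return (nodes, speedup, efficiency) rows from {nodes: seconds/step}.

    speedup    = time/step on 1 node divided by time/step on n nodes
    efficiency = speedup / n, which is 1.0 for ideal linear scaling
    """
    t1 = sec_per_step[1]  # the single-node run is the reference
    rows = []
    for nodes in sorted(sec_per_step):
        speedup = t1 / sec_per_step[nodes]
        rows.append((nodes, speedup, speedup / nodes))
    return rows


def fastest_node_count(sec_per_step):
    """Node count with the lowest time/step; beyond it the time rises again."""
    return min(sec_per_step, key=sec_per_step.get)


if __name__ == "__main__":
    # Hypothetical timings for illustration only, not measured data.
    timings = {1: 0.8, 2: 0.4, 4: 0.25, 8: 0.16, 16: 0.2}
    for nodes, speedup, efficiency in scaling_table(timings):
        print(f"{nodes:3d} nodes  speedup {speedup:5.2f}  efficiency {efficiency:4.2f}")
    print("fastest run at", fastest_node_count(timings), "nodes")
```

An efficiency that stays close to 1.0 means the extra nodes are well used. Once it drops sharply, or the fastest run is no longer the largest one, more nodes are wasted.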


So far, I think there is a little more to squeeze out for a 300K-atom system
doing about 2.5 ns/day.


Good luck


Norman Geist.


From: [] On behalf of
Axel Kohlmeyer
Sent: Wednesday, 4 September 2013 10:01
To: Revthi Sanker
Subject: Re: namd-l: namd scale-up




On Wed, Sep 4, 2013 at 9:43 AM, Revthi Sanker <>

Dear all,

I am running NAMD on the super cluster at my institute. My system consists
of 3 L atoms roughly.


please keep in mind that most people on this mailing list (and in the world
in general) do not know what a lakh is, so it is better to talk about
300,000 atoms instead. what would you think if somebody talked to you about
a system with 2000 gross atoms?


I am aware that the scale-up depends on the configuration of the cluster I
am currently using. But the people at the computer center would like to get
a rough estimate of the benchmark (ns/day) for a system of my size.
Anybody who is aware of the yield for this system size, please let me know,
as I am not sure if what I am getting currently (2.5 ns/day for 8 nodes * 16
processors = 128) is optimum or can be tweaked further.


the only way to find the optimum is to do a (strong) scaling
benchmark, i.e. use a different number of nodes and plot the resulting
speedup. the performance depends not only on the hardware (CPU
(type, generation, clock rate), memory bandwidth, interconnect, BIOS
configuration (e.g. hyper-threading, turbo boost)), but also on software
(kernel, NAMD version, compiler, configuration (SMP, MPI, ibverbs)) and your
system and input. so there is no way to tell from the number of atoms in the
system and the number of nodes/cores whether you have a good performance or
a bad performance.


you can compare your numbers (absolute per cpu core performance and speedup)
to other published data from other machines (even if much older). there
should be some on the NAMD home page and in the NAMD wiki.
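
To put numbers on a common footing before comparing them with published data, a small Python helper like the following may be useful. It is only a sketch: the function names are made up, and the 2 fs timestep in the example is an assumption, since the timestep was not stated in this thread.

```python
def per_core_ns_per_day(ns_per_day, cores):
    """Absolute per-core performance: ns/day divided by the cores used."""
    return ns_per_day / cores


def sec_per_step(ns_per_day, timestep_fs):
    """Convert ns/day to wall-clock seconds per MD step.

    1 ns = 1e6 fs, so steps per day = ns_per_day * 1e6 / timestep_fs.
    """
    steps_per_day = ns_per_day * 1e6 / timestep_fs
    return 86400.0 / steps_per_day


if __name__ == "__main__":
    # 2.5 ns/day on 8 nodes * 16 processors = 128 cores, as reported above.
    print(per_core_ns_per_day(2.5, 128), "ns/day per core")
    # The 2 fs timestep here is assumed, not taken from the thread.
    print(sec_per_step(2.5, 2.0), "s/step")
```

Per-core numbers only compare the hardware and build fairly when the system size and the simulation settings are similar.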





Thank you so much for your time :)

M.S. Research Scholar
Indian Institute Of Technology, Madras



Dr. Axel Kohlmeyer
International Centre for Theoretical Physics, Trieste. Italy. 

This archive was generated by hypermail 2.1.6 : Wed Dec 31 2014 - 23:21:36 CST