From: Thomas C. Bishop (bishop_at_latech.edu)
Date: Tue Apr 28 2015 - 15:53:42 CDT
> Hello Tom
> Interesting data. I have never been able to make namd2.10 run any faster than namd2.9 on a non-cuda cluster at LLNL.

I thought namd2.10 did better for me, but that may be system specific. I'm using Amber prm/tops and have protein/DNA plus water and ions in my system.

> I am a little confused by the nomenclature in the text below the graph. You mention having 20 cores/node.

Two 10-core 2.8 GHz E5-2680v2 Xeon processors per node (or 20 cores/node).

> After that you wrote that you left one core per CPU inactive. Do you mean 1 core per node? Is there one process per core?

Yes, I leave 1 core on each CPU inactive, so 2 cores per node.

> Finally, can you explain what 432 SU/ns means.

This is the number of Service Units (SU = CPU hours) I must request on this shared resource to run 1 ns of my simulation.

> Thanks again for posting the data.
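
For anyone wanting to reproduce an SU/ns figure from their own benchmark, here is a minimal sketch. It assumes the resource charges one SU per core-hour on every allocated core (idle ones included); the function name and the numbers are made up for illustration and are not taken from the posted data.

```python
def su_per_ns(cores_charged: int, ns_per_day: float) -> float:
    """Core-hours (SU) that must be requested to simulate 1 ns.

    Assumes 1 SU = 1 core-hour and that all allocated cores are charged.
    """
    if ns_per_day <= 0:
        raise ValueError("ns_per_day must be positive")
    # 24 wall-clock hours buy ns_per_day nanoseconds on cores_charged cores.
    return cores_charged * 24.0 / ns_per_day


# Hypothetical example: 100 charged cores delivering 5 ns/day.
print(su_per_ns(cores_charged=100, ns_per_day=5.0))
```

If the scheduler bills whole nodes, cores_charged should be the full node core count times the number of nodes, even when some cores are left inactive.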
From: email@example.com [firstname.lastname@example.org] on behalf of Thomas C. Bishop [email@example.com]
Sent: Tuesday, April 28, 2015 12:32 PM
Subject: Re: namd-l: performance question
The extra options were admittedly just there as I tried to tweak things and convince myself namd was really doing what I thought it was and performing optimally.
I always conduct a set of benchmark runs of my particular system before starting my studies. This is something Jim Phillips suggested long ago, and it is worth doing even when you are only considering a potential cluster purchase.
CUDA does in fact give me significant speedups, but as noted in my original message the CPU and GPU utilization seem rather low, consistently < 30%.
I thought I'd post this pdf and get some feedback or discussion about how best to decide
"have I tweaked the performance enough?"
Of course it's hard to compare MY simulations to YOURs but hopefully others can offer general comments that may be of use to all namd users.
For my production runs I get more simulation for my allotted SU when using ~180 cores.
This splits the difference between efficiency and throughput, but I don't have a real metric or objective criterion for making this decision.
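
One possible way to turn that trade-off into an explicit rule, offered only as a sketch: among the benchmarked core counts, take the one that costs the fewest SU while still finishing a target amount of simulation within a wall-clock deadline. The function name, the benchmark table, and the deadline are all hypothetical, and the rule assumes 1 SU = 1 core-hour.

```python
def pick_core_count(benchmarks, target_ns, max_days):
    """Cheapest core count (in SU) that finishes target_ns within max_days.

    benchmarks: dict mapping core count -> measured ns/day.
    Returns (cores, total_su, wall_days), or None if nothing meets the deadline.
    """
    best = None
    for cores, ns_per_day in benchmarks.items():
        wall_days = target_ns / ns_per_day
        if wall_days > max_days:
            continue  # too slow for the deadline
        total_su = cores * 24.0 * wall_days  # core-hours consumed
        if best is None or total_su < best[1]:
            best = (cores, total_su, wall_days)
    return best


# Hypothetical benchmark table: core count -> ns/day.
bench = {60: 4.0, 120: 7.0, 180: 9.5, 240: 11.0}
print(pick_core_count(bench, target_ns=100.0, max_days=12.0))
```

Other criteria are just as defensible (for example, a minimum parallel efficiency); the point is only that the decision can be written down and applied to the benchmark numbers.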
On 04/28/2015 02:51 AM, Norman Geist wrote:
So you should benchmark for various node counts and look at the speedup relative to one node.
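
A small sketch of that comparison, assuming you have collected a seconds-per-step figure for each node count (the timings below are invented). Speedup is taken relative to the one-node run, and efficiency is speedup divided by node count.

```python
def scaling_table(sec_per_step):
    """Return [(nodes, speedup, efficiency)] relative to the 1-node run.

    sec_per_step: dict mapping node count -> seconds per step; must contain key 1.
    """
    t1 = sec_per_step[1]
    rows = []
    for nodes in sorted(sec_per_step):
        speedup = t1 / sec_per_step[nodes]
        rows.append((nodes, speedup, speedup / nodes))
    return rows


# Hypothetical timings, not measured data.
timings = {1: 0.080, 2: 0.042, 4: 0.024, 8: 0.016}
for nodes, speedup, eff in scaling_table(timings):
    print(f"{nodes:3d} nodes  speedup {speedup:5.2f}  efficiency {eff:5.2f}")
```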
I’ll give some hints on what to try:
(forget about +ppn +pemap +commap for now)
1. Do not pass +devices; pass +ignoresharing instead.
2. Try adding “twoawayx yes” to your NAMD script.
   If that improves performance, try adding twoawayy as well.
   If that improves it again, try adding twoawayz.
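
To benchmark those variants side by side, something like the following sketch could generate the config texts. It assumes the twoAway options can simply be prepended to an existing config (so they are set before any run command); the function name and variant labels are hypothetical.

```python
def variant_texts(base_text):
    """Return {label: config text} with twoAwayX/Y/Z enabled cumulatively.

    The options are prepended so they take effect before any 'run' command
    in the base config (an assumption about how the base file is laid out).
    """
    variants = {"baseline": base_text}
    enabled = []
    for axis in "XYZ":
        enabled.append(f"twoAway{axis} yes")
        label = "twoaway_" + "xyz"[: len(enabled)]
        variants[label] = "\n".join(enabled) + "\n" + base_text
    return variants


# Hypothetical one-line base config, just to show the output shape.
for label, text in variant_texts("run 1000\n").items():
    print(f"--- {label} ---")
    print(text, end="")
```

Each text can then be written to its own file and run at the same node count, keeping the next option only if the previous one helped.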
This archive was generated by hypermail 2.1.6 : Tue Dec 27 2016 - 23:21:06 CST