Re: Errata to: Update to CUDA error in NAMD 2.7: Increase MAX_EXCLUSIONS: problem persists and CPU-only MD scales poorly

From: Axel Kohlmeyer (akohlmey_at_gmail.com)
Date: Tue Dec 28 2010 - 03:46:37 CST

hi pietro,

On Mon, Dec 27, 2010 at 3:20 PM, Pietro Amodeo <pamodeo_at_icmib.na.cnr.it> wrote:
> Hi,
>
> sorry but the table in my last post is wrong:
> 1) obviously, the reported ratio is Time(1)/Time(N) and NOT
> Time(N)/Time(1)!!!!
> 2) the correct figures are:
>  N   Time(1)/Time(N)
>  1    1
>  2    1.9733511924
>  4    3.5960034869
>  6    5.1641581203
>  8    6.5367137981
> 10    8.0500773076
> 12    9.1171710303
>
> 16    8.8086727989
>
> 20    9.6037249284
> 22   10.103089676
> 24   10.6848376171

those timings are fairly good.
i don't know what you are complaining about.

you really have only 12 physical CPU cores on
your machine and about 10-15% extra speed from
hyper-threading is quite typical for this kind of setup.
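
to put numbers on this, here is a minimal python sketch that turns the
corrected speedups quoted above into parallel efficiency (speedup / N) and
into the extra speed at 24 processes relative to 12. the function names are
made up for illustration, and since no processor affinity was set, the
12-process run was not necessarily spread over 12 distinct physical cores,
so the hyper-threading figure is only a rough estimate.

```python
# speedups Time(1)/Time(N), copied from the corrected table quoted above
speedup = {1: 1.0, 2: 1.9733511924, 4: 3.5960034869, 6: 5.1641581203,
           8: 6.5367137981, 10: 8.0500773076, 12: 9.1171710303,
           16: 8.8086727989, 20: 9.6037249284, 22: 10.103089676,
           24: 10.6848376171}

# 2 x six-core Xeon 5650; the remaining 12 "cores" are hyper-threads
PHYSICAL_CORES = 12


def efficiency(n):
    """Parallel efficiency: speedup divided by the number of processes."""
    return speedup[n] / n


def hyperthreading_gain():
    """Relative extra speed of 24 processes over PHYSICAL_CORES processes."""
    return speedup[24] / speedup[PHYSICAL_CORES] - 1.0


if __name__ == "__main__":
    for n in sorted(speedup):
        print(f"{n:2d}  speedup {speedup[n]:6.2f}  efficiency {efficiency(n):6.1%}")
    print(f"gain from hyper-threading: {hyperthreading_gain():.1%}")
```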

the fact that you don't get perfect scaling is easily
explained by two things: memory bandwidth contention
overall, and the lack of processor affinity, which makes the
contention worse.
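
as an illustration of what processor affinity means at the OS level, here is
a small python sketch that pins the calling process to one logical CPU with
os.sched_setaffinity (available on Linux only). this is not how NAMD or
charm++ handle affinity; pin_to_cpu is a hypothetical helper, and for a real
run one would rely on whatever affinity support the NAMD build or the job
launcher offers.

```python
import os


def pin_to_cpu(cpu):
    """Pin the calling process to one logical CPU and return the new mask.

    Returns None on platforms without os.sched_setaffinity (non-Linux).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    allowed = os.sched_getaffinity(0)  # pid 0 means "this process"
    if cpu not in allowed:
        raise ValueError(f"cpu {cpu} is not in the allowed set {sorted(allowed)}")
    os.sched_setaffinity(0, {cpu})
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    if hasattr(os, "sched_getaffinity"):
        original = os.sched_getaffinity(0)
        print("allowed before:", sorted(original))
        print("pinned to:", sorted(pin_to_cpu(min(original))))
        os.sched_setaffinity(0, original)  # restore the original mask
    else:
        print("CPU affinity calls are not available on this platform")
```

a pinned process keeps running on the same core, so its working set is not
dragged from one core's cache to another's by the scheduler.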

memory contention gets worse the larger the system is,
as that makes the CPU caches less effective. the topology
and size of the caches also have an impact on performance
and scaling.

as for your CUDA version problem: that looks like a compile-time
issue. you'll have to examine the source code and see
if you can adjust the mentioned parameter (MAX_EXCLUSIONS). on GPUs
the memory (and cache) architecture is different from CPUs, and
sometimes one has to choose what works well for most
typical cases and require a recompilation with changed
parameters for the rest. due to continued improvements in the CUDA
programming interface and the CUDA drivers, this situation
will improve in the future (e.g. with JIT compilation and selection
of kernels suitable for specific needs).

cheers,
     axel.

>
> Pietro
>
> On Mon, December 27, 2010, 3:01 pm, Pietro Amodeo wrote:
>> Hello,
>>
>> one month ago I posted a message (title: CUDA error in NAMD 2.7: Increase
>> MAX_EXCLUSIONS) about a systematic error I obtain when trying to simulate
>> a relatively large (120978 atoms) system on a 2-CPU (six-core Xeon 5650)
>> workstation (seen as a 24-core machine) equipped with two CUDA boards.
>> For details about the errors, including further HW info and a typical output
>> from a failed run, please see my previous post.
>>
>> In the meantime, I've performed further tests, changing the simulation
>> conditions more radically (e.g. by switching to NVT runs), but the number of
>> exclusions and the error message didn't change. So, I'd like to know if
>> this limitation can be circumvented or the system is just too large (or
>> its composition generates this pathological behaviour).
>>
>> By switching to pure CPU simulations, jobs run flawlessly but, when
>> submitting test simulations of 10,000 MD steps each on a variable number
>> of cores, the results obtained using "new load balancers -- ASB" show a
>> scaling that, considering that simulations were run on a single machine,
>> is far from ideal and, in any case, worse than that observed with NAMD
>> 2.6 on a 112-core cluster (with dual-Opteron 8-core nodes and InfiniBand
>> connection). Unfortunately, although the simulation setup was very
>> similar, the systems tested on the cluster were different (smaller) and
>> presently I can't align the two sets of benchmarks.
>>
>> Here is a table of the relative scalings vs. the number of employed cores,
>> obtained using NAMD 2.7 on the 24-core workstation:
>>
>> N  Time(N)/Time(1)
>>  1  1
>>  2  1.8950753798
>>  4  3.4533628378
>>  6  4.9593143628
>>  8  6.277425646
>> 10  7.7307594158
>> 12  8.7555253315
>>
>> 16  8.4592641261
>>
>> 20  9.2227793696
>> 22  9.7023360965
>> 24  10.261008169
>>
>> Comments/suggestions about both the errors for the GPU, and the scaling
>> for the CPU versions of NAMD 2.7 are welcome.
>> Again, I can provide any other information or run tests that may be
>> useful for the resolution of the problem.
>>
>> Thanks in advance,
>> Pietro
>>
>>
>
>
> --
> Dr. Pietro Amodeo
> Istituto di Chimica Biomolecolare del CNR
> Comprensorio "A. Olivetti", Edificio 70
> Via Campi Flegrei 34
> I-80078 Pozzuoli (Napoli) - Italy
> Phone      +39-0818675072
> Fax        +39-0818041770
> Email    pamodeo_at_icmib.na.cnr.it
>
>

-- 
Dr. Axel Kohlmeyer
akohlmey_at_gmail.com  http://goo.gl/1wk0
Institute for Computational Molecular Science
Temple University, Philadelphia PA, USA.
