Re: Always 24-way SMP?

From: Andrew Pearson (andrew.j.pearson_at_gmail.com)
Date: Mon Feb 25 2013 - 11:50:13 CST

Hello again, Norman

Yes, this was exactly the problem. I disabled hyperthreading on a compute
node and performed my scaling test again, and this time the results were
as expected: the speedup now keeps climbing past 12 processes, reaching
12.3x for a 16-core run on a single 16-core node. Thank you for your advice
and for pointing out this problem -- it would have affected many of our
users, not just NAMD users.
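
In case it helps others: a generic way to check this on a Linux node
(standard util-linux and procfs commands, nothing specific to our cluster)
is:

    # "Thread(s) per core: 2" means hyperthreading is enabled
    lscpu | grep -E 'Thread\(s\) per core|Core\(s\) per socket|Socket\(s\)'

    # physical cores per package vs. total logical CPUs
    grep -m1 'cpu cores' /proc/cpuinfo
    grep -c '^processor' /proc/cpuinfo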

Andrew

On Mon, Feb 25, 2013 at 10:06 AM, Norman Geist <
norman.geist_at_uni-greifswald.de> wrote:

> Andrew,
>
>
> what kind of CPU are you using in this node? What you describe reminds me
> of hyper-threading. Could it be that your machine has only 12 physical
> cores, and the rest are the hyper-threading "logical" cores? If so, it's no
> wonder that NAMD can't get any benefit out of the virtual cores (really
> just a second instruction stream per physical core), which are meant to
> fill gaps in the CPU's schedule during multitasking, since tasks also incur
> wait times, for example on disk I/O. Since NAMD is highly optimized code
> and leaves few such gaps, a maximum speedup of 12 is reasonable.
>
> So I think you have two six-core CPUs in your node. Please let us know
> this first.
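>
> One quick way to check: the kernel reports which logical CPUs share each
> physical core, so two IDs per entry mean hyper-threading is on. For
> example:
>
>     cat /sys/devices/system/cpu/cpu0/topology/thread_siblings_list
>     # output like "0,12" means logical CPUs 0 and 12 share one core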
>
>
> Furthermore, I have never observed problems with the precompiled NAMD
> builds, and most reports I have read about them concerned InfiniBand and
> OFED issues. Also, those problems were about getting NAMD to start
> successfully, not about bad parallel scaling.
>
>
> Norman Geist.
>
>
> *From:* Andrew Pearson [mailto:andrew.j.pearson_at_gmail.com]
> *Sent:* Monday, February 25, 2013 13:28
> *To:* Norman Geist
> *Cc:* Namd Mailing List
> *Subject:* Re: namd-l: Always 24-way SMP?
>
>
> Hi Norman
>
>
> Thanks for the response. I didn't phrase my question well -- I know I'm
> experiencing scaling problems; what I'm trying to determine is whether the
> precompiled NAMD binaries are known to cause them. I ask because many
> people seem to say that you should compile NAMD yourself to save
> headaches.
>
>
> Your explanation about charm++ displaying information about the number of
> cores makes sense. I'll bet that's what's happening.
>
>
> My scaling problem is that for a given system (27 patches, 50,000 atoms) I
> get perfect speedup up to nprocs = 12, and then the speedup curve goes
> almost flat. This occurs for runs performed on a single 16-core node.
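>
> A quick way to tabulate this from the logs (hypothetical file names
> run_1.log ... run_16.log; the awk fields assume the standard "Info:
> Benchmark time: N CPUs X s/step ..." line):
>
>     grep -H 'Benchmark time' run_*.log | awk '{print $4, "cores:", $6, "s/step"}'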
>
>
> Andrew
>
>
> On Monday, February 25, 2013, Norman Geist wrote:
>
> Hi Andrew,
>
>
> it's a bad idea to ask someone else whether you have scaling problems; you
> should know whether you do or not. The information in the output file just
> comes from the charm++ startup and is simply a description of the
> underlying hardware. It doesn't mean NAMD runs in SMP mode; it just tells
> you it's a multiprocessor/multicore node. Watch the output carefully and
> you will see, IMHO, that it uses the right number of CPUs (for example in
> the Benchmark lines). So what kind of scaling problem do you have? Don't
> you get the expected speedup?
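>
> For example, with a hypothetical log file run.log:
>
>     grep -m1 'Running on' run.log     # the charm++ hardware line
>     grep 'Benchmark time' run.log     # the CPU count NAMD actually uses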
>
>
> Norman Geist.
>
>
> *From:* owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] *On
> Behalf Of* Andrew Pearson
> *Sent:* Friday, February 22, 2013 19:30
> *To:* namd-l_at_ks.uiuc.edu
> *Subject:* namd-l: Always 24-way SMP?
>
>
> I'm investigating scaling problems with NAMD. I'm running precompiled
> linux-64-tcp binaries on a Linux cluster with 12-core nodes using "charmrun
> +p $NPROCS ++mpiexec".
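>
> For context, a minimal sketch of the batch invocation meant here (assuming
> a PBS-style scheduler; the config file name is hypothetical):
>
>     NPROCS=$(wc -l < $PBS_NODEFILE)   # one line per allocated core
>     charmrun +p $NPROCS ++mpiexec namd2 sim.namd > run_${NPROCS}.log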
>
> I know scaling problems have been covered before, but I can't find the
> answer to my specific question. No matter how many cores I use or how many
> nodes they are spread over, at the top of stdout charm++ always reports
> "Running on # unique compute nodes (24-way SMP)". It gets # correct, but
> it's always 24-way SMP. Is it supposed to be this way? If so, why?
>
> Everyone seems to say that you should recompile NAMD with your own MPI
> library, but I don't seem to have problems running NAMD jobs to completion
> with charmrun + OpenMPI built with the Intel compilers (except for the
> scaling). Could using the precompiled binaries result in scaling problems?
>
> Thank you.
>
