Re: Re: 答复: namd-l: compilation of namd

From: Bjoern Olausson
Date: Thu May 12 2011 - 08:11:57 CDT

On Thursday 12 May 2011 11:53:58 Axel Kohlmeyer wrote:
> On Thu, May 12, 2011 at 3:45 AM, Bjoern Olausson wrote:
> > On Wednesday 11 May 2011 17:53:59 Jim Phillips wrote:
> >> I meant more conservative in the sense of not breaking code, but Axel's
> >> suggestion is correct. Changing -O3 to -O2 and adding -no-vec improves
> >> performance on newer compilers without hurting it on the old ones. I
> >> also bumped -march up from pentiumpro to pentium4 on 32-bit builds to
> >> enable some explicit SSE2 code. I think ten years is long enough to
> >> wait.
> >
> > Hi, I was wondering if anyone has optimized the flags for Opteron (2378
> > and 2427) CPUs and the Intel compiler?
> >
> > So far I am using "-O3 -xSSSE3" and from what I read here I might add
> > "-no-vec" and try to revert back to "-O2".
> >
> > But since the icc help only lists pentium3 and pentium4 for -march I am
> > unsure what to choose here, if anything at all.
> pentium3 or core are closer to the opteron arch than pentium4
> there is not much to be gained by arch specific optimization,
> but rather by having an executable that is cache efficient and
> pipelines well.
> > Did someone try to let the "-xHost" flag do the search for the highest
> > optimisation?
> as was mentioned before, the "highest" optimization may not result
> in the fastest binary. how much impact a specific flag has is often
> hard to measure and even harder to predict. this gets even more
> complicated by the fact that aggressive optimization may lead to
> a faster binary, but that may produce "wrong" results.
> you cannot trust the vendor description, since for the most part
> they have to add new features that make people buy their new
> software and they have to make people feel good about having
> their new compiler more than providing real performance increase
> (which is really difficult). furthermore, compilers are tested and
> benchmarked against a variety of workloads, while MD is just
> one specific subset. what works well overall, may be bad for
> a specific case.

I was afraid of such an answer ;-)

Thanks for the information. I'll test whether NAMD performs better with -O2 and
-no-vec than with my current flags and leave it that way.
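
A comparison like this could be scripted roughly as follows. This is only a sketch, not a tested NAMD benchmark: the helper names (time_command, compare_builds, fastest) are made up for illustration, and the binary paths and input file in the trailing comment are hypothetical placeholders for two builds compiled with the different flag sets.

```python
import statistics
import subprocess
import time


def time_command(cmd, repeats=3):
    """Run cmd `repeats` times; return each run's wall-clock time in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        # check=True raises CalledProcessError if the run fails,
        # so a crashing build is never counted as "fast".
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        timings.append(time.perf_counter() - start)
    return timings


def compare_builds(builds, repeats=3):
    """Map each build label to the median wall-clock time of its command."""
    return {
        label: statistics.median(time_command(cmd, repeats))
        for label, cmd in builds.items()
    }


def fastest(medians):
    """Return the label with the smallest median time."""
    return min(medians, key=medians.get)


# Example usage (hypothetical paths and input file, adjust to your setup):
#
# builds = {
#     "O3-xSSSE3": ["./namd2-O3-xSSSE3", "bench.namd"],
#     "O2-no-vec": ["./namd2-O2-no-vec", "bench.namd"],
# }
# medians = compare_builds(builds)
# print(medians, "fastest:", fastest(medians))
```

The median of several runs is used to damp noise from other load on the machine. Timing alone does not cover the point made above about aggressive optimization producing "wrong" results, so the output of each build (energies, for instance) should also be compared against a trusted reference run.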


Bjoern Olausson
Martin-Luther-Universität Halle-Wittenberg
Fachbereich Biochemie/Biotechnologie
Kurt-Mothes-Str. 3
06120 Halle/Saale
Phone: +49-345-55-24942
