From: Hannes Loeffler (Hannes.Loeffler_at_stfc.ac.uk)
Date: Thu Jan 23 2014 - 04:05:18 CST
On Wed, 22 Jan 2014 21:38:29 -0500
Giacomo Fiorin <giacomo.fiorin_at_gmail.com> wrote:
> Without reading the paper in detail (I only saw Figures 6-8), I think
> you should try to obtain the original input files used for each
> program, in particular the cutoffs, PME grid resolution, time steps,
> etc.
>
> It is not rare to find that the input files do not necessarily have
> the same parameters: I once saw a comparison between program X
> running with an 8 Å cutoff and a 1.5 Å Ewald grid, vs. program Y running
> with a 12 Å cutoff and a 0.8 Å Ewald grid. I'd leave it up to you to
> judge the accuracy of such a comparison.
>
> Ultimately, benchmarks should be considered as any other scientific
> data: they must be reproducible.
The paper appears to be based entirely on my benchmark suite, which is
readily available with all necessary input files from
http://www.stfc.ac.uk/CSE/randd/cbg/Benchmark/25241.aspx . So
reproducibility shouldn't be a problem, provided the authors haven't
changed the input parameters (except probably for the new CHARMM code,
as needed) or have documented any such changes. In the reports
accompanying the suite I also tried to make it as clear as possible
that the user should be careful about comparisons, and I try to give
advice on how performance could possibly be improved. There I also
encourage users to carry out benchmarks of their own simulation
systems on their chosen hardware.
Those benchmarks, in particular Fig. 8, may show some weakness of NAMD
on this particular hardware configuration (the NAMD developers may be
able to comment on that), or they could also point to a problem with
how the benchmarks were run. Regarding hardware, the results may look
quite different on different machines, i.e. these benchmarks only tell
us how the codes performed on the specific hardware the authors chose
(that's certainly a limiting factor, and the authors don't tell us much
about their hardware). I personally have never seen such a "dramatic"
(see below for comments on that) drop in NAMD's performance as
depicted in Fig. 8, but then I should probably add that most of my
benchmarking was done on "supercomputers".
It should also be noted that in Fig. 8 the absolute performance of NAMD
at 256 cores is still slightly better than that of GROMACS at 512
cores. We don't see when GROMACS's performance peaks or when it starts
to drop. Also, at 256 cores NAMD runs twice as fast as GROMACS! If you
have to pay for your usage you will probably think twice about whether
you want to run NAMD or GROMACS for that particular system on that
particular hardware. You really ought to consider how many cores you
can afford in actual work. Good scientific practice dictates multiple
independent runs for statistical purposes and reproducibility, and you
probably also have to compete for resources with other users, etc.
It's interesting to see GROMACS performing so poorly on a per-core
basis, as that's quite the opposite of what I have seen so far.
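To make the per-core point concrete, here is a minimal sketch (in
Python, with made-up ns/day figures that are placeholders only, NOT
numbers taken from the paper or from my suite) of how one could
normalise throughput by core count before comparing codes:

    # Sketch: per-core throughput as a rough cost-efficiency proxy.
    # The ns/day values are invented placeholders, NOT data from the
    # paper; substitute your own measured numbers.

    def ns_per_day_per_core(ns_per_day, cores):
        """Throughput normalised by the number of cores used."""
        return ns_per_day / cores

    benchmarks = {
        # label: (cores, ns/day) -- hypothetical illustration only
        "code A @ 256 cores": (256, 10.0),
        "code B @ 512 cores": (512, 9.0),
    }

    for label, (cores, ns_day) in benchmarks.items():
        print(f"{label}: {ns_day:.1f} ns/day, "
              f"{ns_per_day_per_core(ns_day, cores):.4f} ns/day per core")

With hypothetical numbers like these, code A would deliver more than
twice the per-core throughput of code B, which is exactly the kind of
consideration that matters when core hours are charged to a budget.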
There is probably much more that I could say here; I have tried to
discuss some of it in my benchmark reports. But the summary is that
users should look very carefully at what benchmark data really mean,
and in particular at what they mean for their own particular
circumstances. Personally, I don't really see that the performance of
NAMD in Fig. 8 is a problem in practical work. In fact, I would say
NAMD looks really great.
Cheers,
Hannes.
> On Wed, Jan 22, 2014 at 7:22 PM, Bennion, Brian <Bennion1_at_llnl.gov>
> wrote:
>
> > Hello,
> >
> >
> >
> > Based on this recent publication
> > http://onlinelibrary.wiley.com/doi/10.1002/jcc.23501/abstract
> >
> > NAMD 2.9 stumbles compared with GROMACS and an improved version of
> > CHARMM on a large system (465,404 atoms and 500 cores).
> >
> > Any ideas as to the cause of this dramatic difference in speed
> > between 256 and 400 cores?
> >
> >
> >
> > Brian