Re: Intel 11 compilers

From: Guanglei Cui (amber.mail.archive_at_gmail.com)
Date: Fri Jul 10 2009 - 10:45:17 CDT

Hi Xavier,

I've been running some benchmarks lately, comparing the performance of
the precompiled and Intel (v11) compiled binaries (with multicore
enabled) on 3 boxes. The test system is apoa1 (~92K atoms, 500 MD
steps). What I observe is similar: the scaling degrades beyond 16
CPUs for systems of this size. The Intel-compiled binary does scale
somewhat better, but on a single CPU it is still slower (in 2 of 3
cases) than the precompiled 2.7b1 downloaded from the NAMD website.
I wonder if there are any optimization options I missed. It'd be nice
to at least match the single-CPU performance. I hope the experts on
this list can offer more input on this. Here are the numbers.

# of CPUs/cores    Precompiled       In-house Compiled
 1                 909.94 (1.00)     807.38 (1.00)
 2                 499.46 (1.82)     404.00 (2.00)
 4                 270.63 (3.36)     228.37 (3.54)
 8                 154.39 (5.89)     131.72 (6.13)
16                  91.71 (9.92)      77.80 (10.38)
32                  76.01 (11.97)     94.80 (8.52)
Table 2: umwus040 [Intel Xeon X7350 2.93GHz (64 total), and RHEL 4 (Nahant Update 7)]

# of CPUs/cores    Precompiled       In-house Compiled
 1                 515.72 (1.00)     541.37 (1.00)
 2                 285.82 (1.80)     270.67 (2.00)
 4                 145.33 (3.54)     142.41 (3.80)
 8                  75.47 (6.83)      71.24 (7.60)
Table 1: uptuw425 [Intel Xeon W5580 3.2GHz (8 total), and RHEL 5.3 (Tikanga)]

# of CPUs/cores    Precompiled       In-house Compiled
 1                 790.76 (1.00)     835.93 (1.00)
 2                 417.49 (1.89)     426.54 (1.96)
 4                 220.81 (3.58)     218.78 (3.82)
 8                 119.97 (6.59)     112.83 (7.40)
16                  72.05 (10.98)     61.18 (13.66)
24                  57.06 (13.86)     52.13 (16.04)
Table 3: uptuw425 [Intel Xeon E7450 2.4GHz (24 total), and RHEL 5.3 (Tikanga)]
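
The figures in parentheses are speedups relative to the single-core run of the
same binary. For anyone who wants to redo the arithmetic or add parallel
efficiency, here is a minimal Python sketch. It assumes the table entries are
wall-clock times in seconds; the function name `scaling` and the dictionary
layout are only illustrative, and the sample data is the in-house column of
Table 3.

```python
# Timings copied from Table 3 (in-house compiled column).
# Assumption: the values are wall-clock times in seconds.
timings = {1: 835.93, 2: 426.54, 4: 218.78, 8: 112.83, 16: 61.18, 24: 52.13}


def scaling(timings):
    """Return {cores: (speedup, efficiency)} relative to the smallest core count.

    speedup    = t(base) / t(cores)
    efficiency = speedup * base_cores / cores   (1.0 means ideal scaling)
    """
    base_cores = min(timings)
    base_time = timings[base_cores]
    result = {}
    for cores in sorted(timings):
        speedup = base_time / timings[cores]
        efficiency = speedup * base_cores / cores
        result[cores] = (speedup, efficiency)
    return result


if __name__ == "__main__":
    for cores, (speedup, efficiency) in scaling(timings).items():
        print(f"{cores:>3} cores: speedup {speedup:6.2f}, efficiency {efficiency:6.1%}")
```

Because the baseline is taken as the smallest core count present, the same
function can be applied to runs that start at 8 processes, such as the
WallClock figures quoted below.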

Regards,
Guanglei

On Fri, Jul 10, 2009 at 9:52 AM, Xavier Deupi<xdeupi_at_gmail.com> wrote:
> Hi everybody,
> We have installed the NAMD_2.7b1_Linux-x86_64-TCP binaries on a cluster of
> 64-bit Intel Xeon processors with 16GB memory and a 10Gbit ethernet
> connection.
> We plan to simulate a system consisting of ~100,000 atoms, and we have been
> doing some tests (see below). It seems that the scalability degrades
> significantly going from 32 to 64 processors. I have no experience with NAMD,
> so are these values within the expected behavior of NAMD?
> Also, it seems that the current binaries have not been compiled with the
> latest version (11.1) of the Intel compilers. Am I right? If so, do you think
> that recompiling with this version would result in a noticeable increase in
> performance?
> Thanks,
> Xavier
> P.S. Performance so far:
>  8 procs WallClock: 800.417664
> 16 procs WallClock: 500.420837    x1.60
> 32 procs WallClock: 342.890442    x2.33
> 64 procs WallClock: 298.141724    x2.68
>
> Xavier Deupi
> Laboratory of Computational Medicine
> Biostatistics Unit. School of Medicine
> Universitat Autonoma de Barcelona
> Bellaterra, 08193 (Barcelona)
> Catalunya, EU
