Re: NAMD2.7b1 performance

From: Gianluca Interlandi (gianluca_at_u.washington.edu)
Date: Fri Sep 25 2009 - 11:53:05 CDT

Hi Myunggi,

That sounds reasonable. However, I would not use 86 CPUs for a system of this
size, because you might not get good scaling. As a comparison, with a
60,000-atom system on 32 CPUs over InfiniBand I get:

Info: Benchmark time: 32 CPUs 0.0414376 s/step 0.239801 days/ns 58.8619 MB memory

You see, running on 32 CPUs, roughly a third of what you are using, I am only
about 30% slower per step. Of course, this also depends on the particular
hardware, but it should give you an idea: adding more CPUs does not
necessarily make the run faster, because of the communication overhead.
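
Just to make the arithmetic explicit, here is a quick check of the two
benchmark lines quoted in this thread (a minimal sketch in Python; the numbers
are simply the s/step and days/ns values from our two runs, and since the
systems and hardware differ it only gives a rough feel for the overhead):

# Rough comparison of the two "Benchmark time" figures quoted in this thread.
# Note: different systems (60000 vs 67005 atoms) on different hardware,
# so this is only an illustration, not a controlled benchmark.
s_per_step_32cpu = 0.0414376   # 60000-atom system, 32 CPUs
s_per_step_86cpu = 0.0318438   # 67005-atom system, 86 CPUs

slowdown = s_per_step_32cpu / s_per_step_86cpu - 1.0
print(f"32 CPUs is about {slowdown:.0%} slower per step")            # ~30%

# The throughput numbers tell the same story.
days_per_ns_32cpu = 0.239801
days_per_ns_86cpu = 0.184281
print(f"days/ns ratio: {days_per_ns_32cpu / days_per_ns_86cpu:.2f}")  # ~1.30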

As Axel points out, you should run benchmarks where you first run on 2 CPUs,
then on 4, 8, and so on. Plot the benchmark time against the number of CPUs
and you will see the curve flatten out as you add more processors.
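
If it helps, the bookkeeping can be automated. The sketch below (Python; the
file names namd_<N>cpus.log are just an assumption, adapt them to however you
name your runs) pulls the "Info: Benchmark time" lines out of a set of NAMD
log files and prints a scaling table; plotting s/step or speedup from it
against the CPU count will show the flattening I mentioned.

# Sketch: collect NAMD "Benchmark time" lines from a set of log files
# and print a simple scaling table. Log file names are hypothetical.
import glob
import re

pattern = re.compile(
    r"Info: Benchmark time: (\d+) CPUs ([\d.]+) s/step ([\d.]+) days/ns")

results = {}
for logfile in glob.glob("namd_*cpus.log"):
    with open(logfile) as f:
        for line in f:
            m = pattern.search(line)
            if m:
                cpus = int(m.group(1))
                # keep the last benchmark line per run (NAMD prints several)
                results[cpus] = (float(m.group(2)), float(m.group(3)))

base_cpus = min(results)
base_s_per_step = results[base_cpus][0]
print(f"{'CPUs':>6} {'s/step':>10} {'days/ns':>9} {'speedup':>8} {'efficiency':>10}")
for cpus in sorted(results):
    s_per_step, days_per_ns = results[cpus]
    speedup = base_s_per_step / s_per_step
    efficiency = speedup / (cpus / base_cpus)
    print(f"{cpus:>6} {s_per_step:>10.5f} {days_per_ns:>9.4f} "
          f"{speedup:>8.2f} {efficiency:>10.0%}")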

Gianluca

On Fri, 25 Sep 2009, Myunggi Yi wrote:

> Dear NAMD users,
>
> I am running a simulation with 67005 atoms (a lipid bilayer with water).
> Using Open MPI, InfiniBand, and 86 CPUs, I get the following performance.
> Is this reasonable?
>
>
> Info: Benchmark time: 86 CPUs 0.0318438 s/step 0.184281 days/ns 70.0844 MB memory
> Info: Benchmark time: 86 CPUs 0.0315817 s/step 0.182764 days/ns 70.0845 MB memory
> Info: Benchmark time: 86 CPUs 0.0312832 s/step 0.181037 days/ns 70.085 MB memory
>
>
> conf file
> ++++++++++++++++++++
> exclude scaled1-4
> 1-4scaling 1.0
> cutoff 12.
> switching on
> switchdist 10.
> pairlistdist 13.5
>
> timestep 2.0
> rigidBonds all
> nonbondedFreq 1
> fullElectFrequency 2
> stepspercycle 10
>
> langevin on
> langevinDamping 1.0
> langevinTemp $temperature
> langevinHydrogen off
>
> wrapAll on
>
> PME yes
> PMEGridSizeX 92
> PMEGridSizeY 92
> PMEGridSizeZ 90
>
> useGroupPressure yes
> useFlexibleCell yes
> useConstantRatio yes
>
> langevinPiston on
> langevinPistonTarget 1.01325
> langevinPistonPeriod 200.
> langevinPistonDecay 100.
> langevinPistonTemp $temperature
> ++++++++++++++++++++++++++++
>
>
> Due to the Charm++ warning below, I used the "+isomalloc_sync" option. I don't
> see any difference, though.
>
>
> Charm++> Running on MPI version: 2.0 multi-thread support: MPI_THREAD_SINGLE (max supported: MPI_THREAD_SINGLE)
> Charm warning> Randomization of stack pointer is turned on in Kernel, run 'echo 0 > /proc/sys/kernel/randomize_va_space' as root to disable it. Thread migration may not work!
> Charm++> synchronizing isomalloc memory region...
> [0] consolidated Isomalloc memory region: 0x2ba9c0000000 - 0x7ffb00000000 (88413184 megs)
> Charm++> cpu topology info is being gathered!
> Charm++> 17 unique compute nodes detected!
> Info: NAMD 2.7b1 for Linux-x86_64-MPI
> Info:
>
>
>
>
> --
> Best wishes,
>
> Myunggi Yi
> ==================================
> 91 Chieftan Way
> Institute of Molecular Biophysics
> Florida State University
> Tallahassee, FL 32306
>
> Office: +1-850-645-1334
>
> http://sites.google.com/site/myunggi/
> http://people.sc.fsu.edu/~myunggi/
>

-----------------------------------------------------
Gianluca Interlandi, PhD gianluca_at_u.washington.edu
                     +1 (206) 685 4435
                     +1 (206) 714 4303
                     http://artemide.bioeng.washington.edu/

Postdoc at the Department of Bioengineering
at the University of Washington, Seattle WA U.S.A.
-----------------------------------------------------
