Re: processor counts FFT and NAMD

From: Jim Phillips (jim_at_ks.uiuc.edu)
Date: Wed Mar 25 2009 - 11:13:56 CDT

Hi Tom,

It's quite normal. In this case NAMD is trying to decompose the PME grid
into 164 y-z planes and 164 x-z planes across 80 processors. One of the
rules is to use a minimum of two planes per processor, so it will never
use more than 82 processors. Since you have only 80, NAMD rounds 164/80 up
to 3 planes per processor, and rounding 164/3 up then gives 55
processors. (Since the processors with 3 planes should be the
rate-limiting factor, NAMD doesn't try to spread the work around to more
processors than necessary.) Note that the PME GRID (y-z planes) and PME
TRANS (x-z planes) processors combined use every node except node 0, so in
this sense every processor is used for PME.
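
The rounding described above can be checked with a short Python sketch. It
reproduces only the arithmetic in this message (a minimum of two planes per
processor, then two round-ups); the function name and its parameters are
hypothetical and are not taken from the NAMD source.

```python
import math

def pme_processor_count(grid_planes, num_procs, min_planes_per_proc=2):
    """Estimate the PME processor count from the rounding rule in this message.

    Assumption: this mirrors only the arithmetic described in the email,
    not NAMD's actual implementation.
    """
    # Round up the planes per processor, but never go below the minimum.
    planes_per_proc = max(min_planes_per_proc,
                          math.ceil(grid_planes / num_procs))
    # Use only as many processors as that plane count requires.
    return math.ceil(grid_planes / planes_per_proc)

print(pme_processor_count(164, 80))   # 55
print(pme_processor_count(164, 200))  # 82, the cap from two planes per processor
```

With a grid dimension of 164, the result stays at 55 for any processor count
from 55 through 81, because each processor still needs 3 planes until 82
processors are available.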

Incidentally, since 164 = 2 * 2 * 41 has a "large" prime factor, you would
be better off using 160 = 2^5 * 5 for your grid dimension. This should
also give you exactly 80 PME processors (I think this is an exception to
the rule of not using node 0 for PME).
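
As a rough illustration of the factor argument, the sketch below factors a
grid dimension and lists nearby sizes whose prime factors are all small. The
helper names and the cutoff of 5 for a "small" prime are assumptions made for
this example, not values stated in the message.

```python
def prime_factors(n):
    """Return the prime factors of n (n >= 2) in ascending order."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)  # whatever remains is itself prime
    return factors

def fft_friendly_sizes(lo, hi, max_prime=5):
    """List sizes in [lo, hi] whose largest prime factor is <= max_prime.

    Assumption: max_prime=5 is an illustrative cutoff for "small" factors.
    """
    return [n for n in range(lo, hi + 1)
            if max(prime_factors(n)) <= max_prime]

print(prime_factors(164))            # [2, 2, 41]
print(prime_factors(160))            # [2, 2, 2, 2, 2, 5]
print(fft_friendly_sizes(150, 170))  # [150, 160, 162]
```

Of the candidates near 164, only 160 also divides evenly into two planes per
processor on 80 processors.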

-Jim

On Wed, 25 Mar 2009, Bishop, Thomas C wrote:

> Dear NAMD,
>
> Is it typical to have a significantly different processor count for the FFT calls than allocated for the simulation? How does this affect performance and/or how can I increase the FFT processor count?
>
> Here are the relevant messages from my NAMD output:
>
> grep proc dyn.out
> Info: Running on 80 processors.
> Info: REMOVING PATCHES FROM PROCESSOR 0
> Info: PME using 55 and 55 processors for FFT and reciprocal sum.
> LDB: Measuring processor speeds ... Done.
>
>
> Here's the complete output:
> more dyn17.out
> Charm++> Running on MPI version: 2.0 multi-thread support: MPI_THREAD_SINGLE (max supported: MPI_THREAD_SINGLE)
> Charm warning> Randomization of stack pointer is turned on in Kernel, run 'echo 0 > /proc/sys/kernel/randomize_va_space' as root to disable it. Thread migration may not work!
> Charm++> cpu topology info is being gathered!
> Charm++> 20 unique compute nodes detected!
> Info: NAMD 2.7b1 for Linux-x86_64
> Info:
> Info: Please visit http://www.ks.uiuc.edu/Research/namd/
> Info: and send feedback or bug reports to namd_at_ks.uiuc.edu
> Info:
> Info: Please cite Phillips et al., J. Comp. Chem. 26:1781-1802 (2005)
> Info: in all publications reporting results obtained with NAMD.
> Info:
> Info: Based on Charm++/Converse 60100 for mpi-linux-x86_64
> Info: Built Tue Mar 24 13:17:48 CDT 2009 by root on admin-02-01
> Info: 1 NAMD 2.7b1 Linux-x86_64 80 compute-01-37 bishop
> Info: Running on 80 processors.
> Info: Charm++/Converse parallel runtime startup completed at 0.419275 s
> Info: 4.32088 MB of memory in use based on mallinfo
> Info: Configuration file is dyn17.conf
> TCL: Suspending until startup complete.
> Info: EXTENDED SYSTEM FILE dyn16.xsc
> Warning: The parameter fullElectFrequency now defaults to nonbondedFreq (1) rather than stepsPerCycle.
> Info: SIMULATION PARAMETERS:
> Info: TIMESTEP 2
> Info: NUMBER OF STEPS 500000
> Info: STEPS PER CYCLE 20
> Info: PERIODIC CELL BASIS 1 146.598 0 0
> Info: PERIODIC CELL BASIS 2 0 152.346 0
> Info: PERIODIC CELL BASIS 3 0 0 91.9832
> Info: PERIODIC CELL CENTER 77.2804 81.0183 48.4533
> Info: WRAPPING ALL CLUSTERS AROUND PERIODIC BOUNDARIES ON OUTPUT.
> Info: LOAD BALANCE STRATEGY New Load Balancers -- ASB
> Info: LDB PERIOD 4000 steps
> Info: FIRST LDB TIMESTEP 100
> Info: LAST LDB TIMESTEP -1
> Info: LDB BACKGROUND SCALING 1
> Info: HOM BACKGROUND SCALING 1
> Info: PME BACKGROUND SCALING 1
> Info: REMOVING PATCHES FROM PROCESSOR 0
> Info: MAX SELF PARTITIONS 20
> Info: MAX PAIR PARTITIONS 8
> Info: SELF PARTITION ATOMS 154
> Info: SELF2 PARTITION ATOMS 154
> Info: PAIR PARTITION ATOMS 318
> Info: PAIR2 PARTITION ATOMS 637
> Info: MIN ATOMS PER PATCH 100
> Info: VELOCITY FILE dyn16.vel
> Info: CENTER OF MASS MOVING INITIALLY? NO
> Info: DIELECTRIC 1
> Info: EXCLUDE SCALED ONE-FOUR
> Info: 1-4 SCALE FACTOR 0.83333
> Info: DCD FILENAME dyn17.dcd
> Info: DCD FREQUENCY 500
> Info: DCD FIRST STEP 500
> Info: DCD FILE WILL CONTAIN UNIT CELL DATA
> Info: XST FILENAME dyn17.xst
> Info: XST FREQUENCY 500
> Info: VELOCITY DCD FILENAME dyn17.dvd
> Info: VELOCITY DCD FREQUENCY 500
> Info: VELOCITY DCD FIRST STEP 500
> Info: OUTPUT FILENAME dyn17
> Info: BINARY OUTPUT FILES WILL BE USED
> Info: RESTART FILENAME rst
> Info: RESTART FREQUENCY 5000
> Info: RESTART FILES WILL NOT BE OVERWRITTEN
> Info: BINARY RESTART FILES WILL BE USED
> Info: SWITCHING ACTIVE
> Info: SWITCHING ON 10
> Info: SWITCHING OFF 12
> Info: PAIRLIST DISTANCE 14
> Info: PAIRLIST SHRINK RATE 0.01
> Info: PAIRLIST GROW RATE 0.01
> Info: PAIRLIST TRIGGER 0.3
> Info: PAIRLISTS PER CYCLE 2
> Info: PAIRLISTS ENABLED
> Info: MARGIN 2
> Info: HYDROGEN GROUP CUTOFF 2.5
> Info: PATCH DIMENSION 18.5
> Info: CROSSTERM ENERGY INCLUDED IN DIHEDRAL
> Info: MOMENTUM OUTPUT STEPS 5000
> Info: TIMING OUTPUT STEPS 5000
> Info: PRESSURE OUTPUT STEPS 5000
> Info: LANGEVIN DYNAMICS ACTIVE
> Info: LANGEVIN TEMPERATURE 300
> Info: LANGEVIN DAMPING COEFFICIENT IS 0.2 INVERSE PS
> Info: LANGEVIN DYNAMICS NOT APPLIED TO HYDROGENS
> Warning: Option useGroupPressure is being enabled due to pressure control with rigidBonds.
> Info: BERENDSEN PRESSURE COUPLING ACTIVE
> Info: TARGET PRESSURE IS 1.01325 BAR
> Info: COMPRESSIBILITY ESTIMATE IS 4.57e-05 BAR^(-1)
> Info: RELAXATION TIME IS 5000 FS
> Info: APPLIED EVERY 1 STEPS
> Info: PRESSURE CONTROL IS GROUP-BASED
> Info: CELL FLUCTUATION IS ISOTROPIC
> Info: PARTICLE MESH EWALD (PME) ACTIVE
> Info: PME TOLERANCE 1e-06
> Info: PME EWALD COEFFICIENT 0.257952
> Info: PME INTERPOLATION ORDER 4
> Info: PME GRID DIMENSIONS 164 164 164
> Info: PME MAXIMUM GRID SPACING 1.5
> Info: Attempting to read FFTW data from FFTW_NAMD_2.7b1_Linux-x86_64.txt
> Info: Optimizing 6 FFT steps. 1... 2... 3... 4... 5... 6... Done.
> Info: Writing FFTW data to FFTW_NAMD_2.7b1_Linux-x86_64.txt
> Info: FULL ELECTROSTATIC EVALUATION FREQUENCY 1
> Info: USING VERLET I (r-RESPA) MTS SCHEME.
> Info: C1 SPLITTING OF LONG RANGE ELECTROSTATICS
> Info: PLACING ATOMS IN PATCHES BY HYDROGEN GROUPS
> Info: RIGID BONDS TO HYDROGEN : ALL
> Info: ERROR TOLERANCE : 1e-08
> Info: MAX ITERATIONS : 100
> Info: RIGID WATER USING SETTLE ALGORITHM
> Info: RANDOM NUMBER SEED 1237995113
> Info: USE HYDROGEN BONDS? NO
> Info: Using AMBER format force field!
> Info: AMBER PARM FILE sys.parm
> Info: AMBER COORDINATE FILE sys.crd
> Info: Exclusions will be read from PARM file!
> Info: SCNB (VDW SCALING) 2
> Info: USING ARITHMETIC MEAN TO COMBINE L-J SIGMA PARAMETERS
> Info: BINARY COORDINATES dyn16.coor
> Reading parm file (sys.parm) ...
> PARM file in AMBER 7 format
> Warning: Encounter 10-12 H-bond term
> Warning: Found 60548 H-H bonds.
> Info: SUMMARY OF PARAMETERS:
> Info: 68 BONDS
> Info: 148 ANGLES
> Info: 70 DIHEDRAL
> Info: 0 IMPROPER
> Info: 0 CROSSTERM
> Info: 0 VDW
> Info: 231 VDW_PAIRS
> Info: TIME FOR READING PDB FILE: 3.8147e-06
> Info:
> Info: Reading from binary file dyn16.coor
> Info: ****************************
> Info: STRUCTURE SUMMARY:
> Info: 206014 ATOMS
> Info: 206353 BONDS
> Info: 44853 ANGLES
> Info: 90335 DIHEDRALS
> Info: 0 IMPROPERS
> Info: 0 CROSSTERMS
> Info: 313030 EXCLUSIONS
> Info: 192477 RIGID BONDS
> Info: 425565 DEGREES OF FREEDOM
> Info: 74085 HYDROGEN GROUPS
> Info: TOTAL MASS = 1.29427e+06 amu
> Info: TOTAL CHARGE = -4.08352e-05 e
> Info: *****************************
> Info:
> Info: Entering startup at 13.9251 s, 38.0429 MB of memory in use
> Info: Startup phase 0 took 0.184281 s, 38.0413 MB of memory in use
> Info: Startup phase 1 took 38.4269 s, 66.9703 MB of memory in use
> Info: Startup phase 2 took 0.194505 s, 68.5459 MB of memory in use
> Info: PATCH GRID IS 7 (PERIODIC) BY 8 (PERIODIC) BY 4 (PERIODIC)
> Info: PATCH GRID IS 1-AWAY BY 1-AWAY BY 1-AWAY
> Info: Reading from binary file dyn16.vel
> Info: REMOVING COM VELOCITY 0.0122789 -0.0111696 0.019356
> Info: LARGEST PATCH (72) HAS 1032 ATOMS
> Info: CREATING 15437 COMPUTE OBJECTS
> Info: Startup phase 3 took 3.08311 s, 94.4295 MB of memory in use
> Info: PME using 55 and 55 processors for FFT and reciprocal sum.
> Info: PME GRID LOCATIONS: 1 3 5 6 7 9 10 11 13 14 ...
> Info: PME TRANS LOCATIONS: 1 2 4 5 6 8 9 10 12 13 ...
> Info: Startup phase 4 took 8.62039 s, 94.4324 MB of memory in use
> Info: Startup phase 5 took 1.06483 s, 69.9706 MB of memory in use
> LDB: Measuring processor speeds ... Done.
> Info: Startup phase 6 took 3.06811 s, 69.9777 MB of memory in use
> Info: CREATING 15437 COMPUTE OBJECTS
> Info: useSync: 0 useProxySync: 0
> Info: NONBONDED TABLE R-SQUARED SPACING: 0.0625
> Info: NONBONDED TABLE SIZE: 769 POINTS
> Info: Startup phase 7 took 0.269264 s, 72.2592 MB of memory in use
> Info: Startup phase 8 took 0.000176191 s, 72.3853 MB of memory in use
> Info: Finished startup at 68.8367 s, 72.3853 MB of memory in use
>
