Mysterious slowdown in parallel

From: Roy Fernando (roy.nandos_at_gmail.com)
Date: Sat Oct 26 2013 - 10:05:13 CDT

Dear NAMD experts,

I recently started running NAMD on a cluster, and I began by experimenting with my
system to determine the best combination of nodes and processors for my
simulation. I only ran for a short time interval.

The cluster contains 30 nodes, each with 8 cores.

I noticed a significant speedup going from a single processor to 8 processors on
a single node. Choosing 2 nodes (16 processors) gave a further speedup, but
when I increased the number of nodes to 3 or 4, the simulation slowed down
drastically.

Can somebody please suggest why the simulations slow down? I would
highly appreciate your input.

Roy

Below is the table I made with these details:

  Job     # Nodes  ppn  Startup (s)  Wall time (s)
  571825  1        1    7.5          2866
  569     1        8    9            539
  470     2        8    2.4          316
  498     2        8    3            323
  494     3        8    -            4500
  500     4        8    16           4793
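
Taking the single-processor run as the baseline, the speedup T(1)/T(p) from the
wall times above is about 2866/539 = 5.3x on 8 processors and 2866/316 = 9.1x on
16, but it collapses to 2866/4500 = 0.64x on 24 and 2866/4793 = 0.60x on 32; the
3- and 4-node runs are slower than the serial run.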
I submitted the job using the following command line:

qsub -l nodes=<#nodes>:ppn=<#processors>,walltime=<expected_wall_time> <job_file_name>
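
For example, the 2-node runs above would be submitted roughly as follows (the
walltime value and script name here are placeholders):

qsub -l nodes=2:ppn=8,walltime=01:00:00 namd_job.sh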

and the following are the contents of my job file:
---------------------------------------------------------------------------------------------------------------------------------------------------
#!/bin/sh -l
# Change to the directory from which you originally submitted this job.
cd $PBS_O_WORKDIR

# Have charmrun start remote processes over ssh.
CONV_RSH=ssh
export CONV_RSH
# CONV_DAEMON=""
# export CONV_DAEMON

module load namd

# Build a charmrun nodelist listing every node PBS assigned to this job.
NODELIST="$RCAC_SCRATCH/namd2-$PBS_JOBID.nodelist"
echo group main > "$NODELIST"
for node in `cat $PBS_NODEFILE`; do
  echo host $node >> "$NODELIST"
done

# charmrun "$NAMD_HOME/namd2" ++verbose +p$NUMPROCS ++nodelist "$NODELIST" ubq_wb_eq.conf
charmrun "$NAMD_HOME/namd2" ++verbose +p16 ++nodelist "$NODELIST" SOD_wb_eq0.conf

module unload namd
--------------------------------------------------------------------------------------------------------------------------------------------------------------
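
One way to avoid hardcoding +p16 would be to derive the process count from the
allocation. This is a minimal sketch, assuming $PBS_NODEFILE contains one line
per assigned processor slot (as it does for nodes=N:ppn=M requests) and reusing
the NUMPROCS name from the commented-out line:

# one line per slot in the node file, so its line count is the total processor count
NUMPROCS=`wc -l < $PBS_NODEFILE`
charmrun "$NAMD_HOME/namd2" ++verbose +p$NUMPROCS ++nodelist "$NODELIST" SOD_wb_eq0.conf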

Following is my structure summary:

Info: ****************************
Info: STRUCTURE SUMMARY:
Info: 50198 ATOMS
Info: 35520 BONDS
Info: 25502 ANGLES
Info: 15756 DIHEDRALS
Info: 1042 IMPROPERS
Info: 380 CROSSTERMS
Info: 0 EXCLUSIONS
Info: 47188 RIGID BONDS
Info: 103406 DEGREES OF FREEDOM
Info: 17790 HYDROGEN GROUPS
Info: 4 ATOMS IN LARGEST HYDROGEN GROUP
Info: 17790 MIGRATION GROUPS
Info: 4 ATOMS IN LARGEST MIGRATION GROUP
Info: TOTAL MASS = 308670 amu
Info: TOTAL CHARGE = -8 e
Info: MASS DENSITY = 0.946582 g/cm^3
Info: ATOM DENSITY = 0.0927022 atoms/A^3
Info: *****************************

Info: Entering startup at 7.15922 s, 14.8091 MB of memory in use
Info: Startup phase 0 took 0.0303071 s, 14.8092 MB of memory in use
Info: Startup phase 1 took 0.068871 s, 23.5219 MB of memory in use
Info: Startup phase 2 took 0.0307088 s, 23.9375 MB of memory in use
Info: Startup phase 3 took 0.0302751 s, 23.9374 MB of memory in use
Info: PATCH GRID IS 4 (PERIODIC) BY 4 (PERIODIC) BY 5 (PERIODIC)
Info: PATCH GRID IS 1-AWAY BY 1-AWAY BY 1-AWAY
Info: REMOVING COM VELOCITY 0.0178943 -0.00579233 -0.00948207
Info: LARGEST PATCH (29) HAS 672 ATOMS
Info: Startup phase 4 took 0.0571079 s, 31.7739 MB of memory in use
Info: PME using 1 and 1 processors for FFT and reciprocal sum.
Info: PME USING 1 GRID NODES AND 1 TRANS NODES
Info: PME GRID LOCATIONS: 0
Info: PME TRANS LOCATIONS: 0
Info: Optimizing 4 FFT steps. 1... 2... 3... 4... Done.
Info: Startup phase 5 took 0.0330172 s, 34.1889 MB of memory in use
Info: Startup phase 6 took 0.0302858 s, 34.1888 MB of memory in use
LDB: Central LB being created...
Info: Startup phase 7 took 0.030385 s, 34.1902 MB of memory in use
Info: CREATING 1526 COMPUTE OBJECTS
Info: NONBONDED TABLE R-SQUARED SPACING: 0.0625
Info: NONBONDED TABLE SIZE: 769 POINTS
Info: Startup phase 8 took 0.0399361 s, 39.2458 MB of memory in use
Info: Startup phase 9 took 0.030345 s, 39.2457 MB of memory in use
Info: Startup phase 10 took 0.000467062 s, 49.472 MB of memory in use
Info: Finished startup at 7.54093 s, 49.472 MB of memory in use
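
For scale, from the numbers above: the 4 x 4 x 5 patch grid gives 80 patches in
total, and 50198 atoms spread over 24 processors (3 nodes) comes to roughly
2090 atoms per processor.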
