Performance Problems on Linux Cluster / Bad Scaling

From: Marcus Rölz (m.roelz_at_gmx.de)
Date: Tue Sep 18 2007 - 08:17:15 CDT

Hello folks,

I am observing severe performance problems on our Linux cluster.
If I double the number of NAMD processes from 1 to 2, performance doubles and processor load is about 90%. But if I double from 6 to 12 processes, performance drops below the 6-process level and CPU usage is only about 20%.
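
To put numbers on this kind of behaviour, speedup and parallel efficiency can be computed from the wall-clock time per step of each run. The following is a minimal sketch; the timings in it are hypothetical placeholders chosen only to mirror the pattern described above, not measurements from this cluster.

```python
def speedup(t_ref, t_n):
    """Speedup of a run relative to a reference run (wall time per step)."""
    return t_ref / t_n


def parallel_efficiency(t_ref, n_ref, t_n, n):
    """Fraction of ideal scaling achieved when going from n_ref to n processes."""
    return speedup(t_ref, t_n) * n_ref / n


# Hypothetical wall-clock seconds per step, NOT measured values.
timings = {1: 0.60, 2: 0.30, 6: 0.12, 12: 0.15}

for n, t in sorted(timings.items()):
    s = speedup(timings[1], t)
    eff = parallel_efficiency(timings[1], 1, t, n)
    print(f"{n:2d} processes: speedup {s:5.2f}, efficiency {eff:6.1%}")
```

An efficiency near 100% corresponds to the 1-to-2 case; a value that falls sharply between 6 and 12 processes corresponds to the drop described above.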

The setup is the following:

- 32 machines, 2 dual-core processors each
- Linux version 2.6.16.21-0.8-smp
- Gigabit Ethernet
- simulated system with about 12,000 atoms
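
For reference, the figures above translate into the following amount of work per process. This is a rough sketch that assumes one process per core and that processes fill one machine before using the next; the actual placement depends on the node list and is not stated here.

```python
ATOMS = 12_000            # approximate system size from the setup list
CORES_PER_NODE = 2 * 2    # 2 dual-core processors per machine


def atoms_per_process(n_procs, atoms=ATOMS):
    """Average number of atoms each process is responsible for."""
    return atoms / n_procs


def nodes_needed(n_procs, cores_per_node=CORES_PER_NODE):
    """Machines occupied, assuming one process per core (ceiling division)."""
    return -(-n_procs // cores_per_node)


for n in (1, 2, 6, 12):
    print(f"{n:2d} processes: {atoms_per_process(n):7.0f} atoms/process, "
          f"{nodes_needed(n)} machine(s)")
```

Under these assumptions, 1 and 2 processes stay on a single machine, while 6 and 12 processes span 2 and 3 machines and therefore depend on the Gigabit Ethernet link.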

I measured the network performance with netperf and got about 900 Mbit/s, which should be fine. The cluster nodes I use are not heavily loaded.

I am running out of ideas for what to debug next. There must be a way to debug these performance issues systematically, but I have not figured it out yet.

Any help and good ideas would be greatly appreciated.

Marcus (University of Greifswald)

My config file (I have already experimented with stepspercycle, twoAwayX, and outputTiming):

structure ./allwater_ws.psf
coordinates ./allwater_ws.pdb

set temperature 310
set outputname allwater_wsout

firsttimestep 0

paraTypeCharmm on
parameters ./par_all27_prot_na.prm
temperature $temperature

# Force-Field Parameters
exclude scaled1-4
1-4scaling 1.0
cutoff 12.
switching on
switchdist 10.
pairlistdist 14.5 ##13.5

# Integrator Parameters
timestep 2.0 ;# 2fs/step
rigidBonds all ;# needed for 2fs steps
nonbondedFreq 1
fullElectFrequency 2
stepspercycle 30
twoAwayX yes
outputTiming 20

# Constant Temperature Control
langevin on ;# do langevin dynamics
langevinDamping 5 ;# damping coefficient (gamma) of 5/ps
langevinTemp $temperature
langevinHydrogen off ;# don't couple langevin bath to hydrogens

# Output
outputName $outputname

restartfreq 5000 ;# 5000 steps = every 10 ps
dcdfreq 250
outputEnergies 100
outputPressure 100

# Spherical boundary conditions
sphericalBC on
sphericalBCcenter 21.3367443085 29.2859230042 26.6932468414
sphericalBCr1 31.4426914717
sphericalBCk1 10
sphericalBCexp1 2

# Minimization
minimize 900 ;#100
reinitvels $temperature

run 30000000 ;# 5ps = 2500
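
The step counts in this config can be converted to simulated time with the 2 fs timestep set above. The helper below is only an illustrative sketch; its names are not part of NAMD.

```python
TIMESTEP_FS = 2.0  # "timestep 2.0" from the config, in femtoseconds


def steps_to_ps(steps, timestep_fs=TIMESTEP_FS):
    """Simulated time in picoseconds covered by a number of MD steps."""
    return steps * timestep_fs / 1000.0


run_steps = 30_000_000   # "run 30000000"
restart_every = 5000     # "restartfreq 5000"
dcd_every = 250          # "dcdfreq 250"

print(f"run length:       {steps_to_ps(run_steps) / 1000.0:.1f} ns")
print(f"restart interval: {steps_to_ps(restart_every):.1f} ps")
print(f"DCD interval:     {steps_to_ps(dcd_every):.1f} ps")
print(f"DCD frames:       {run_steps // dcd_every}")
```

With these settings the run covers 60 ns, restart files are written every 10 ps, and the trajectory holds 120,000 frames at 0.5 ps spacing.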

