From: Philip Peartree (P.Peartree_at_postgrad.manchester.ac.uk)
Date: Tue Sep 18 2007 - 10:34:22 CDT
It may be that your system size is reaching the limits of scalability.
My system of ~44000 atoms stops scaling well at about 56 procs on our
Itanium2/Quadrics cluster...

The best measure of performance is the parallel efficiency, that is:

(Time on 1 proc / (n x Time on n procs)) x 100

This should give you an idea of how well it scales; aiming for more than
60% is the rule I've seen around. Sorry I don't have much more to add on
diagnostics. What kind of time per step are you getting?
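
As a rough illustration of the efficiency measure above, here is a minimal Python sketch. The function name and the timings in the table are made up for the example (they are not measurements from either cluster); substitute your own seconds-per-step figures for each processor count.

```python
def parallel_efficiency(t1, tn, n):
    """Parallel efficiency in percent: t1 / (n * tn) * 100.

    t1 -- time per step on 1 processor
    tn -- time per step on n processors (same units as t1)
    n  -- number of processors
    """
    if n < 1 or t1 <= 0 or tn <= 0:
        raise ValueError("n must be >= 1 and timings must be positive")
    return t1 / (n * tn) * 100.0


if __name__ == "__main__":
    # Hypothetical seconds-per-step values, for illustration only.
    timings = {1: 1.20, 2: 0.62, 6: 0.25, 12: 0.30}
    t1 = timings[1]
    for n in sorted(timings):
        eff = parallel_efficiency(t1, timings[n], n)
        flag = "" if eff >= 60.0 else "  <-- below the 60% rule of thumb"
        print(f"{n:3d} procs: {eff:6.1f}%{flag}")
```

Running the same benchmark at several processor counts and tabulating the result this way shows where the efficiency falls under the 60% mark.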
Philip Peartree
Quoting Marcus Rölz <m.roelz_at_gmx.de>:
>
> Hello folks,
>
> I am observing severe performance problems on our Linux cluster.
> If I double the NAMD processes from 1 to 2, performance doubles and
> processor load is about 90%, but if I double from 6 to 12
> processes, performance drops below the 6-processor level
> and CPU usage is about 20%.
>
>
> The setup is the following:
>
> -32 machines, 2 dualcore processors each
> -Linux version 2.6.16.21-0.8-smp
> -Gigabit ethernet
> -simulated system with about 12,000 atoms
>
> I measured the network performance with netperf and got about 900
> Mbit/sec, which should be fine. The cluster nodes I use are not heavily
> loaded.
>
> I am running out of ideas for what to debug next. There must be a way to
> systematically debug these performance issues, but I haven't figured
> it out yet.
>
> Any help and good ideas would be greatly appreciated.
>
>
> Marcus (University of Greifswald)
>
>
>
> My config file: (I already experimented with stepspercycle, twoAwayX,
> outputTiming)
>
> structure          ./allwater_ws.psf
> coordinates        ./allwater_ws.pdb
>
> set temperature    310
> set outputname     allwater_wsout
>
> firsttimestep      0
>
> paraTypeCharmm      on
> parameters          ./par_all27_prot_na.prm
> temperature         $temperature
>
>
> # Force-Field Parameters
> exclude             scaled1-4
> 1-4scaling          1.0
> cutoff              12.
> switching           on
> switchdist          10.
> pairlistdist        14.5  ##13.5
>
>
> # Integrator Parameters
> timestep            2.0  ;# 2fs/step
> rigidBonds          all  ;# needed for 2fs steps
> nonbondedFreq       1
> fullElectFrequency  2
> stepspercycle       30
> twoAwayX yes
> outputTiming 20
>
> # Constant Temperature Control
> langevin            on    ;# do langevin dynamics
> langevinDamping     5     ;# damping coefficient (gamma) of 5/ps
> langevinTemp        $temperature
> langevinHydrogen    off    ;# don't couple langevin bath to hydrogens
>
> # Output
> outputName          $outputname
>
> restartfreq         5000     ;# 5000 steps = every 10 ps at 2 fs/step
> dcdfreq             250
> outputEnergies      100
> outputPressure      100
>
>
> # Spherical boundary conditions
> sphericalBC         on
> sphericalBCcenter   21.3367443085 29.2859230042 26.6932468414
> sphericalBCr1       31.4426914717
> sphericalBCk1       10
> sphericalBCexp1     2
>
> # Minimization
> minimize            900  ;#100
> reinitvels          $temperature
>
> run 30000000 ;# 5ps = 2500
>
>
>
>
>