Ammasso interconnect performance, LDB question

From: Dow Hurst (Dow.Hurst_at_mindspring.com)
Date: Mon Sep 05 2005 - 23:21:12 CDT

LDB: LOAD: AVG 1.41989 MAX 1.541 MSGS: TOTAL 518 MAXC 18 MAXP 7 None
LDB: LOAD: AVG 1.41989 MAX 1.44798 MSGS: TOTAL 518 MAXC 18 MAXP 7 Refine

What can I learn from these messages? Is this the communication layer
logging performance measurements, and also a refinement of the
interprocess communication in Charm++? Can NAMD tell me anything that
would help in tuning the packet sizes of the switch? We are getting some
very nice numbers from the Ammasso fast interconnects. The apoa1
benchmark ran at 0.96ns/day on 38 Opteron 250 CPUs. Using TCP alone,
without the RDMA of the Ammasso, gave 1.3ns/day on the apoa1 benchmark.
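
One way to get something concrete out of the two LDB lines at the top is to compare MAX against AVG on each line. The sketch below is only illustrative: it assumes that AVG and MAX are the average and maximum load per processor and that the trailing word (None, Refine) names the balancing step that produced the line. The log itself does not confirm that reading, and the helper name parse_ldb is made up for this example.

```python
import re

# Field names are copied from the LDB lines in the log; what they mean
# (per-processor load, message counts, strategy name) is an assumption.
LDB_RE = re.compile(
    r"LDB: LOAD: AVG (?P<avg>[\d.]+) MAX (?P<max>[\d.]+) "
    r"MSGS: TOTAL (?P<total>\d+) MAXC (?P<maxc>\d+) MAXP (?P<maxp>\d+) "
    r"(?P<strategy>\S+)"
)


def parse_ldb(line):
    """Parse one LDB line into a dict, or return None if it does not match."""
    m = LDB_RE.search(line)
    if m is None:
        return None
    avg = float(m.group("avg"))
    peak = float(m.group("max"))
    return {
        "strategy": m.group("strategy"),
        "avg": avg,
        "max": peak,
        "msgs_total": int(m.group("total")),
        # MAX/AVG equal to 1.0 would mean every processor carries the same load.
        "imbalance": peak / avg,
    }


lines = [
    "LDB: LOAD: AVG 1.41989 MAX 1.541 MSGS: TOTAL 518 MAXC 18 MAXP 7 None",
    "LDB: LOAD: AVG 1.41989 MAX 1.44798 MSGS: TOTAL 518 MAXC 18 MAXP 7 Refine",
]

for line in lines:
    rec = parse_ldb(line)
    print(f"{rec['strategy']:>8}  MAX/AVG = {rec['imbalance']:.4f}")
```

Under that assumption, the most heavily loaded processor sits about 8.5% above the average on the None line and about 2% above it on the Refine line, while the message totals are unchanged.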

Below is the beginning of the log file for a smaller system than the
apoa1 protein. I've truncated a lot for brevity:

Info: NAMD 2.6b1 for Linux-amd64-MPI
Info:
Info: Please visit http://www.ks.uiuc.edu/Research/namd/
Info: and send feedback or bug reports to namd_at_ks.uiuc.edu
Info:
Info: Please cite Kale et al., J. Comp. Phys. 151:283-312 (1999)
Info: in all publications reporting results obtained with NAMD.
Info:
Info: Based on Charm++/Converse 50900 for mpi-linux-amd64
Info: Built Sun Aug 7 18:18:42 EDT 2005 by dhurst on heada
Info: Sending usage information to NAMD developers via UDP. Sent data is:
Info: 1 NAMD 2.6b1 Linux-amd64-MPI 38 node001a dhurst
Info: Running on 38 processors.
Info: 431814 kB of memory in use.
Info: Changed directory to .
Info: Configuration file is heat_fix_50K.conf
TCL: Suspending until startup complete.
Info: EXTENDED SYSTEM FILE ../IC1_dir/min_all_wt_ic1_4_N20000.xsc
Info: SIMULATION PARAMETERS:
Info: TIMESTEP 1
Info: NUMBER OF STEPS 100000
Info: STEPS PER CYCLE 20
Info: PERIODIC CELL BASIS 1 80 0 0
Info: PERIODIC CELL BASIS 2 0 80 0
Info: PERIODIC CELL BASIS 3 0 0 70
Info: PERIODIC CELL CENTER 0 0 0
Info: WRAPPING ALL CLUSTERS AROUND PERIODIC BOUNDARIES ON OUTPUT.
Info: LOAD BALANCE STRATEGY Other
Info: LDB PERIOD 4000 steps
Info: FIRST LDB TIMESTEP 100
Info: LDB BACKGROUND SCALING 1
Info: HOM BACKGROUND SCALING 1
Info: PME BACKGROUND SCALING 1
Info: MAX SELF PARTITIONS 50
Info: MAX PAIR PARTITIONS 20
Info: SELF PARTITION ATOMS 125
Info: PAIR PARTITION ATOMS 200
Info: PAIR2 PARTITION ATOMS 400
Info: MIN ATOMS PER PATCH 100
Info: VELOCITY FILE ../IC1_dir/min_all_wt_ic1_4_N20000.vel
Info: CENTER OF MASS MOVING? NO
Info: DIELECTRIC 1
Info: EXCLUDE SCALED ONE-FOUR
Info: 1-4 SCALE FACTOR 1
Info: DCD FILENAME heat_50K_CB1_WT_fix_dhurst.dcd
Info: DCD FREQUENCY 1000
Info: DCD FIRST STEP 1000
Info: DCD FILE WILL CONTAIN UNIT CELL DATA
Info: NO EXTENDED SYSTEM TRAJECTORY OUTPUT
Info: NO VELOCITY DCD OUTPUT
Info: OUTPUT FILENAME heat_50K_CB1_WT_fix_dhurst
Info: BINARY OUTPUT FILES WILL BE USED
Info: NO RESTART FILE
Info: SWITCHING ACTIVE
Info: SWITCHING ON 8.5
Info: SWITCHING OFF 10
Info: PAIRLIST DISTANCE 11.5
Info: PAIRLIST SHRINK RATE 0.01
Info: PAIRLIST GROW RATE 0.01
Info: PAIRLIST TRIGGER 0.3
Info: PAIRLISTS PER CYCLE 2
Info: PAIRLISTS ENABLED
Info: MARGIN 0
Info: HYDROGEN GROUP CUTOFF 2.5
Info: PATCH DIMENSION 14
Info: ENERGY OUTPUT STEPS 1000
Info: TIMING OUTPUT STEPS 1000
Info: FIXED ATOMS ACTIVE
Info: LANGEVIN DYNAMICS ACTIVE
Info: LANGEVIN TEMPERATURE 50
Info: LANGEVIN DAMPING COEFFICIENT IS 10 INVERSE PS
Info: LANGEVIN DYNAMICS NOT APPLIED TO HYDROGENS
Info: PARTICLE MESH EWALD (PME) ACTIVE
Info: PME TOLERANCE 1e-06
Info: PME EWALD COEFFICIENT 0.312341
Info: PME INTERPOLATION ORDER 4
Info: PME GRID DIMENSIONS 32 32 64
Info: Attempting to read FFTW data from FFTW_NAMD_2.6b1_Linux-amd64-MPI.txt
Info: Optimizing 6 FFT steps. 1... 2... 3... 4... 5... 6... Done.
Info: Writing FFTW data to FFTW_NAMD_2.6b1_Linux-amd64-MPI.txt
Info: FULL ELECTROSTATIC EVALUATION FREQUENCY 4
Info: USING VERLET I (r-RESPA) MTS SCHEME.
Info: C1 SPLITTING OF LONG RANGE ELECTROSTATICS
Info: PLACING ATOMS IN PATCHES BY HYDROGEN GROUPS
Info: NONBONDED FORCES EVALUATED EVERY 2 STEPS
Info: RANDOM NUMBER SEED 1125977032
Info: USE HYDROGEN BONDS? NO
Info: COORDINATE PDB fix_H_cb1_wt_ic1_popc.pdb
Info: STRUCTURE FILE ../IC1_dir/cb1_wt_ic1_popc.psf
Info: PARAMETER file: CHARMM format!
Info: PARAMETERS par_all27_prot_lipid_unsat.inp
Info: USING ARITHMETIC MEAN TO COMBINE L-J SIGMA PARAMETERS
Info: BINARY COORDINATES ../IC1_dir/min_all_wt_ic1_4_N20000.coor
Info: SUMMARY OF PARAMETERS:
Info: 180 BONDS
Info: 446 ANGLES
Info: 567 DIHEDRAL
Info: 49 IMPROPER
Info: 90 VDW
Info: 0 VDW_PAIRS
Warning: Converting binary file ../IC1_dir/min_all_wt_ic1_4_N20000.coor
Info: ****************************
Info: STRUCTURE SUMMARY:
Info: 42968 ATOMS
Info: 36577 BONDS
Info: 52021 ANGLES
Info: 64083 DIHEDRALS
Info: 889 IMPROPERS
Info: 0 EXCLUSIONS
Info: 4010 FIXED ATOMS
Info: 116874 DEGREES OF FREEDOM
Info: 16035 HYDROGEN GROUPS
Info: 1949 HYDROGEN GROUPS WITH ALL ATOMS FIXED
Info: TOTAL MASS = 254932 amu
Info: TOTAL CHARGE = 10 e
Info: *****************************
Info: Entering startup phase 0 with 443405 kB of memory in use.
Info: Entering startup phase 1 with 443405 kB of memory in use.
Info: Entering startup phase 2 with 460264 kB of memory in use.
Info: Entering startup phase 3 with 460599 kB of memory in use.
Info: PATCH GRID IS 5 (PERIODIC) BY 5 (PERIODIC) BY 5 (PERIODIC)
Warning: Converting binary file ../IC1_dir/min_all_wt_ic1_4_N20000.vel
Info: REMOVING COM VELOCITY 0 0 0
Info: LARGEST PATCH (77) HAS 418 ATOMS
Info: Entering startup phase 4 with 465987 kB of memory in use.
Info: PME using 32 and 32 processors for FFT and reciprocal sum.
Info: PME GRID LOCATIONS: 1 2 3 5 6 7 9 10 11 12 ...
Info: PME TRANS LOCATIONS: 1 2 3 5 6 7 9 10 11 12 ...
Info: Entering startup phase 5 with 465992 kB of memory in use.
Info: Entering startup phase 6 with 461149 kB of memory in use.
Measuring processor speeds... Done.
Info: Entering startup phase 7 with 461177 kB of memory in use.
Info: NONBONDED TABLE R-SQUARED SPACING: 0.0625
Info: NONBONDED TABLE SIZE: 705 POINTS
Info: ABSOLUTE IMPRECISION IN FAST TABLE FORCE: 1.05879e-22 AT 9.94673
Info: RELATIVE IMPRECISION IN FAST TABLE FORCE: 2.25699e-16 AT 9.94673
Info: ABSOLUTE IMPRECISION IN SCOR TABLE FORCE: 2.64698e-22 AT 9.94673
Info: RELATIVE IMPRECISION IN SCOR TABLE FORCE: 7.32729e-16 AT 9.94673
Info: ABSOLUTE IMPRECISION IN VDWA TABLE ENERGY: 1.75 AT 0.0441942
Info: RELATIVE IMPRECISION IN VDWA TABLE ENERGY: 1.65669e-14 AT 9.94673
Info: ABSOLUTE IMPRECISION IN VDWA TABLE FORCE: 19968 AT 0.0441942
Info: RELATIVE IMPRECISION IN VDWA TABLE FORCE: 6.16499e-15 AT 0.0441942
Info: ABSOLUTE IMPRECISION IN VDWB TABLE ENERGY: 4.1359e-25 AT 9.99687
Info: RELATIVE IMPRECISION IN VDWB TABLE ENERGY: 1.84656e-14 AT 9.94673
Info: ABSOLUTE IMPRECISION IN VDWB TABLE FORCE: 1.65436e-24 AT 9.99687
Info: RELATIVE IMPRECISION IN VDWB TABLE FORCE: 6.36625e-16 AT 9.94673
Info: Entering startup phase 8 with 462339 kB of memory in use.
Info: Finished startup with 462851 kB of memory in use.
ETITLE: TS BOND ANGLE DIHED
IMPRP ELECT VDW BOUNDARY
MISC KINETIC TOTAL TEMP
TOTAL2 TOTAL3 TEMPAVG PRESSURE
GPRESSURE VOLUME PRESSAVG GPRESSAVG

ENERGY: 0 3288.7705 4584.5624 3633.3733
9.4158 -132134.2552 4688.7451 0.0000
0.0000 0.0000 -115929.3880 0.0000
-115803.1628 -115803.1628 0.0000 -1926.0982
-1966.3179 448000.0000 -1926.0982 -1966.3179

Info: Initial time: 38 CPUs 0.0412583 s/step 0.477527 days/ns 464594 kB memory
LDB: LOAD: AVG 1.42743 MAX 2.29712 MSGS: TOTAL 509 MAXC 18 MAXP 7 None
LDB: LOAD: AVG 1.42743 MAX 1.7112 MSGS: TOTAL 517 MAXC 18 MAXP 7 Alg7
LDB: LOAD: AVG 1.42743 MAX 1.45585 MSGS: TOTAL 517 MAXC 18 MAXP 7 Alg7
Info: Initial time: 38 CPUs 0.0329722 s/step 0.381623 days/ns 465494 kB memory
LDB: LOAD: AVG 1.48383 MAX 1.83723 MSGS: TOTAL 517 MAXC 18 MAXP 7 None
LDB: LOAD: AVG 1.48383 MAX 1.51331 MSGS: TOTAL 517 MAXC 18 MAXP 7 Refine
Info: Initial time: 38 CPUs 0.0301535 s/step 0.348998 days/ns 465718 kB memory
LDB: LOAD: AVG 1.46831 MAX 1.67254 MSGS: TOTAL 517 MAXC 18 MAXP 7 None
LDB: LOAD: AVG 1.46831 MAX 1.49749 MSGS: TOTAL 517 MAXC 18 MAXP 7 Refine
Info: Benchmark time: 38 CPUs 0.0323698 s/step 0.374651 days/ns 465771 kB memory
Info: Benchmark time: 38 CPUs 0.0248261 s/step 0.287339 days/ns 465797 kB memory
Info: Benchmark time: 38 CPUs 0.0257247 s/step 0.29774 days/ns 465835 kB memory
TIMING: 1000 CPU: 30.3704, 0.0302854/step Wall: 30.5406, 0.0304465/step, 0.837278 hours remaining, 466500 kB of memory in use.
ENERGY: 1000 3202.7781 5833.8053 3843.2757
20.9519 -128997.7121 4511.0367 0.0000
0.0000 4287.9335 -107297.9308 36.9250
-107109.3960 -107162.6354 27.2693 865.0235
-1461.5879 448000.0000 -1498.1879 -1499.8686

OPENING COORDINATE DCD FILE
WRITING COORDINATES TO DCD FILE AT STEP 1000
WRITING COORDINATES TO DCD FILE AT STEP 2000
TIMING: 2000 CPU: 58.918, 0.0285477/step Wall: 59.1154, 0.0285748/step, 0.77787 hours remaining, 466963 kB of memory in use.
ENERGY: 2000 4045.4198 6080.2947 3861.3484
20.4174 -129337.6537 4360.0136 0.0000
0.0000 5120.6143 -105849.5456 44.0955
-105668.8210 -105645.3784 41.0040 -524.9522
-1602.8294 448000.0000 -1522.9299 -1521.8511

WRITING COORDINATES TO DCD FILE AT STEP 3000
TIMING: 3000 CPU: 87.3677, 0.0284497/step Wall: 87.6781, 0.0285627/step, 0.769605 hours remaining, 466994 kB of memory in use.
ENERGY: 3000 3893.8071 6181.0885 3855.1746
21.9567 -129088.0662 4470.1687 0.0000
0.0000 5502.3680 -105163.5026 47.3830
-104965.0591 -104985.6443 45.6691 252.7055
-1525.3740 448000.0000 -1582.2396 -1583.1057

TIMING: 4000 CPU: 113.153, 0.0257851/step Wall: 113.469, 0.0257907/step, 0.687752 hours remaining, 466500 kB of memory in use.
ENERGY: 4000 4186.7233 6242.6166 3864.1187
22.2668 -129346.7052 4406.5573 0.0000
0.0000 5625.8721 -104998.5504 48.4465
-104807.1114 -104789.6484 47.8670 -378.2075
-1664.0810 448000.0000 -1574.4901 -1573.7842
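
The s/step, days/ns and "hours remaining" figures in the log are related by simple arithmetic, which the short sketch below reproduces. It assumes the 1 fs timestep and the 100000-step run length reported in the SIMULATION PARAMETERS section. The function names are made up for this example, and the hours-remaining formula is simply one that matches the printed values, not something taken from the NAMD source.

```python
SECONDS_PER_DAY = 86400.0
FS_PER_NS = 1.0e6


def days_per_ns(seconds_per_step, timestep_fs=1.0):
    """Wall-clock days needed to simulate one nanosecond."""
    steps_per_ns = FS_PER_NS / timestep_fs
    return seconds_per_step * steps_per_ns / SECONDS_PER_DAY


def ns_per_day(seconds_per_step, timestep_fs=1.0):
    """Simulated nanoseconds per wall-clock day (the inverse of days/ns)."""
    return 1.0 / days_per_ns(seconds_per_step, timestep_fs)


def hours_remaining(wall_seconds_per_step, current_step, total_steps):
    """Estimated hours left, assuming the current per-step wall time holds."""
    return (total_steps - current_step) * wall_seconds_per_step / 3600.0


# First "Initial time" line: 0.0412583 s/step is reported as 0.477527 days/ns.
print(days_per_ns(0.0412583))

# Last "Benchmark time" line: 0.0257247 s/step is reported as 0.29774 days/ns.
print(days_per_ns(0.0257247), ns_per_day(0.0257247))

# TIMING line at step 1000: 0.0304465 s/step wall, 0.837278 hours remaining.
print(hours_remaining(0.0304465, 1000, 100000))
```

Since days/ns and ns/day are reciprocals, the last benchmark line of 0.29774 days/ns corresponds to roughly 3.4 ns/day for this smaller system.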

Thanks for your advice and time,
Dow Hurst
