Re: Re: namd 2.9 not going so well

From: Norman Geist (norman.geist_at_uni-greifswald.de)
Date: Mon Sep 17 2012 - 09:23:36 CDT

So it's ok. Look at the Timing lines, the performance is the same ;) So you
could just stick with the precompiled binaries and IPoIB.
I wonder why the benchmark time doesn't match the Timing lines. Maybe
something was interfering; you should repeat this test and make sure
nothing else is running on the nodes.
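
If your fornodes wrapper from the quoted mail below is handy, a quick check
along these lines (purely illustrative, adapt as needed) would show whether
something else is eating CPU on the nodes during the run:

fornodes "uptime; ps -eo pcpu,pmem,comm --sort=-pcpu | head -4"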

I also wonder why your self-compiled binaries use 6 times more memory than
the precompiled ones. Maybe you should look for the problem there. What
happens if you compile without MPI, but with charm++?
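
For reference, here is a rough, untested sketch of such a non-MPI build,
assuming the charm-6.4.0 tree that ships with the NAMD 2.9 source (arch and
compiler names may need adjusting for your machine):

# build the bundled charm++ with its plain network layer (no MPI)
tar xf charm-6.4.0.tar
cd charm-6.4.0
./build charm++ net-linux-x86_64 --with-production
cd ..
# point the NAMD build at that charm++ arch and compile
./config Linux-x86_64-g++ --charm-arch net-linux-x86_64
cd Linux-x86_64-g++
make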

Norman Geist.

> -----Original Message-----
> From: Michael Galloway [mailto:gallowaymd_at_ornl.gov]
> Sent: Monday, September 17, 2012 14:17
> To: Norman Geist
> Cc: Namd Mailing List
> Subject: Re: Re: namd-l: namd 2.9 not going so well
>
> i did check those parameters, they were set as required:
>
> [root_at_cmbcluster ~]# fornodes "cat /sys/class/net/ib0/mtu"
> ==================== node001 ====================
> 65520
> ==================== node002 ====================
> 65520
> ==================== node003 ====================
> 65520
> ==================== node004 ====================
> 65520
> ....
>
> [root_at_cmbcluster ~]# fornodes "cat /sys/class/net/ib0/mode"
> ==================== node001 ====================
> connected
> ==================== node002 ====================
> connected
> ==================== node003 ====================
> connected
> ==================== node004 ====================
> connected
> ==================== node005 ====================
> connected
> ==================== node006 ====================
> connected
> .....
>
> On 09/17/2012 02:00 AM, Norman Geist wrote:
> > Hi Michael,
> >
> > have you set the mode and MTU settings for IPoIB like I pointed out?
> >
> > Just do for ib0
> >
> > echo "connected" > /sys/class/net/ib0/mode
> > echo "65520" > /sys/class/net/ib0/mtu
> >
> > Adjust the commands if your interface has a different name or if you use
> > a shell other than bash. This should give you the expected performance.
> > In the compatibility mode "datagram" (check with cat
> > /sys/class/net/ib0/mode) the performance is only comparable to standard
> > Gigabit Ethernet.
> >
> > Regards
> >
> > Norman Geist.
> >
> >> -----Original Message-----
> >> From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] On
> >> Behalf Of Michael Galloway
> >> Sent: Thursday, September 13, 2012 17:30
> >> To: namd-l
> >> Subject: namd-l: namd 2.9 not going so well
> >>
> >> ok, i've been experimenting with namd built with openmpi and running
> >> binaries over IPoIB and it's not going so well.
> >>
> >> my namd 2.9 built with gcc 4.7.1 and the apoa1 test data on 4 nodes
> >> with 12 cores per node:
> >>
> >> [mgx_at_cmbcluster namd]$ mpirun -np 48 -machinefile nodes
> >> /shared/namd-2.9-gcc471/Linux-x86_64-g++/namd2 apoa1/apoa1.namd
> >>
> >>
> >> Info: Benchmark time: 48 CPUs 0.0357351 s/step 0.413601 days/ns
> >> 309.887 MB memory
> >> TIMING: 500 CPU: 19.0964, 0.0354278/step Wall: 19.0964,
> >> 0.0354278/step, 0 hours remaining, 309.886719 MB of memory in use.
> >> ETITLE: TS BOND ANGLE DIHED
> >> IMPRP ELECT VDW BOUNDARY MISC
> >> KINETIC TOTAL TEMP POTENTIAL TOTAL3
> >> TEMPAVG PRESSURE GPRESSURE VOLUME PRESSAVG
> >> GPRESSAVG
> >>
> >> ENERGY: 500 20974.8940 19756.6571 5724.4523
> >> 179.8271 -337741.4155 23251.1001 0.0000 0.0000
> >> 45359.0760 -222495.4090 165.0039 -267854.4849
> >> -222061.0908 165.0039 -3197.5171 -2425.4142
> >> 921491.4634 -3197.5171 -2425.4142
> >>
> >> WRITING EXTENDED SYSTEM TO OUTPUT FILE AT STEP 500
> >> WRITING COORDINATES TO OUTPUT FILE AT STEP 500
> >> The last position output (seq=-2) takes 0.002 seconds, 315.496 MB of
> >> memory in use
> >> WRITING VELOCITIES TO OUTPUT FILE AT STEP 500
> >> The last velocity output (seq=-2) takes 0.002 seconds, 315.496 MB of
> >> memory in use
> >> ====================================================
> >>
> >> WallClock: 20.894575 CPUTime: 20.894575 Memory: 315.496094 MB
> >> End of program
> >>
> >> same dataset, same node/processor count for the NAMD_2.9_Linux-x86_64
> >> binary run over IPoIB:
> >>
> >> [mgx_at_cmbcluster NAMD_2.9_Linux-x86_64]$ ./charmrun ./namd2 +p48
> >> ++nodelist ./nodelist ++remote-shell ssh apoa1/apoa1.namd
> >>
> >> Info: Benchmark time: 48 CPUs 0.278276 s/step 3.22078 days/ns
> >> 54.4712 MB memory
> >> TIMING: 500 CPU: 17.0394, 0.0332949/step Wall: 120.797,
> >> 0.246653/step, 0 hours remaining, 54.471237 MB of memory in use.
> >> ETITLE: TS BOND ANGLE DIHED
> >> IMPRP ELECT VDW BOUNDARY MISC
> >> KINETIC TOTAL TEMP POTENTIAL TOTAL3
> >> TEMPAVG PRESSURE GPRESSURE VOLUME PRESSAVG
> >> GPRESSAVG
> >>
> >> ENERGY: 500 20974.8941 19756.6576 5724.4523
> >> 179.8271 -337741.4179 23251.1005 0.0000 0.0000
> >> 45359.0770 -222495.4094 165.0039 -267854.4864
> >> -222061.0912 165.0039 -3197.5171 -2425.4143
> >> 921491.4634 -3197.5171 -2425.4143
> >>
> >> WRITING EXTENDED SYSTEM TO OUTPUT FILE AT STEP 500
> >> WRITING COORDINATES TO OUTPUT FILE AT STEP 500
> >> The last position output (seq=-2) takes 0.003 seconds, 59.168 MB of
> >> memory in use
> >> WRITING VELOCITIES TO OUTPUT FILE AT STEP 500
> >> The last velocity output (seq=-2) takes 0.004 seconds, 59.014 MB of
> >> memory in use
> >> ====================================================
> >>
> >> WallClock: 141.985214 CPUTime: 21.389748 Memory: 59.014664 MB
> >>
> >> i still get random segfaults with my compiled namd :-\
