Re: AW: AW: namd 2.9 not going so well

From: Michael Galloway (gallowaymd_at_ornl.gov)
Date: Tue Sep 18 2012 - 09:47:20 CDT

yes, we are in the process of doing some gromacs scaling tests. will get
back to this issue once the cluster is freed up.

thanks!

-- michael

On 09/17/2012 10:23 AM, Norman Geist wrote:
> So it's ok. Look at the Timing lines, the performance is the same ;) So it
> seems you could stick with the precompiled binaries and IPoIB.
> I wonder why the benchmark time doesn't match the Timing lines. Maybe
> something was interfering; you should repeat this test and make sure nothing
> else is running on the nodes.
>
> Also I wonder why your self-compiled binaries use 6 times more memory than
> the precompiled ones. Maybe you should look for the problem there. What happens
> if you compile without MPI, but with charm++?
>
> Norman Geist.
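For what it's worth, a minimal sketch of the non-MPI route Norman suggests, following the usual NAMD 2.9 build procedure but with the ibverbs Charm++ network layer instead of MPI; the charm version, paths, and compiler target are illustrative:

# Unpack and build the Charm++ that ships with the NAMD 2.9 source,
# using the native InfiniBand (ibverbs) layer, no MPI
tar xf charm-6.4.0.tar
cd charm-6.4.0
./build charm++ net-linux-x86_64 ibverbs --with-production
cd ..

# Build NAMD against that Charm++ architecture
./config Linux-x86_64-g++ --charm-arch net-linux-x86_64-ibverbs
cd Linux-x86_64-g++
make

# The resulting binary is launched with charmrun rather than mpirun, e.g.:
# ./charmrun ./namd2 +p48 ++nodelist ./nodelist apoa1/apoa1.namd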
>
>> -----Original Message-----
>> From: Michael Galloway [mailto:gallowaymd_at_ornl.gov]
>> Sent: Monday, September 17, 2012 14:17
>> To: Norman Geist
>> Cc: Namd Mailing List
>> Subject: Re: AW: namd-l: namd 2.9 not going so well
>>
>> i did check those parameters; they were set as required:
>>
>> [root_at_cmbcluster ~]# fornodes "cat /sys/class/net/ib0/mtu"
>> ==================== node001 ====================
>> 65520
>> ==================== node002 ====================
>> 65520
>> ==================== node003 ====================
>> 65520
>> ==================== node004 ====================
>> 65520
>> ....
>>
>> [root_at_cmbcluster ~]# fornodes "cat /sys/class/net/ib0/mode"
>> ==================== node001 ====================
>> connected
>> ==================== node002 ====================
>> connected
>> ==================== node003 ====================
>> connected
>> ==================== node004 ====================
>> connected
>> ==================== node005 ====================
>> connected
>> ==================== node006 ====================
>> connected
>> .....
>>
>> On 09/17/2012 02:00 AM, Norman Geist wrote:
>>> Hi Michael,
>>>
>>> have you set the mode and mtu settings for IPoIB as I pointed out?
>>>
>>> Just do this for ib0:
>>>
>>> echo "connected" > /sys/class/net/ib0/mode
>>> echo "65520" > /sys/class/net/ib0/mtu
>>>
>>> Adjust the commands if your interface has a different name or you use a
>>> shell other than bash. This should give you the expected performance. In
>>> the compatibility mode "datagram" (check with cat /sys/class/net/ib0/mode)
>>> the performance is only comparable with standard Gigabit Ethernet.
>>>
>>> Regards
>>>
>>> Norman Geist.
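For reference, a rough sketch of pushing Norman's two settings to all nodes with a plain ssh loop and verifying them afterwards; node names are illustrative, and the fornodes wrapper shown earlier in this thread does the same job:

# Assumes passwordless root ssh to the compute nodes; adjust the node list
for n in node001 node002 node003 node004; do
    ssh root@$n 'echo connected > /sys/class/net/ib0/mode;
                 echo 65520 > /sys/class/net/ib0/mtu'
done

# Verify the settings took effect
for n in node001 node002 node003 node004; do
    ssh root@$n 'hostname; cat /sys/class/net/ib0/mode /sys/class/net/ib0/mtu'
done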
>>>
>>>> -----Original Message-----
>>>> From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] On
>>>> Behalf Of Michael Galloway
>>>> Sent: Thursday, September 13, 2012 17:30
>>>> To: namd-l
>>>> Subject: namd-l: namd 2.9 not going so well
>>>>
>>>> ok, i've been experimenting with namd built with openmpi and running
>>>> binaries over IPoIB, and it's not going so well.
>>>>
>>>> my namd 2.9 built with gcc 4.7.1, on the apoa1 test data with 4 nodes
>>>> x 12 cores per node, gives:
>>>>
>>>> [mgx_at_cmbcluster namd]$ mpirun -np 48 -machinefile nodes
>>>> /shared/namd-2.9-gcc471/Linux-x86_64-g++/namd2 apoa1/apoa1.namd
>>>>
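For reference, a minimal sketch of what an Open MPI machinefile for a 4-node x 12-core run like this could look like (node names illustrative; the actual contents of the "nodes" file above are not shown in the thread):

# one line per host, 12 slots each -> 48 ranks total
node001 slots=12
node002 slots=12
node003 slots=12
node004 slots=12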
>>>>
>>>> Info: Benchmark time: 48 CPUs 0.0357351 s/step 0.413601 days/ns 309.887 MB memory
>>>> TIMING: 500 CPU: 19.0964, 0.0354278/step Wall: 19.0964, 0.0354278/step,
>>>> 0 hours remaining, 309.886719 MB of memory in use.
>>>> ETITLE: TS BOND ANGLE DIHED
>>>> IMPRP ELECT VDW BOUNDARY MISC
>>>> KINETIC TOTAL TEMP POTENTIAL TOTAL3
>>>> TEMPAVG PRESSURE GPRESSURE VOLUME PRESSAVG
>>>> GPRESSAVG
>>>>
>>>> ENERGY: 500 20974.8940 19756.6571 5724.4523
>>>> 179.8271 -337741.4155 23251.1001 0.0000 0.0000
>>>> 45359.0760 -222495.4090 165.0039 -267854.4849
>>>> -222061.0908 165.0039 -3197.5171 -2425.4142
>>>> 921491.4634 -3197.5171 -2425.4142
>>>>
>>>> WRITING EXTENDED SYSTEM TO OUTPUT FILE AT STEP 500
>>>> WRITING COORDINATES TO OUTPUT FILE AT STEP 500
>>>> The last position output (seq=-2) takes 0.002 seconds, 315.496 MB of
>>>> memory in use
>>>> WRITING VELOCITIES TO OUTPUT FILE AT STEP 500
>>>> The last velocity output (seq=-2) takes 0.002 seconds, 315.496 MB of
>>>> memory in use
>>>> ====================================================
>>>>
>>>> WallClock: 20.894575 CPUTime: 20.894575 Memory: 315.496094 MB
>>>> End of program
>>>>
>>>> same dataset and same node/processor count for the precompiled
>>>> NAMD_2.9_Linux-x86_64 binary run over IPoIB:
>>>>
>>>> [mgx_at_cmbcluster NAMD_2.9_Linux-x86_64]$ ./charmrun ./namd2 +p48
>>>> ++nodelist ./nodelist ++remote-shell ssh apoa1/apoa1.namd
>>>>
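For reference, a minimal sketch of the nodelist file charmrun reads via ++nodelist (host names illustrative):

group main
host node001
host node002
host node003
host node004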
>>>> Info: Benchmark time: 48 CPUs 0.278276 s/step 3.22078 days/ns 54.4712 MB memory
>>>> TIMING: 500 CPU: 17.0394, 0.0332949/step Wall: 120.797, 0.246653/step,
>>>> 0 hours remaining, 54.471237 MB of memory in use.
>>>> ETITLE: TS BOND ANGLE DIHED
>>>> IMPRP ELECT VDW BOUNDARY MISC
>>>> KINETIC TOTAL TEMP POTENTIAL TOTAL3
>>>> TEMPAVG PRESSURE GPRESSURE VOLUME PRESSAVG
>>>> GPRESSAVG
>>>>
>>>> ENERGY: 500 20974.8941 19756.6576 5724.4523
>>>> 179.8271 -337741.4179 23251.1005 0.0000 0.0000
>>>> 45359.0770 -222495.4094 165.0039 -267854.4864
>>>> -222061.0912 165.0039 -3197.5171 -2425.4143
>>>> 921491.4634 -3197.5171 -2425.4143
>>>>
>>>> WRITING EXTENDED SYSTEM TO OUTPUT FILE AT STEP 500
>>>> WRITING COORDINATES TO OUTPUT FILE AT STEP 500
>>>> The last position output (seq=-2) takes 0.003 seconds, 59.168 MB of
>>>> memory in use
>>>> WRITING VELOCITIES TO OUTPUT FILE AT STEP 500
>>>> The last velocity output (seq=-2) takes 0.004 seconds, 59.014 MB of
>>>> memory in use
>>>> ====================================================
>>>>
>>>> WallClock: 141.985214 CPUTime: 21.389748 Memory: 59.014664 MB
>>>>
>>>> i still get random segfaults with my compiled namd :-\
