Re: LES very slow

From: Norman Geist (norman.geist_at_uni-greifswald.de)
Date: Thu Mar 07 2013 - 00:47:59 CST

OK, it's not surprising that using additional features increases the runtime.
Nevertheless, from that baseline it should still scale, I think, or at least
scale better. The settings you posted for the ib0 network show a very low MTU;
maybe ask your admins whether they can increase it to 65520, which always
improved scaling a lot when I tested it. Or use an ibverbs binary of NAMD.
It seems that LES needs a lot more bandwidth.

 

Additionally, to improve scaling, try adding +idlepoll to the namd2 command
line, which sometimes gives a very nice gain across multiple nodes because it
decreases latency. It also cannot hurt the timing, so it is safe to use and
does not have to be checked for being faster every time.

 

Norman Geist.

 

From: Siri Søndergaard [mailto:siris2501_at_gmail.com]
Sent: Thursday, 7 March 2013 04:51
To: Norman Geist
Subject: Re: namd-l: LES very slow

 

Hi Norman,

 

One thing we have realized that may be important is that LES seems to be
performing poorly even on a single CPU, which may suggest a problem with
(our?) setup of LES in NAMD rather than network issues. For example if we
take our simulation system (28084 atoms) we are using for LES and run it on
a single CPU we get the following benchmark time:

 

Info: Benchmark time: 1 CPUs 1.16222 s/step 13.4516 days/ns 392.777 MB memory

 

If we take the equivalent system with only one copy of the dye rather than
20 (24702 atoms in total) and run it without LES we get the following:

 

Info: Benchmark time: 1 CPUs 0.35795 s/step 2.07147 days/ns 211.812 MB memory

 

So even on a single CPU, the use of LES is slowing the run down
considerably. The issues with the scaling on multiple CPUs may stem from the
same problem that causes this massive slow down on a single CPU.
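
To put numbers on this comparison, here is a minimal Python sketch that parses
the two benchmark lines quoted above. The names parse_benchmark and
steps_per_ns are made up for illustration. The steps-per-ns value is only
inferred from the reported s/step and days/ns figures; it is not read from the
configuration files.

```python
import re

# The two "Benchmark time" lines quoted above, each joined onto one line.
LES_LINE = "Info: Benchmark time: 1 CPUs 1.16222 s/step 13.4516 days/ns 392.777 MB memory"
PLAIN_LINE = "Info: Benchmark time: 1 CPUs 0.35795 s/step 2.07147 days/ns 211.812 MB memory"

PATTERN = re.compile(
    r"Benchmark time:\s+(\d+)\s+CPUs\s+([\d.]+)\s+s/step\s+([\d.]+)\s+days/ns"
)


def parse_benchmark(line):
    """Return (cpus, seconds_per_step, days_per_ns) from a benchmark line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("not a benchmark line: " + line)
    return int(match.group(1)), float(match.group(2)), float(match.group(3))


def steps_per_ns(seconds_per_step, days_per_ns):
    """Steps needed for 1 ns, as implied by the two reported figures."""
    return days_per_ns * 86400.0 / seconds_per_step


_, les_step, les_days = parse_benchmark(LES_LINE)
_, plain_step, plain_days = parse_benchmark(PLAIN_LINE)

print("per-step slowdown: %.2fx" % (les_step / plain_step))
print("per-ns slowdown:   %.2fx" % (les_days / plain_days))
print("implied steps/ns with LES:    %.0f" % steps_per_ns(les_step, les_days))
print("implied steps/ns without LES: %.0f" % steps_per_ns(plain_step, plain_days))
```

With these two lines the per-step cost is roughly 3.2 times higher with LES,
while the days/ns figure is roughly 6.5 times higher. The logs imply about
1,000,000 steps per ns for the LES run and about 500,000 for the other one,
which would be consistent with a 1 fs versus a 2 fs timestep, so part of the
gap in days/ns may come from the timestep rather than from LES itself.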

 

As for your network question, I asked the people running the supercomputer and
they said the nodes communicate via an InfiniBand connection.

 

 

 

2013/3/6 Norman Geist <norman.geist_at_uni-greifswald.de>

Hi again Siri,

 

which of these network connections do you use? If you do not know, look at
the machinefile/nodelist from your queuing system and try to find out which
network is used between the nodes when they resolve each other via the
hostnames listed there. The easiest way to do this is to log on to one node
and ping another node via the hostname used in the machinefile; this should
show which IP answers.
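
Once the ping shows which IP answers, the address can be matched against the
interfaces of the node. A minimal Python sketch, assuming the subnets from the
ifconfig output quoted further down in this thread; the table and the function
names which_network and network_for_host are illustrative, not part of any
tool.

```python
import ipaddress
import socket

# Subnets taken from the ifconfig output quoted further down in this thread.
# ib0 and ib1 share one /16, so they cannot be told apart by subnet alone.
NETWORKS = {
    "Ethernet (eth0)": ipaddress.ip_network("10.2.0.0/16"),
    "InfiniBand IPoIB (ib0/ib1)": ipaddress.ip_network("172.16.0.0/16"),
    "Ethernet (eth4)": ipaddress.ip_network("202.8.34.192/27"),
}


def which_network(ip_text):
    """Name the network an IPv4 address belongs to, or 'unknown'."""
    address = ipaddress.ip_address(ip_text)
    for name, network in NETWORKS.items():
        if address in network:
            return name
    return "unknown"


def network_for_host(hostname):
    """Resolve a hostname the way the node would, then classify the address."""
    return which_network(socket.gethostbyname(hostname))


print(which_network("10.2.0.19"))
print(which_network("172.16.0.19"))
```

On a compute node, network_for_host() can be called with a hostname copied
from the machinefile. If the answer is the Ethernet network, the job is most
likely not running over InfiniBand.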

 

Norman Geist.

 

From: Siri Søndergaard [mailto:siris2501_at_gmail.com]
Sent: Tuesday, 5 March 2013 08:14
To: Norman Geist
Subject: Re: namd-l: LES very slow

 

Yes, I'm using a queuing system.

 

The output is:

eth0 Link encap:Ethernet HWaddr 98:4B:E1:74:EE:0C
          inet addr:10.2.0.19 Bcast:10.2.255.255 Mask:255.255.0.0
          inet6 addr: fe80::9a4b:e1ff:fe74:ee0c/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:43247 errors:0 dropped:0 overruns:0 frame:0
          TX packets:16604 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:7073185 (6.7 MiB) TX bytes:2108682 (2.0 MiB)
          Interrupt:30 Memory:ec000000-ec012800

eth1 Link encap:Ethernet HWaddr 98:4B:E1:74:EE:0E
          BROADCAST MULTICAST MTU:1500 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
          Interrupt:37 Memory:ea000000-ea012800

eth2 Link encap:Ethernet HWaddr 98:4B:E1:74:EE:24
          BROADCAST MULTICAST MTU:1500 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
          Interrupt:31 Memory:f0000000-f0012800

eth3 Link encap:Ethernet HWaddr 98:4B:E1:74:EE:26
          BROADCAST MULTICAST MTU:1500 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
          Interrupt:39 Memory:ee000000-ee012800

eth4 Link encap:Ethernet HWaddr 28:92:4A:D1:77:08
          inet addr:202.8.34.206 Bcast:202.8.34.223 Mask:255.255.255.224
          inet6 addr: fe80::2a92:4aff:fed1:7708/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:295569 errors:0 dropped:0 overruns:0 frame:0
          TX packets:291870 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1545433802 (1.4 GiB) TX bytes:37439917 (35.7 MiB)
          Interrupt:67

eth5 Link encap:Ethernet HWaddr 28:92:4A:D1:77:0C
          BROADCAST MULTICAST MTU:1500 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
          Interrupt:71

Ifconfig uses the ioctl access method to get the full address information, which limits hardware addresses to 8 bytes.
Because Infiniband address has 20 bytes, only the first 8 bytes are displayed correctly.
Ifconfig is obsolete! For replacement check ip.
ib0 Link encap:InfiniBand HWaddr 80:00:00:03:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
          inet addr:172.16.0.19 Bcast:172.16.255.255 Mask:255.255.0.0
          inet6 addr: fe80::211:7500:79:4810/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:2770 errors:0 dropped:0 overruns:0 frame:0
          TX packets:32 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:256
          RX bytes:170764 (166.7 KiB) TX bytes:4539 (4.4 KiB)

Ifconfig uses the ioctl access method to get the full address information, which limits hardware addresses to 8 bytes.
Because Infiniband address has 20 bytes, only the first 8 bytes are displayed correctly.
Ifconfig is obsolete! For replacement check ip.
ib1 Link encap:InfiniBand HWaddr 80:00:00:05:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
          inet addr:172.16.10.19 Bcast:172.16.255.255 Mask:255.255.0.0
          inet6 addr: fe80::211:7500:79:4811/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:2754 errors:0 dropped:0 overruns:0 frame:0
          TX packets:16 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:256
          RX bytes:169867 (165.8 KiB) TX bytes:1060 (1.0 KiB)

lo Link encap:Local Loopback
          inet addr:127.0.0.1 Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING MTU:16436 Metric:1
          RX packets:28 errors:0 dropped:0 overruns:0 frame:0
          TX packets:28 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:2238 (2.1 KiB) TX bytes:2238 (2.1 KiB)

 

 

 

2013/3/5 Norman Geist <norman.geist_at_uni-greifswald.de>

OK, so let's try something. Could you please post the output of
"/sbin/ifconfig -a"? Also, are you using a queuing system for submitting
your jobs?

 

Regarding the output you posted yesterday, connected mode is OK, but the MTU
should be 65520 IMHO. If you are already using the right network, which is
what I want to find out with the questions above, changing this setting should
improve scaling a lot.
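
The same check can be scripted. A minimal Python sketch, assuming a Linux node
whose IPoIB interface exposes the mode and mtu files under
/sys/class/net/ib0/ (the files matched by the m* glob used earlier); the
function names and the sysfs_root parameter are illustrative.

```python
from pathlib import Path

RECOMMENDED_MTU = 65520  # value suggested above for connected mode


def read_ipoib_settings(interface="ib0", sysfs_root="/sys/class/net"):
    """Return (mode, mtu) for an IPoIB interface as reported by sysfs."""
    base = Path(sysfs_root) / interface
    mode = (base / "mode").read_text().strip()
    mtu = int((base / "mtu").read_text().strip())
    return mode, mtu


def describe(mode, mtu):
    """Summarise whether the settings match the suggestion above."""
    if mode != "connected":
        return "mode is %s, not connected" % mode
    if mtu < RECOMMENDED_MTU:
        return "connected mode, but MTU is only %d (suggested: %d)" % (
            mtu, RECOMMENDED_MTU)
    return "connected mode with MTU %d" % mtu


if __name__ == "__main__":
    try:
        print(describe(*read_ipoib_settings("ib0")))
    except OSError as error:
        # No ib0 on this machine, or the files are not readable.
        print("could not read ib0 settings:", error)
```

For the values posted so far (connected, 1500), describe() reports connected
mode with an MTU below the suggested value. Changing the MTU itself is a job
for the admins.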

 

Norman Geist.

 

From: Siri Søndergaard [mailto:siris2501_at_gmail.com]
Sent: Monday, 4 March 2013 09:09
To: Norman Geist
Subject: Re: namd-l: LES very slow

 

I don't know what InfiniBand is. I don't know about ibverbs, MPI or IPoIB
either, but as the run file contains "mpirun namd2" I'm guessing it is the MPI
option?

 

The output of cat /sys/class/net/ib0/m* is:

connected
1500

 

I really appreciate your help! Thanks a lot!

 

2013/3/4 Norman Geist <norman.geist_at_uni-greifswald.de>

Hi Siri,

 

did you use the InfiniBand network in this test?

 

Well, the numbers are not very accurate, but there shouldn't be too much
difference here. Do you use an ibverbs MPI or IPoIB? What's the output of
"cat /sys/class/net/ib0/m*"?

 

Norman Geist.

 

From: Siri Søndergaard [mailto:siris2501_at_gmail.com]
Sent: Thursday, 28 February 2013 23:24
To: Norman Geist
Subject: Re: namd-l: LES very slow

 

This is the head of the log file:

Charm++> Running on MPI version: 2.1
Charm++> level of thread support used: MPI_THREAD_SINGLE (desired: MPI_THREAD_SINGLE)
Charm++> Running on non-SMP mode
Converse/Charm++ Commit ID: v6.4.0-beta1-0-g5776d21
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 1 unique compute nodes (12-way SMP).
Charm++> cpu topology info is gathered in 0.001 seconds.
Info: NAMD 2.9 for Linux-x86_64-MPI
Info:
Info: Please visit http://www.ks.uiuc.edu/Research/namd/

 

With 2 CPUs on one node the time is now 11-12 days per ns.
With 1 CPU on each of two nodes the time is 13 days per ns.
With 1 CPU on one node the time is 18 days per ns today.

 

 

 

2013/2/28 Norman Geist <norman.geist_at_uni-greifswald.de>

Hi again Siri,

 

OK, so your basic setup is fine. But what about LES? I just can't imagine a
reason why this kind of simulation should be limited in scaling. You are right
about the 255 copies; it seems I had an older manual. Can we see the output of
one of these LES simulations (the head of the log)? Also, as a quick test of
your node interconnect, could you try the following with your LES simulation:

 

1. 2 Cores @ 1 Node = 2 Processes

2. 2 Cores @ 2 nodes = 2 Processes

 

That way we can see whether using your network makes a big difference.
Nevertheless, the scaling on one node should be better than it is.

 

Maybe the developers can jump in now and tell us whether there are known
scaling issues when using LES.

 

Norman Geist.

 

From: Siri Søndergaard [mailto:siris2501_at_gmail.com]
Sent: Wednesday, 27 February 2013 23:22

10.2.1.19
To: Norman Geist
Subject: Re: namd-l: LES very slow

 

Hi

 

The manual for NAMD 2.9 says up to 255 copies are supported. When I run a
normal simulation with 1 CPU and with 12 CPUs, the simulation time is estimated
to be 2.2 and 0.4 days per ns, respectively. If I increase the number to 24 (2
nodes times 12 CPUs) the estimated time is 0.2 days per ns. If I do the same
for the LES system I get no decrease in simulation time.
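
Expressed as speedup and parallel efficiency, the figures look as follows. This
is a minimal Python sketch; the function names are made up for illustration,
the normal MD numbers are the rounded estimates given above, and the LES
numbers are the approximate values from the first post in this thread (about
9 days per ns on 1 CPU and about 4 on 12 CPUs).

```python
def speedup(serial_days_per_ns, parallel_days_per_ns):
    """How many times faster the parallel run is than the 1-CPU run."""
    return serial_days_per_ns / parallel_days_per_ns


def efficiency(serial_days_per_ns, parallel_days_per_ns, cpus):
    """Speedup divided by the number of CPUs (1.0 would be ideal scaling)."""
    return speedup(serial_days_per_ns, parallel_days_per_ns) / cpus


# CPUs -> estimated days per ns. All values are rounded estimates.
NORMAL_MD = {1: 2.2, 12: 0.4, 24: 0.2}
LES_MD = {1: 9.0, 12: 4.0}

for label, runs in (("normal MD", NORMAL_MD), ("LES", LES_MD)):
    serial = runs[1]
    for cpus in sorted(runs):
        print("%-9s %2d CPUs: speedup %4.1fx, efficiency %3.0f%%" % (
            label, cpus,
            speedup(serial, runs[cpus]),
            100.0 * efficiency(serial, runs[cpus], cpus)))
```

With these inputs the normal simulation reaches a speedup of about 5.5 on 12
CPUs and about 11 on 24, while the LES system only reaches about 2.25 on 12
CPUs. The inputs are rounded to one or two digits, so the efficiencies are only
rough.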

2013/2/27 Norman Geist <norman.geist_at_uni-greifswald.de>

Hi Siri,

 

so far, I couldn't find a reason for your problem in your hardware. I don't
know what LES is actually doing, but the manual says that NAMD only
supports up to 15 copies.

Nevertheless, I can't see a reason why this kind of computation should harm
the good scaling of NAMD. Does "normal" MD scale better? Then we could tell
whether it is a general problem of your setup or whether it is due to LES.

 

Regards

Norman Geist.

 

From: Siri Søndergaard [mailto:siris2501_at_gmail.com]
Sent: Wednesday, 27 February 2013 00:11
To: Norman Geist
Subject: Re: namd-l: LES very slow

 

I've attached the files... I hope this is what you were looking for.

 

2013/2/26 Norman Geist <norman.geist_at_uni-greifswald.de>

Hi Siri,

 

to help you, we could use some information about the hardware you use.
Assuming you use Linux, please supply the output of the following
commands:

 

1. cat /proc/cpuinfo

2. lspci

 

This should be enough for the beginning.

 

PS: If you are not using Linux, please provide information about your hardware
in some other way.

 

Norman Geist.

 

From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] On behalf
of Siri Søndergaard
Sent: Tuesday, 26 February 2013 01:00
To: namd-l_at_ks.uiuc.edu
Subject: namd-l: LES very slow

 

Hi

 

I'm trying to run LES on a system of ~30,000 atoms. I'm using 20 copies of
each of two dyes attached to DNA. The problem is that when I extend the
simulation to more than one CPU, the performance does not increase accordingly.
An increase from one to 12 CPUs only gives a decrease in simulation time
from ~9 days to ~4 days per ns. Does anybody know how to solve this?

 

Best regards, Siri

 

 

 

 

 

 

This archive was generated by hypermail 2.1.6 : Tue Dec 31 2013 - 23:23:02 CST