RE: SMP NAMD reports threads greater than physical cores, even when distributed to other nodes

From: Jim Phillips (jim_at_ks.uiuc.edu)
Date: Sat Dec 19 2015 - 21:39:50 CST

I assume the loopback address is excluded by this code:

  ... && ! (iface->ifa_flags & IFF_LOOPBACK)

It's the eth0 and ib0 addresses that trigger the hostname lookup, but even
our workstations have an extra virbr0 address, so I suspect the hostname-based
lookup is happening on most clusters now.
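
Roughly, the selection logic works like the standalone sketch below. This is
only a simplified illustration, not the actual sockRoutines.c code (the real
function has more checks): skip loopback interfaces, and if more than one
candidate address remains, fall back to resolving the hostname.

  #include <stdio.h>
  #include <unistd.h>
  #include <ifaddrs.h>
  #include <net/if.h>
  #include <netdb.h>
  #include <netinet/in.h>
  #include <arpa/inet.h>

  int main(void) {
    struct ifaddrs *ifaces, *iface;
    struct in_addr found = { 0 };
    int nfound = 0;

    if (getifaddrs(&ifaces) == 0) {
      for (iface = ifaces; iface; iface = iface->ifa_next) {
        if (iface->ifa_addr && iface->ifa_addr->sa_family == AF_INET
            && ! (iface->ifa_flags & IFF_LOOPBACK)) {  /* loopback excluded here */
          found = ((struct sockaddr_in *) iface->ifa_addr)->sin_addr;
          ++nfound;
        }
      }
      freeifaddrs(ifaces);
    }

    if (nfound == 1) {
      printf("unique interface address: %s\n", inet_ntoa(found));
    } else {
      /* eth0 + ib0 (+ virbr0, etc.): more than one candidate, so fall back
         to a hostname lookup, which returns whatever /etc/hosts or DNS
         lists first, possibly 127.0.0.1. */
      char host[256];
      gethostname(host, sizeof host);
      struct hostent *he = gethostbyname(host);
      if (he && he->h_addrtype == AF_INET && he->h_addr_list[0])
        printf("hostname %s resolves to: %s\n", host,
               inet_ntoa(*(struct in_addr *) he->h_addr_list[0]));
    }
    return 0;
  }

On a node whose /etc/hosts lists its own hostname on the 127.0.0.1 line, that
fallback returns the loopback address, so all processes appear to be on the
same physical node.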

I've added this as a bug at
https://charm.cs.illinois.edu/redmine/issues/931

Jim

On Fri, 18 Dec 2015, Tom Coles wrote:

> Thanks for your reply again - following your suggestion to look at the skt_my_ip() function showed me what was wrong.
>
> We have the node hostnames added to the 127.0.0.1 line in /etc/hosts, e.g.:
> 127.0.0.1 localhost.localdomain localhost node001
>
> I think what happens is as follows:
> In skt_my_ip, the main loop correctly ignores the loopback 127.0.0.1 address and finds that we have two other IP addresses per node (one for eth0 and one for ib0). Because there is more than one IP address, it discards the result of the loop and instead tries to get a single address from skt_lookup_ip. However, that function simply returns the first IP address listed for the hostname, without checking whether it is the loopback address...and in my case it is the loopback address.
>
> I have resolved my problem by removing the node names from the 127.0.0.1 line in the /etc/hosts file, but perhaps it would be a good idea to modify skt_lookup_ip to ignore the loopback IP address if another is available?
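>
> Something along these lines is what I had in mind - just a rough sketch to illustrate the idea, not the actual skt_lookup_ip code (the host name "node001" below is only an example):
>
>   #include <stdio.h>
>   #include <string.h>
>   #include <netdb.h>
>   #include <netinet/in.h>
>   #include <arpa/inet.h>
>
>   /* Prefer a non-loopback address from the hostname lookup and only
>      fall back to the first entry if nothing else is listed. */
>   static struct in_addr lookup_ip_skip_loopback(const char *name) {
>     struct in_addr result;
>     result.s_addr = htonl(INADDR_NONE);         /* "no address found" */
>     struct hostent *he = gethostbyname(name);   /* resolve the hostname */
>     if (he && he->h_addrtype == AF_INET && he->h_addr_list[0]) {
>       memcpy(&result, he->h_addr_list[0], sizeof result); /* default: first entry */
>       for (char **a = he->h_addr_list; *a; ++a) {
>         struct in_addr addr;
>         memcpy(&addr, *a, sizeof addr);
>         if ((ntohl(addr.s_addr) >> 24) != 127) { /* 127/8 is loopback */
>           result = addr;                         /* prefer a non-loopback entry */
>           break;
>         }
>       }
>     }
>     return result;
>   }
>
>   int main(void) {
>     struct in_addr ip = lookup_ip_skip_loopback("node001");
>     printf("node001 -> %s\n", inet_ntoa(ip));
>     return 0;
>   }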
>
> Thanks again,
> Tom
>
> -----Original Message-----
> From: Jim Phillips [mailto:jim_at_ks.uiuc.edu]
> Sent: 18 December 2015 15:38
> To: Tom Coles <tcoles_at_mit.edu>
> Cc: namd-l_at_ks.uiuc.edu
> Subject: RE: namd-l: SMP NAMD reports threads greater than physical cores, even when distributed to other nodes
>
>
> Thanks. It looks like physical node detection is failing on your setup:
>
> Charm++> Running on 1 unique compute nodes (8-way SMP).
> ..
> Info: Running on 28 processors, 4 nodes, 1 physical nodes.
>
> Physical node detection is based on the IP address of the node, as determined by skt_my_ip() in charm/src/util/sockRoutines.c.
>
> What do ifconfig and hostname return when run on your nodes?
>
> You may also want to try the released verbs-smp and ibverbs-smp binaries to see if there is something different about your build environment.
>
> Jim
>
>
> On Fri, 18 Dec 2015, Tom Coles wrote:
>
>> Thanks for replying quickly. It's important to stress that it does appear to be launching the processes on the correct nodes, as I have logged in and checked them individually to confirm that namd2 is running. However, it prints the warning anyway, which concerns me because, for example, the load balancer might not work correctly if it doesn't know where the threads actually are. It also says "Running on 28 processors, 4 nodes, 1 physical nodes" and "1 unique compute nodes", even though it is definitely running on all four physical nodes.
>>
>> I removed the ++cpus 8 from the nodelist file. I ran charmrun from a fifth node (called spl) with the nodelist below:
>> group main
>> node001
>> node002
>> node003
>> node004
>>
>> The full startup phase of the logfile is here:
>>
>> Charmrun> scalable start enabled.
>> Charmrun> charmrun started...
>> Charmrun> using /bigdisk/tcoles/benchmark/NAMD4913/nodelist as nodesfile
>> Charmrun> adding client 0: "node001", IP:172.16.0.1
>> Charmrun> adding client 1: "node001", IP:172.16.0.1
>> Charmrun> adding client 2: "node001", IP:172.16.0.1
>> Charmrun> adding client 3: "node001", IP:172.16.0.1
>> Charmrun> adding client 4: "node001", IP:172.16.0.1
>> Charmrun> adding client 5: "node001", IP:172.16.0.1
>> Charmrun> adding client 6: "node001", IP:172.16.0.1
>> Charmrun> adding client 7: "node002", IP:172.16.0.2
>> Charmrun> adding client 8: "node002", IP:172.16.0.2
>> Charmrun> adding client 9: "node002", IP:172.16.0.2
>> Charmrun> adding client 10: "node002", IP:172.16.0.2
>> Charmrun> adding client 11: "node002", IP:172.16.0.2
>> Charmrun> adding client 12: "node002", IP:172.16.0.2
>> Charmrun> adding client 13: "node002", IP:172.16.0.2
>> Charmrun> adding client 14: "node003", IP:172.16.0.3
>> Charmrun> adding client 15: "node003", IP:172.16.0.3
>> Charmrun> adding client 16: "node003", IP:172.16.0.3
>> Charmrun> adding client 17: "node003", IP:172.16.0.3
>> Charmrun> adding client 18: "node003", IP:172.16.0.3
>> Charmrun> adding client 19: "node003", IP:172.16.0.3
>> Charmrun> adding client 20: "node003", IP:172.16.0.3
>> Charmrun> adding client 21: "node004", IP:172.16.0.4
>> Charmrun> adding client 22: "node004", IP:172.16.0.4
>> Charmrun> adding client 23: "node004", IP:172.16.0.4
>> Charmrun> adding client 24: "node004", IP:172.16.0.4
>> Charmrun> adding client 25: "node004", IP:172.16.0.4
>> Charmrun> adding client 26: "node004", IP:172.16.0.4
>> Charmrun> adding client 27: "node004", IP:172.16.0.4
>> Charmrun> Charmrun = spl, port = 34245
>> Charmrun> IBVERBS version of charmrun
>> start_nodes_rsh
>> Charmrun> Sending "0 spl 34245 14063 0" to client 0.
>> Charmrun> find the node program "/home/tcoles/bin/namd2" at "/bigdisk/tcoles/benchmark/NAMD4913" for 0.
>> Charmrun> Starting ssh node001 -l tcoles -o KbdInteractiveAuthentication=no -o PasswordAuthentication=no -o NoHostAuthenticationForLocalhost=yes /bin/bash -f
>> Charmrun> remote shell (node001:0) started
>> Charmrun> Sending "1 spl 34245 14063 0" to client 1.
>> Charmrun> find the node program "/home/tcoles/bin/namd2" at "/bigdisk/tcoles/benchmark/NAMD4913" for 7.
>> Charmrun> Starting ssh node002 -l tcoles -o KbdInteractiveAuthentication=no -o PasswordAuthentication=no -o NoHostAuthenticationForLocalhost=yes /bin/bash -f
>> Charmrun> remote shell (node002:7) started
>> Charmrun> Sending "2 spl 34245 14063 0" to client 2.
>> Charmrun> find the node program "/home/tcoles/bin/namd2" at "/bigdisk/tcoles/benchmark/NAMD4913" for 14.
>> Charmrun> Starting ssh node003 -l tcoles -o KbdInteractiveAuthentication=no -o PasswordAuthentication=no -o NoHostAuthenticationForLocalhost=yes /bin/bash -f
>> Charmrun> remote shell (node003:14) started
>> Charmrun> Sending "3 spl 34245 14063 0" to client 3.
>> Charmrun> find the node program "/home/tcoles/bin/namd2" at "/bigdisk/tcoles/benchmark/NAMD4913" for 21.
>> Charmrun> Starting ssh node004 -l tcoles -o KbdInteractiveAuthentication=no -o PasswordAuthentication=no -o NoHostAuthenticationForLocalhost=yes /bin/bash -f
>> Charmrun> remote shell (node004:21) started
>> Charmrun> node programs all started
>> Charmrun remote shell(node003.14)> remote responding...
>> Charmrun remote shell(node004.21)> remote responding...
>> Charmrun remote shell(node002.7)> remote responding...
>> Charmrun remote shell(node001.0)> remote responding...
>> Charmrun remote shell(node004.21)> starting node-program...
>> Charmrun remote shell(node003.14)> starting node-program...
>> Charmrun remote shell(node002.7)> starting node-program...
>> Charmrun remote shell(node001.0)> starting node-program...
>> Charmrun remote shell(node001.0)> remote shell phase successful.
>> Charmrun remote shell(node004.21)> remote shell phase successful.
>> Charmrun remote shell(node003.14)> remote shell phase successful.
>> Charmrun remote shell(node002.7)> remote shell phase successful.
>> Charmrun> Waiting for 0-th client to connect.
>> Charmrun> Waiting for 1-th client to connect.
>> Charmrun> Waiting for 2-th client to connect.
>> Charmrun> Waiting for 3-th client to connect.
>> Charmrun> All clients connected.
>> Charmrun> IP tables sent.
>> Charmrun> node programs all connected
>> Charmrun> started all node programs in 1.108 seconds.
>> Charm++> Running in SMP mode: numNodes 4, 7 worker threads per process
>> Charm++> The comm. thread both sends and receives messages
>> Charm++> Using recursive bisection (scheme 3) for topology aware partitions
>> Converse/Charm++ Commit ID: v6.7.0-rc1-19-g6f8f9d4-namd-charm-6.7.0-build-2015-Dec-04-47243
>> Charm++> scheduler running in netpoll mode.
>> CharmLB> Load balancer assumes all CPUs are same.
>> Charm++> Running on 1 unique compute nodes (8-way SMP).
>> Charm++> cpu topology info is gathered in 0.003 seconds.
>>
>> Charm++> Warning: the number of SMP threads (32) is greater than the number of physical cores (8), so threads will sleep while idling. Use +CmiSpinOnIdle or +CmiSleepOnIdle to control this directly.
>>
>> Info: NAMD 2.11b2 for Linux-x86_64-verbs-smp
>> Info:
>> Info: Please visit http://www.ks.uiuc.edu/Research/namd/
>> Info: for updates, documentation, and support information.
>> Info:
>> Info: Please cite Phillips et al., J. Comp. Chem. 26:1781-1802 (2005)
>> Info: in all publications reporting results obtained with NAMD.
>> Info:
>> Info: Based on Charm++/Converse 60601 for verbs-linux-x86_64-gfortran-smp-gcc
>> Info: Built Thu Dec 17 10:21:50 EST 2015 by tcoles on spl
>> Info: 1 NAMD 2.11b2 Linux-x86_64-verbs-smp 28 node001 tcoles
>> Info: Running on 28 processors, 4 nodes, 1 physical nodes.
>> Info: CPU topology information available.
>> Info: Charm++/Converse parallel runtime startup completed at 0.012979 s
>> CkLoopLib is used in SMP with a simple dynamic scheduling (converse-level notification) but not using node-level queue
>> Info: 643.699 MB of memory in use based on /proc/self/stat
>> Info: Configuration file is namdInput.txt
>> Info: Working in the current directory /bigdisk/tcoles/benchmark/NAMD4913
>> TCL: Suspending until startup complete.
>> Info: SIMULATION PARAMETERS:
>> Info: TIMESTEP 0.1
>> Info: NUMBER OF STEPS 500
>> Info: STEPS PER CYCLE 100
>> Info: LOAD BALANCER Centralized
>> Info: LOAD BALANCING STRATEGY New Load Balancers -- DEFAULT
>> Info: LDB PERIOD 20000 steps
>> Info: FIRST LDB TIMESTEP 500
>> Info: LAST LDB TIMESTEP -1
>> Info: LDB BACKGROUND SCALING 1
>> Info: HOM BACKGROUND SCALING 1
>> Info: MIN ATOMS PER PATCH 40
>> Info: INITIAL TEMPERATURE 300
>> Info: CENTER OF MASS MOVING INITIALLY? NO
>> Info: DIELECTRIC 1
>> Info: EXCLUDE SCALED ONE-FOUR
>> Info: 1-4 ELECTROSTATICS SCALED BY 0.833333
>> Info: MODIFIED 1-4 VDW PARAMETERS WILL BE USED
>> Info: NO DCD TRAJECTORY OUTPUT
>> Info: NO EXTENDED SYSTEM TRAJECTORY OUTPUT
>> Info: NO VELOCITY DCD OUTPUT
>> Info: NO FORCE DCD OUTPUT
>> Info: OUTPUT FILENAME TestOutput
>> Info: BINARY OUTPUT FILES WILL BE USED
>> Info: NO RESTART FILE
>> Info: CUTOFF 10
>> Info: PAIRLIST DISTANCE 10
>> Info: PAIRLIST SHRINK RATE 0.01
>> Info: PAIRLIST GROW RATE 0.01
>> Info: PAIRLIST TRIGGER 0.3
>> Info: PAIRLISTS PER CYCLE 2
>> Info: REQUIRING 1000 PROCESSORS FOR PAIRLISTS
>> Info: PAIRLISTS DISABLED
>> Info: MARGIN 0
>> Info: HYDROGEN GROUP CUTOFF 2.5
>> Info: PATCH DIMENSION 12.5
>> Info: CROSSTERM ENERGY INCLUDED IN DIHEDRAL
>> Info: TIMING OUTPUT STEPS 500
>> Info: USING VERLET I (r-RESPA) MTS SCHEME.
>> Info: C1 SPLITTING OF LONG RANGE ELECTROSTATICS
>> Info: PLACING ATOMS IN PATCHES BY HYDROGEN GROUPS
>> Info: NONBONDED FORCES EVALUATED EVERY 10 STEPS
>> Info: RANDOM NUMBER SEED 12345
>> Info: USE HYDROGEN BONDS? NO
>> Info: COORDINATE PDB 4913AfterStep3.pdb
>> Info: STRUCTURE FILE 4913molXplor.psf
>> Info: PARAMETER file: XPLOR format! (default)
>> Info: PARAMETERS namdParam_EMI_BF4_Liu.params
>> Info: USING ARITHMETIC MEAN TO COMBINE L-J SIGMA PARAMETERS
>> Info: SUMMARY OF PARAMETERS:
>> Info: 10 BONDS
>> Info: 15 ANGLES
>> Info: 20 DIHEDRAL
>> Info: 0 IMPROPER
>> Info: 0 CROSSTERM
>> Info: 10 VDW
>> Info: 0 VDW_PAIRS
>> Info: 0 NBTHOLE_PAIRS
>> Info: TIME FOR READING PSF FILE: 0.932249
>> Info: TIME FOR READING PDB FILE: 0.257703
>> Info:
>> Info: ****************************
>> Info: STRUCTURE SUMMARY:
>> Info: 117912 ATOMS
>> Info: 112999 BONDS
>> Info: 191607 ANGLES
>> Info: 225998 DIHEDRALS
>> Info: 0 IMPROPERS
>> Info: 0 CROSSTERMS
>> Info: 0 EXCLUSIONS
>> Info: 353733 DEGREES OF FREEDOM
>> Info: 63869 HYDROGEN GROUPS
>> Info: 4 ATOMS IN LARGEST HYDROGEN GROUP
>> Info: 63869 MIGRATION GROUPS
>> Info: 4 ATOMS IN LARGEST MIGRATION GROUP
>> Info: TOTAL MASS = 972622 amu
>> Info: TOTAL CHARGE = 0.000553646 e
>> Info: *****************************
>> Info:
>> Info: Entering startup at 1.23359 s, 708.359 MB of memory in use
>> Info: Startup phase 0 took 0.00176811 s, 708.359 MB of memory in use
>> Info: ADDED 481474 IMPLICIT EXCLUSIONS
>> Info: Startup phase 1 took 0.386808 s, 772.359 MB of memory in use
>> Info: NONBONDED TABLE R-SQUARED SPACING: 0.0625
>> Info: NONBONDED TABLE SIZE: 705 POINTS
>> Info: INCONSISTENCY IN FAST TABLE ENERGY VS FORCE: 0.000290475 AT 0.251946
>> Info: ABSOLUTE IMPRECISION IN VDWA TABLE FORCE: 1.00974e-28 AT 9.99687
>> Info: INCONSISTENCY IN VDWA TABLE ENERGY VS FORCE: 0.0040507 AT 0.251946
>> Info: ABSOLUTE IMPRECISION IN VDWB TABLE FORCE: 6.2204e-22 AT 9.99687
>> Info: INCONSISTENCY IN VDWB TABLE ENERGY VS FORCE: 0.00150189 AT 0.251946
>> Info: Startup phase 2 took 0.000987053 s, 772.359 MB of memory in use
>> Info: Startup phase 3 took 0.000255108 s, 773.352 MB of memory in use
>> Info: Startup phase 4 took 0.00227189 s, 773.352 MB of memory in use
>> Info: Startup phase 5 took 0.000298977 s, 773.352 MB of memory in use
>> Info: PATCH GRID IS 9 BY 7 BY 10
>> Info: PATCH GRID IS 1-AWAY BY 1-AWAY BY 1-AWAY
>> Info: REMOVING COM VELOCITY 0.00197023 0.0268866 -0.00385198
>> Info: LARGEST PATCH (607) HAS 368 ATOMS
>> Info: TORUS A SIZE 28 USING 0
>> Info: TORUS B SIZE 1 USING 0
>> Info: TORUS C SIZE 1 USING 0
>> Info: TORUS MINIMAL MESH SIZE IS 1 BY 1 BY 1
>> Info: Placed 100% of base nodes on same physical node as patch
>> Info: Startup phase 6 took 0.125689 s, 773.352 MB of memory in use
>> Info: Startup phase 7 took 0.00393891 s, 773.352 MB of memory in use
>> Info: Startup phase 8 took 0.012778 s, 773.48 MB of memory in use
>> LDB: Central LB being created...
>> Info: Startup phase 9 took 0.00310898 s, 773.48 MB of memory in use
>> Info: CREATING 10255 COMPUTE OBJECTS
>> Info: useSync: 0 useProxySync: 0
>> Info: Startup phase 10 took 0.020287 s, 774.512 MB of memory in use
>> Info: Building spanning tree ... send: 1 recv: 0 with branch factor 4
>> Info: Startup phase 11 took 0.0013299 s, 774.512 MB of memory in use
>> Info: Startup phase 12 took 0.0002141 s, 775 MB of memory in use
>> Info: Finished startup at 1.79333 s, 775.129 MB of memory in use
>>
>> Thanks for your help.
>>
>> Tom
>>
>> -----Original Message-----
>> From: Jim Phillips [mailto:jim_at_ks.uiuc.edu]
>> Sent: 18 December 2015 01:32
>> To: Tom Coles <tcoles_at_mit.edu>
>> Subject: Re: namd-l: SMP NAMD reports threads greater than physical
>> cores, even when distributed to other nodes
>>
>>
>> Try removing ++cpus 8 from the nodelist file. I'm guessing it is launching all four processes on the first node.
>>
>> Can you send the full log file?
>>
>> Jim
>>
>>
>> On Thu, 17 Dec 2015, Tom Coles wrote:
>>
>>> I am trying to run NAMD in SMP mode with ibverbs. I have tried versions 2.10 and 2.11b2, but it always reports that the total number of threads is greater than the number of physical cores, even though I am asking it to place the threads on different nodes. In fact, this happens even when the node from which I launch charmrun is not in the nodelist and does not receive any PE threads.
>>>
>>> The following message is printed:
>>> Warning: the number of SMP threads (32) is greater than the number of physical cores (8), so threads will sleep while idling. Use +CmiSpinOnIdle or +CmiSleepOnIdle to control this directly.
>>>
>>>
>>> I have four nodes with 8 cores per node, and I know that I need to leave one core per node free for the communications thread. Each node has 8 full physical cores, no HT.
>>>
>>> The command line is:
>>> charmrun namd2 +p28 ++ppn 7 ++nodelist mynodelist ++verbose namdInput
>>>
>>> The mynodelist file contains each node listed only once:
>>> group main
>>> host node001 ++shell ssh ++cpus 8
>>> host node002 ++shell ssh ++cpus 8
>>> host node003 ++shell ssh ++cpus 8
>>> host node004 ++shell ssh ++cpus 8
>>>
>>> The verbose output confirms that it is connected to all four nodes, and I have connected to them with ssh and used ps to confirm that namd2 is running on each.
>>> Charmrun> adding client 0: "node001", IP:127.0.0.1
>>> Charmrun> adding client 1: "node001", IP:127.0.0.1
>>> Charmrun> adding client 2: "node001", IP:127.0.0.1
>>> Charmrun> adding client 3: "node001", IP:127.0.0.1
>>> Charmrun> adding client 4: "node001", IP:127.0.0.1
>>> Charmrun> adding client 5: "node001", IP:127.0.0.1
>>> Charmrun> adding client 6: "node001", IP:127.0.0.1
>>> Charmrun> adding client 7: "node002", IP:172.16.0.2
>>> Charmrun> adding client 8: "node002", IP:172.16.0.2
>>> Charmrun> adding client 9: "node002", IP:172.16.0.2
>>> Charmrun> adding client 10: "node002", IP:172.16.0.2
>>> Charmrun> adding client 11: "node002", IP:172.16.0.2
>>> Charmrun> adding client 12: "node002", IP:172.16.0.2
>>> Charmrun> adding client 13: "node002", IP:172.16.0.2
>>> Charmrun> adding client 14: "node003", IP:172.16.0.3
>>> Charmrun> adding client 15: "node003", IP:172.16.0.3
>>> Charmrun> adding client 16: "node003", IP:172.16.0.3
>>> Charmrun> adding client 17: "node003", IP:172.16.0.3
>>> Charmrun> adding client 18: "node003", IP:172.16.0.3
>>> Charmrun> adding client 19: "node003", IP:172.16.0.3
>>> Charmrun> adding client 20: "node003", IP:172.16.0.3
>>> Charmrun> adding client 21: "node004", IP:172.16.0.4
>>> Charmrun> adding client 22: "node004", IP:172.16.0.4
>>> Charmrun> adding client 23: "node004", IP:172.16.0.4
>>> Charmrun> adding client 24: "node004", IP:172.16.0.4
>>> Charmrun> adding client 25: "node004", IP:172.16.0.4
>>> Charmrun> adding client 26: "node004", IP:172.16.0.4
>>> Charmrun> adding client 27: "node004", IP:172.16.0.4
>>>
>>> Please can you let me know if I am doing something wrong? I am also concerned that 28 clients are added - is it correct that it needs to add one client per thread (rather than per process) like this?
>>>
>>> I wonder if there might be a bug, as I have attempted to run the command from a fifth node (not in the nodelist) and the same message has been printed, even though no threads are assigned on that node! I have confirmed that nothing is actually running on that node by looking at the top command - there is no significant activity from namd2.
>>>
>>> Thanks for any help,
>>> Tom Coles
>>>
>>
>
