Re: Using nodelist file causes namd to hang

From: Norman Geist (norman.geist_at_uni-greifswald.de)
Date: Tue Apr 08 2014 - 05:06:03 CDT

Try the charmrun option "++remote-shell ssh".
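
As a rough sketch, the command from the quoted message with that option added might look like the following. The paths are copied from the original command; the NAMD_DIR and CMD variable names are only illustrative, and putting the charmrun options ahead of the namd2 path is an assumption about ordering, not something the original command did.

```shell
#!/bin/sh
# Illustrative only: NAMD_DIR and CMD are hypothetical names, paths come from the quoted command.
NAMD_DIR=/usr/people/douglas/programs/NAMD_2.9_Linux-x86

# ++remote-shell names the program charmrun uses to reach the hosts in the nodelist.
CMD="$NAMD_DIR/charmrun +p12 ++remote-shell ssh ++verbose $NAMD_DIR/namd2 mdrun.conf"

# Print the command for inspection; run it by hand once it looks right.
echo "$CMD"
```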

Norman Geist.

> -----Original Message-----
> From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] On
> Behalf Of Douglas Houston
> Sent: Tuesday, 8 April 2014 11:30
> To: namd-l_at_ks.uiuc.edu
> Subject: namd-l: Using nodelist file causes namd to hang
>
> I have two nodes connected via ethernet: itioc5 and itioc1
>
> I have the following in my nodelist file:
>
> group main
> host itioc1
> host itioc5
>
> I am using the following command:
>
> /usr/people/douglas/programs/NAMD_2.9_Linux-x86/charmrun +p12
> /usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2 ++verbose
> mdrun.conf
>
> I get the following output:
>
> Charmrun> charmrun started...
> Charmrun> using ./nodelist as nodesfile
> Charmrun> adding client 0: "itioc1", IP:129.215.137.21
> Charmrun> adding client 1: "itioc5", IP:129.215.237.186
> Charmrun> adding client 2: "itioc1", IP:129.215.137.21
> Charmrun> adding client 3: "itioc5", IP:129.215.237.186
> Charmrun> adding client 4: "itioc1", IP:129.215.137.21
> Charmrun> adding client 5: "itioc5", IP:129.215.237.186
> Charmrun> adding client 6: "itioc1", IP:129.215.137.21
> Charmrun> adding client 7: "itioc5", IP:129.215.237.186
> Charmrun> adding client 8: "itioc1", IP:129.215.137.21
> Charmrun> adding client 9: "itioc5", IP:129.215.237.186
> Charmrun> adding client 10: "itioc1", IP:129.215.137.21
> Charmrun> adding client 11: "itioc5", IP:129.215.237.186
> Charmrun> Charmrun = 129.215.237.187, port = 58330
> start_nodes_rsh
> Charmrun> Sending "0 129.215.237.187 58330 19205 0" to client 0.
> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 0.
> Charmrun> Starting ssh itioc1 -l douglas /bin/sh -f
> Charmrun> remote shell (itioc1:0) started
> Charmrun> Sending "1 129.215.237.187 58330 19205 0" to client 1.
> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 1.
> Charmrun> Starting ssh itioc5 -l douglas /bin/sh -f
> Charmrun> remote shell (itioc5:1) started
> Charmrun> Sending "2 129.215.237.187 58330 19205 0" to client 2.
> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 2.
> Charmrun> Starting ssh itioc1 -l douglas /bin/sh -f
> Charmrun> remote shell (itioc1:2) started
> Charmrun> Sending "3 129.215.237.187 58330 19205 0" to client 3.
> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 3.
> Charmrun> Starting ssh itioc5 -l douglas /bin/sh -f
> Charmrun> remote shell (itioc5:3) started
> Charmrun> Sending "4 129.215.237.187 58330 19205 0" to client 4.
> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 4.
> Charmrun> Starting ssh itioc1 -l douglas /bin/sh -f
> Charmrun> remote shell (itioc1:4) started
> Charmrun> Sending "5 129.215.237.187 58330 19205 0" to client 5.
> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 5.
> Charmrun> Starting ssh itioc5 -l douglas /bin/sh -f
> Charmrun> remote shell (itioc5:5) started
> Charmrun> Sending "6 129.215.237.187 58330 19205 0" to client 6.
> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 6.
> Charmrun> Starting ssh itioc1 -l douglas /bin/sh -f
> Charmrun> remote shell (itioc1:6) started
> Charmrun> Sending "7 129.215.237.187 58330 19205 0" to client 7.
> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 7.
> Charmrun> Starting ssh itioc5 -l douglas /bin/sh -f
> Charmrun> remote shell (itioc5:7) started
> Charmrun> Sending "8 129.215.237.187 58330 19205 0" to client 8.
> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 8.
> Charmrun> Starting ssh itioc1 -l douglas /bin/sh -f
> Charmrun> remote shell (itioc1:8) started
> Charmrun> Sending "9 129.215.237.187 58330 19205 0" to client 9.
> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 9.
> Charmrun> Starting ssh itioc5 -l douglas /bin/sh -f
> Charmrun> remote shell (itioc5:9) started
> Charmrun> Sending "10 129.215.237.187 58330 19205 0" to client 10.
> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 10.
> Charmrun> Starting ssh itioc1 -l douglas /bin/sh -f
> Charmrun> remote shell (itioc1:10) started
> Charmrun> Sending "11 129.215.237.187 58330 19205 0" to client 11.
> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 11.
> Charmrun> Starting ssh itioc5 -l douglas /bin/sh -f
> Charmrun> remote shell (itioc5:11) started
> Charmrun> node programs all started
> Charmrun remote shell(itioc5.3)> remote responding...
> Charmrun remote shell(itioc5.5)> remote responding...
> Charmrun remote shell(itioc5.3)> starting node-program...
> Charmrun remote shell(itioc5.5)> starting node-program...
> Charmrun remote shell(itioc5.3)> rsh phase successful.
> Charmrun remote shell(itioc5.5)> rsh phase successful.
> Charmrun remote shell(itioc5.9)> remote responding...
> Charmrun remote shell(itioc5.7)> remote responding...
> Charmrun remote shell(itioc5.11)> remote responding...
> Charmrun remote shell(itioc5.1)> remote responding...
> Charmrun remote shell(itioc5.9)> starting node-program...
> Charmrun remote shell(itioc5.7)> starting node-program...
> Charmrun remote shell(itioc5.9)> rsh phase successful.
> Charmrun remote shell(itioc5.7)> rsh phase successful.
> Charmrun remote shell(itioc5.11)> starting node-program...
> Charmrun remote shell(itioc5.1)> starting node-program...
> Charmrun remote shell(itioc5.11)> rsh phase successful.
> Charmrun remote shell(itioc5.1)> rsh phase successful.
> Charmrun remote shell(itioc1.10)> remote responding...
> Charmrun remote shell(itioc1.0)> remote responding...
> Charmrun remote shell(itioc1.4)> remote responding...
> Charmrun remote shell(itioc1.10)> starting node-program...
> Charmrun remote shell(itioc1.10)> rsh phase successful.
> Charmrun remote shell(itioc1.0)> starting node-program...
> Charmrun remote shell(itioc1.0)> rsh phase successful.
> Charmrun remote shell(itioc1.4)> starting node-program...
> Charmrun remote shell(itioc1.4)> rsh phase successful.
> Charmrun remote shell(itioc1.2)> remote responding...
> Charmrun remote shell(itioc1.6)> remote responding...
> Charmrun remote shell(itioc1.8)> remote responding...
> Charmrun remote shell(itioc1.2)> starting node-program...
> Charmrun remote shell(itioc1.2)> rsh phase successful.
> Charmrun remote shell(itioc1.6)> starting node-program...
> Charmrun remote shell(itioc1.6)> rsh phase successful.
> Charmrun remote shell(itioc1.8)> starting node-program...
> Charmrun remote shell(itioc1.8)> rsh phase successful.
> Charmrun> Waiting for 0-th client to connect.
> Charmrun> error 0 attaching to node:
> Timeout waiting for node-program to connect
>
>
> I'm not sure, but I think the "Starting ssh itioc5 -l douglas /bin/sh
> -f" lines have something to do with it. If I run the command "ssh
> itioc5 -l douglas /bin/sh -f" it also hangs. If I run "ssh itioc5 -l
> douglas" then it logs me in just fine (without asking for a password).
> Similarly the command "ssh itioc5 -l douglas -f pwd" works fine, with
> the expected directory name returned.
>
> What exactly is happening at the "Waiting for 0-th client to connect."
> stage?
>
> Many thanks in advance for your thoughts.
>
> cheers,
>
> Doug
>
> _____________________________________________________
> Dr. Douglas R. Houston
> Lecturer
> Institute of Structural and Molecular Biology
> Room 3.23, Michael Swann Building
> King's Buildings
> University of Edinburgh
> Edinburgh, EH9 3JR, UK
> Tel. 0131 650 7358
> http://tinyurl.com/douglasrhouston
>
> --
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.

This archive was generated by hypermail 2.1.6 : Thu Dec 31 2015 - 23:20:40 CST