From: Douglas Houston (DouglasR.Houston_at_ed.ac.uk)
Date: Tue Apr 08 2014 - 07:13:42 CDT
Thanks Norman. I had found that thread after my searches but it did
not seem to apply to my problem.
"You can check this while doing a ping to the hostname, while you are
logged in at a compute node "ping hostname". If this returns an
127.x.x.x address, your local DNS configuration is not suitable for
charmrun"
My ping returns the full name and IP address of the node, not 127.x.x.x.
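
For anyone repeating this check on several nodes, the test described in the quoted advice can be scripted. This is only a rough sketch under stated assumptions: the helper names is_loopback and check_host are made up for illustration, the lookup relies on getent being available (as on glibc-based Linux), and only IPv4 loopback addresses (127.x.x.x) are recognised.

```shell
#!/bin/sh
# Sketch only: is_loopback and check_host are hypothetical helper names,
# not part of NAMD or charmrun.

# Succeeds (exit status 0) if the given IPv4 address is in 127.0.0.0/8.
is_loopback() {
    case "$1" in
        127.*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Look up a hostname through the system resolver (assumes getent exists)
# and report whether the first address returned is a loopback address.
check_host() {
    addr=$(getent hosts "$1" | awk '{ print $1; exit }')
    if [ -z "$addr" ]; then
        echo "$1: no address found"
        return 2
    fi
    if is_loopback "$addr"; then
        echo "$1 -> $addr (loopback; per the quoted advice, not suitable for charmrun)"
        return 1
    fi
    echo "$1 -> $addr (not loopback)"
}
```

Running check_host "$(hostname)" while logged in at each compute node would mirror the "ping hostname" test; in my case it should report the node's real address rather than 127.x.x.x.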
Quoting Norman Geist <norman.geist_at_uni-greifswald.de> on Tue, 8 Apr
2014 13:22:41 +0200:
> Now I remember that I already posted a solution for this some weeks ago; you
> could have found it with google.de. Maybe this helps you.
>
> http://www.ks.uiuc.edu/Research/namd/mailing_list/namd-l.2012-2013/2645.html
>
> Norman Geist.
>
>
>> -----Original Message-----
>> From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] On
>> behalf of Douglas Houston
>> Sent: Tuesday, 8 April 2014 12:53
>> To: Norman Geist
>> Cc: Namd Mailing List
>> Subject: Re: AW: namd-l: Using nodelist file causes namd to hang
>>
>> Thanks for the tip Norman, but if I change my command to the following
>> it still hangs at the same point:
>>
>> /usr/people/douglas/programs/NAMD_2.9_Linux-x86/charmrun +p12 \
>>     ++remote-shell ssh \
>>     /usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2 ++verbose \
>>     mdrun.conf
>>
>>
>>
>> Quoting Norman Geist <norman.geist_at_uni-greifswald.de> on Tue, 8 Apr
>> 2014 12:06:03 +0200:
>>
>> > Try the charmrun option "++remote-shell ssh".
>> >
>> > Norman Geist.
>> >
>> >> -----Original Message-----
>> >> From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] On
>> >> behalf of Douglas Houston
>> >> Sent: Tuesday, 8 April 2014 11:30
>> >> To: namd-l_at_ks.uiuc.edu
>> >> Subject: namd-l: Using nodelist file causes namd to hang
>> >>
>> >> I have two nodes connected via ethernet: itioc5 and itioc1
>> >>
>> >> I have the following in my nodelist file:
>> >>
>> >> group main
>> >> host itioc1
>> >> host itioc5
>> >>
>> >> I am using the following command:
>> >>
>> >> /usr/people/douglas/programs/NAMD_2.9_Linux-x86/charmrun +p12 \
>> >>     /usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2 ++verbose \
>> >>     mdrun.conf
>> >>
>> >> I get the following output:
>> >>
>> >> Charmrun> charmrun started...
>> >> Charmrun> using ./nodelist as nodesfile
>> >> Charmrun> adding client 0: "itioc1", IP:129.215.137.21
>> >> Charmrun> adding client 1: "itioc5", IP:129.215.237.186
>> >> Charmrun> adding client 2: "itioc1", IP:129.215.137.21
>> >> Charmrun> adding client 3: "itioc5", IP:129.215.237.186
>> >> Charmrun> adding client 4: "itioc1", IP:129.215.137.21
>> >> Charmrun> adding client 5: "itioc5", IP:129.215.237.186
>> >> Charmrun> adding client 6: "itioc1", IP:129.215.137.21
>> >> Charmrun> adding client 7: "itioc5", IP:129.215.237.186
>> >> Charmrun> adding client 8: "itioc1", IP:129.215.137.21
>> >> Charmrun> adding client 9: "itioc5", IP:129.215.237.186
>> >> Charmrun> adding client 10: "itioc1", IP:129.215.137.21
>> >> Charmrun> adding client 11: "itioc5", IP:129.215.237.186
>> >> Charmrun> Charmrun = 129.215.237.187, port = 58330
>> >> start_nodes_rsh
>> >> Charmrun> Sending "0 129.215.237.187 58330 19205 0" to client 0.
>> >> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 0.
>> >> Charmrun> Starting ssh itioc1 -l douglas /bin/sh -f
>> >> Charmrun> remote shell (itioc1:0) started
>> >> Charmrun> Sending "1 129.215.237.187 58330 19205 0" to client 1.
>> >> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 1.
>> >> Charmrun> Starting ssh itioc5 -l douglas /bin/sh -f
>> >> Charmrun> remote shell (itioc5:1) started
>> >> Charmrun> Sending "2 129.215.237.187 58330 19205 0" to client 2.
>> >> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 2.
>> >> Charmrun> Starting ssh itioc1 -l douglas /bin/sh -f
>> >> Charmrun> remote shell (itioc1:2) started
>> >> Charmrun> Sending "3 129.215.237.187 58330 19205 0" to client 3.
>> >> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 3.
>> >> Charmrun> Starting ssh itioc5 -l douglas /bin/sh -f
>> >> Charmrun> remote shell (itioc5:3) started
>> >> Charmrun> Sending "4 129.215.237.187 58330 19205 0" to client 4.
>> >> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 4.
>> >> Charmrun> Starting ssh itioc1 -l douglas /bin/sh -f
>> >> Charmrun> remote shell (itioc1:4) started
>> >> Charmrun> Sending "5 129.215.237.187 58330 19205 0" to client 5.
>> >> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 5.
>> >> Charmrun> Starting ssh itioc5 -l douglas /bin/sh -f
>> >> Charmrun> remote shell (itioc5:5) started
>> >> Charmrun> Sending "6 129.215.237.187 58330 19205 0" to client 6.
>> >> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 6.
>> >> Charmrun> Starting ssh itioc1 -l douglas /bin/sh -f
>> >> Charmrun> remote shell (itioc1:6) started
>> >> Charmrun> Sending "7 129.215.237.187 58330 19205 0" to client 7.
>> >> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 7.
>> >> Charmrun> Starting ssh itioc5 -l douglas /bin/sh -f
>> >> Charmrun> remote shell (itioc5:7) started
>> >> Charmrun> Sending "8 129.215.237.187 58330 19205 0" to client 8.
>> >> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 8.
>> >> Charmrun> Starting ssh itioc1 -l douglas /bin/sh -f
>> >> Charmrun> remote shell (itioc1:8) started
>> >> Charmrun> Sending "9 129.215.237.187 58330 19205 0" to client 9.
>> >> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 9.
>> >> Charmrun> Starting ssh itioc5 -l douglas /bin/sh -f
>> >> Charmrun> remote shell (itioc5:9) started
>> >> Charmrun> Sending "10 129.215.237.187 58330 19205 0" to client 10.
>> >> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 10.
>> >> Charmrun> Starting ssh itioc1 -l douglas /bin/sh -f
>> >> Charmrun> remote shell (itioc1:10) started
>> >> Charmrun> Sending "11 129.215.237.187 58330 19205 0" to client 11.
>> >> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 11.
>> >> Charmrun> Starting ssh itioc5 -l douglas /bin/sh -f
>> >> Charmrun> remote shell (itioc5:11) started
>> >> Charmrun> node programs all started
>> >> Charmrun remote shell(itioc5.3)> remote responding...
>> >> Charmrun remote shell(itioc5.5)> remote responding...
>> >> Charmrun remote shell(itioc5.3)> starting node-program...
>> >> Charmrun remote shell(itioc5.5)> starting node-program...
>> >> Charmrun remote shell(itioc5.3)> rsh phase successful.
>> >> Charmrun remote shell(itioc5.5)> rsh phase successful.
>> >> Charmrun remote shell(itioc5.9)> remote responding...
>> >> Charmrun remote shell(itioc5.7)> remote responding...
>> >> Charmrun remote shell(itioc5.11)> remote responding...
>> >> Charmrun remote shell(itioc5.1)> remote responding...
>> >> Charmrun remote shell(itioc5.9)> starting node-program...
>> >> Charmrun remote shell(itioc5.7)> starting node-program...
>> >> Charmrun remote shell(itioc5.9)> rsh phase successful.
>> >> Charmrun remote shell(itioc5.7)> rsh phase successful.
>> >> Charmrun remote shell(itioc5.11)> starting node-program...
>> >> Charmrun remote shell(itioc5.1)> starting node-program...
>> >> Charmrun remote shell(itioc5.11)> rsh phase successful.
>> >> Charmrun remote shell(itioc5.1)> rsh phase successful.
>> >> Charmrun remote shell(itioc1.10)> remote responding...
>> >> Charmrun remote shell(itioc1.0)> remote responding...
>> >> Charmrun remote shell(itioc1.4)> remote responding...
>> >> Charmrun remote shell(itioc1.10)> starting node-program...
>> >> Charmrun remote shell(itioc1.10)> rsh phase successful.
>> >> Charmrun remote shell(itioc1.0)> starting node-program...
>> >> Charmrun remote shell(itioc1.0)> rsh phase successful.
>> >> Charmrun remote shell(itioc1.4)> starting node-program...
>> >> Charmrun remote shell(itioc1.4)> rsh phase successful.
>> >> Charmrun remote shell(itioc1.2)> remote responding...
>> >> Charmrun remote shell(itioc1.6)> remote responding...
>> >> Charmrun remote shell(itioc1.8)> remote responding...
>> >> Charmrun remote shell(itioc1.2)> starting node-program...
>> >> Charmrun remote shell(itioc1.2)> rsh phase successful.
>> >> Charmrun remote shell(itioc1.6)> starting node-program...
>> >> Charmrun remote shell(itioc1.6)> rsh phase successful.
>> >> Charmrun remote shell(itioc1.8)> starting node-program...
>> >> Charmrun remote shell(itioc1.8)> rsh phase successful.
>> >> Charmrun> Waiting for 0-th client to connect.
>> >> Charmrun> error 0 attaching to node:
>> >> Timeout waiting for node-program to connect
>> >>
>> >>
>> >> I'm not sure, but I think the "Starting ssh itioc5 -l douglas /bin/sh
>> >> -f" lines have something to do with it. If I run the command "ssh
>> >> itioc5 -l douglas /bin/sh -f" it also hangs. If I run "ssh itioc5 -l
>> >> douglas" it logs me in just fine (without asking for a password).
>> >> Similarly, the command "ssh itioc5 -l douglas -f pwd" works fine,
>> >> returning the expected directory name.
>> >>
>> >> What exactly is happening at the "Waiting for 0-th client to connect."
>> >> stage?
>> >>
>> >> Many thanks in advance for your thoughts.
>> >>
>> >> cheers,
>> >>
>> >> Doug
>> >>
>> >> _____________________________________________________
>> >> Dr. Douglas R. Houston
>> >> Lecturer
>> >> Institute of Structural and Molecular Biology
>> >> Room 3.23, Michael Swann Building
>> >> King's Buildings
>> >> University of Edinburgh
>> >> Edinburgh, EH9 3JR, UK
>> >> Tel. 0131 650 7358
>> >> http://tinyurl.com/douglasrhouston
>> >>
>> >> --
>> >> The University of Edinburgh is a charitable body, registered in
>> >> Scotland, with registration number SC005336.
>> >
>> >
>> >
>> > ---
>> > This e-mail is free of viruses and malware because avast!
>> > Antivirus protection is active.
>> > http://www.avast.com
>> >
>> >
>> >
>>
>>
>>
>>
>
>
>
>
>
>
_____________________________________________________
Dr. Douglas R. Houston
Lecturer
Institute of Structural and Molecular Biology
Room 3.23, Michael Swann Building
King's Buildings
University of Edinburgh
Edinburgh, EH9 3JR, UK
Tel. 0131 650 7358
http://tinyurl.com/douglasrhouston
--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
This archive was generated by hypermail 2.1.6 : Wed Dec 31 2014 - 23:22:18 CST