AW: AW: AW: Using nodelist file causes namd to hang

From: Norman Geist (norman.geist_at_uni-greifswald.de)
Date: Wed Apr 09 2014 - 07:09:53 CDT

Yes, there is a limit. I believe it is set in /etc/ssh/sshd_config, where it
is called "MaxStartups". You may need to restart the sshd daemon after
changing it.
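
To illustrate, here is a rough Python sketch of how I understand the
MaxStartups setting from the sshd_config man page. It is not taken from the
OpenSSH source, and the function names are made up for this example. The
value is either a single number or "start:rate:full": below "start"
unauthenticated connections nothing is refused, and from "start" onwards new
connections are refused with a probability that rises linearly from rate% to
100% at "full". A start value of 10 (the OpenSSH default, as far as I know)
would fit +p10 working and +p11 failing, assuming charmrun opens all of its
ssh connections at once.

```python
def parse_max_startups(value):
    """Split a MaxStartups value into (start, rate, full).

    A single number N is treated as start = full = N, i.e. every
    connection beyond N unauthenticated ones is refused.
    """
    parts = [int(p) for p in value.strip().split(":")]
    if len(parts) == 1:
        return parts[0], 100, parts[0]
    if len(parts) == 3:
        start, rate, full = parts
        return start, rate, full
    raise ValueError("expected 'N' or 'start:rate:full', got %r" % value)


def refusal_probability(unauthenticated, value="10:30:100"):
    """Chance (0.0 to 1.0) that sshd refuses one more connection while
    `unauthenticated` connections are still waiting to authenticate."""
    start, rate, full = parse_max_startups(value)
    if unauthenticated < start:
        return 0.0
    if unauthenticated >= full or rate >= 100:
        return 1.0
    # Linear ramp from rate% at `start` up to 100% at `full`.
    percent = rate + (100 - rate) * (unauthenticated - start) / float(full - start)
    return percent / 100.0


if __name__ == "__main__":
    # Hypothetical charmrun launches with +p10, +p11 and +p12 on one host.
    for processes in (10, 11, 12):
        pending = processes - 1  # connections already open when the last one arrives
        print(processes, refusal_probability(pending))
```

The value actually in effect can be looked up with
"grep -i maxstartups /etc/ssh/sshd_config"; if the line is missing or
commented out, the built-in default applies.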

Norman Geist.

> -----Original Message-----
> From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] On
> Behalf Of Douglas Houston
> Sent: Wednesday, 9 April 2014 13:17
> To: Norman Geist
> Cc: Namd Mailing List
> Subject: Re: AW: AW: namd-l: Using nodelist file causes namd to hang
>
> We may be getting somewhere. The following command now runs:
>
> /usr/people/douglas/programs/NAMD_2.9_Linux-x86/charmrun +p1
> /usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2 ++verbose
> mdrun.conf
>
> With options +p1 to +p10 it works. At +p11 or +p12 (each node has 12
> processors) I get:
>
> Charmrun> charmrun started...
> Charmrun> using /usr/people/douglas/.nodelist as nodesfile
> Charmrun> adding client 0: "localhost", IP:127.0.0.1
> Charmrun> adding client 1: "localhost", IP:127.0.0.1
> Charmrun> adding client 2: "localhost", IP:127.0.0.1
> Charmrun> adding client 3: "localhost", IP:127.0.0.1
> Charmrun> adding client 4: "localhost", IP:127.0.0.1
> Charmrun> adding client 5: "localhost", IP:127.0.0.1
> Charmrun> adding client 6: "localhost", IP:127.0.0.1
> Charmrun> adding client 7: "localhost", IP:127.0.0.1
> Charmrun> adding client 8: "localhost", IP:127.0.0.1
> Charmrun> adding client 9: "localhost", IP:127.0.0.1
> Charmrun> adding client 10: "localhost", IP:127.0.0.1
> Charmrun> Charmrun = 129.215.237.187, port = 58561
> start_nodes_rsh
> Charmrun> Sending "0 129.215.237.187 58561 2981 0" to client 0.
> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 0.
> Charmrun> Starting ssh localhost -l douglas /bin/sh -f
> Charmrun> remote shell (localhost:0) started
> Charmrun> Sending "1 129.215.237.187 58561 2981 0" to client 1.
> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 1.
> Charmrun> Starting ssh localhost -l douglas /bin/sh -f
> Charmrun> remote shell (localhost:1) started
> Charmrun> Sending "2 129.215.237.187 58561 2981 0" to client 2.
> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 2.
> Charmrun> Starting ssh localhost -l douglas /bin/sh -f
> Charmrun> remote shell (localhost:2) started
> Charmrun> Sending "3 129.215.237.187 58561 2981 0" to client 3.
> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 3.
> Charmrun> Starting ssh localhost -l douglas /bin/sh -f
> Charmrun> remote shell (localhost:3) started
> Charmrun> Sending "4 129.215.237.187 58561 2981 0" to client 4.
> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 4.
> Charmrun> Starting ssh localhost -l douglas /bin/sh -f
> Charmrun> remote shell (localhost:4) started
> Charmrun> Sending "5 129.215.237.187 58561 2981 0" to client 5.
> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 5.
> Charmrun> Starting ssh localhost -l douglas /bin/sh -f
> Charmrun> remote shell (localhost:5) started
> Charmrun> Sending "6 129.215.237.187 58561 2981 0" to client 6.
> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 6.
> Charmrun> Starting ssh localhost -l douglas /bin/sh -f
> Charmrun> remote shell (localhost:6) started
> Charmrun> Sending "7 129.215.237.187 58561 2981 0" to client 7.
> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 7.
> Charmrun> Starting ssh localhost -l douglas /bin/sh -f
> Charmrun> remote shell (localhost:7) started
> Charmrun> Sending "8 129.215.237.187 58561 2981 0" to client 8.
> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 8.
> Charmrun> Starting ssh localhost -l douglas /bin/sh -f
> Charmrun> remote shell (localhost:8) started
> Charmrun> Sending "9 129.215.237.187 58561 2981 0" to client 9.
> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 9.
> Charmrun> Starting ssh localhost -l douglas /bin/sh -f
> Charmrun> remote shell (localhost:9) started
> Charmrun> Sending "10 129.215.237.187 58561 2981 0" to client 10.
> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 10.
> Charmrun> Starting ssh localhost -l douglas /bin/sh -f
> Charmrun> remote shell (localhost:10) started
> Charmrun> node programs all started
> ssh_exchange_identification: Connection closed by remote host
> Charmrun remote shell(localhost.7)> remote responding...
> Charmrun remote shell(localhost.7)> starting node-program...
> Charmrun remote shell(localhost.7)> rsh phase successful.
> Charmrun remote shell(localhost.5)> remote responding...
> Charmrun remote shell(localhost.3)> remote responding...
> Charmrun remote shell(localhost.5)> starting node-program...
> Charmrun remote shell(localhost.5)> rsh phase successful.
> Charmrun remote shell(localhost.10)> remote responding...
> Charmrun remote shell(localhost.3)> starting node-program...
> Charmrun remote shell(localhost.3)> rsh phase successful.
> Charmrun remote shell(localhost.10)> starting node-program...
> Charmrun remote shell(localhost.10)> rsh phase successful.
> Charmrun remote shell(localhost.6)> remote responding...
> Charmrun remote shell(localhost.0)> remote responding...
> Charmrun remote shell(localhost.6)> starting node-program...
> Charmrun remote shell(localhost.6)> rsh phase successful.
> Charmrun remote shell(localhost.9)> remote responding...
> Charmrun remote shell(localhost.0)> starting node-program...
> Charmrun remote shell(localhost.0)> rsh phase successful.
> Charmrun remote shell(localhost.8)> remote responding...
> Charmrun remote shell(localhost.4)> remote responding...
> Charmrun remote shell(localhost.9)> starting node-program...
> Charmrun remote shell(localhost.9)> rsh phase successful.
> Charmrun remote shell(localhost.4)> starting node-program...
> Charmrun remote shell(localhost.4)> rsh phase successful.
> Charmrun remote shell(localhost.8)> starting node-program...
> Charmrun remote shell(localhost.8)> rsh phase successful.
> Charmrun remote shell(localhost.1)> remote responding...
> Charmrun remote shell(localhost.1)> starting node-program...
> Charmrun remote shell(localhost.1)> rsh phase successful.
> Charmrun> Error 255 returned from rsh (localhost:2)
>
>
>
> Note the number in "localhost:#" in the last line above is variable,
> it's not the same each time. Is there a limit on how many simultaneous
> connections I can have?
>
>
>
>
> Quoting Norman Geist <norman.geist_at_uni-greifswald.de> on Wed, 9 Apr
> 2014 12:53:50 +0200:
>
> > This may be a hint. Your nodes must not only be able to log on to all
> > nodes without a password, but should also be able to log on to
> > themselves via their own IP address, localhost and 127.0.0.1.
> >
> > You may want to delete the wrong entries in ~/.ssh/known_hosts on the
> > nodes, and recreate them by ssh'ing to the targets mentioned above.
> >
> > Norman Geist.
> >
> >
> >> -----Original Message-----
> >> From: Douglas Houston [mailto:DouglasR.Houston_at_ed.ac.uk]
> >> Sent: Wednesday, 9 April 2014 12:42
> >> To: Norman Geist
> >> Subject: Re: AW: namd-l: Using nodelist file causes namd to hang
> >>
> >> The same command without ++local causes the nodelist file to be
> >> used; I have already posted the output from this.
> >>
> >> If I delete the nodelist file, the same command without the ++local
> >> (which causes the file /usr/people/douglas/.nodelist to be used)
> >> outputs:
> >>
> >>
> >> Charmrun> charmrun started...
> >> Charmrun> using /usr/people/douglas/.nodelist as nodesfile
> >> Charmrun> adding client 0: "localhost", IP:127.0.0.1
> >> Charmrun> Charmrun = 129.215.237.187, port = 35909
> >> start_nodes_rsh
> >> Charmrun> Sending "0 129.215.237.187 35909 27843 0" to client 0.
> >> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 0.
> >> Charmrun> Starting ssh localhost -l douglas /bin/sh -f
> >> Charmrun> remote shell (localhost:0) started
> >> Charmrun> node programs all started
> >> @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> >> @ WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @
> >> @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
> >> IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
> >> Someone could be eavesdropping on you right now (man-in-the-middle
> >> attack)!
> >> It is also possible that the RSA host key has just been changed.
> >> The fingerprint for the RSA key sent by the remote host is
> >> 99:cb:e0:0a:77:8b:61:fd:19:01:57:93:ec:93:99:63.
> >> Please contact your system administrator.
> >> Add correct host key in /usr/people/douglas/.ssh/known_hosts to get
> >> rid of this message.
> >> Offending key in /usr/people/douglas/.ssh/known_hosts:47
> >> RSA host key for localhost has changed and you have requested strict
> >> checking.
> >> Host key verification failed.
> >> Charmrun> Error 255 returned from rsh (localhost:0)
> >>
> >>
> >> The file /usr/people/douglas/.nodelist contains:
> >> group main
> >> host localhost
> >>
> >>
> >>
> >>
> >>
> >> Quoting Norman Geist <norman.geist_at_uni-greifswald.de> on Wed, 9 Apr
> >> 2014 12:28:51 +0200:
> >>
> >> > Please try the same command without ++local and see if it still
> >> > works.
> >> >
> >> >> -----Original Message-----
> >> >> From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu]
> >> >> On Behalf Of Douglas Houston
> >> >> Sent: Wednesday, 9 April 2014 11:49
> >> >> To: ramya narasimhan
> >> >> Cc: Namd Mailing List
> >> >> Subject: Re: namd-l: Using nodelist file causes namd to hang
> >> >>
> >> >> The result is the same whichever order the nodes are present in the
> >> >> list.
> >> >>
> >> >> What exactly is Charmrun waiting for at the "Waiting for 0-th client
> >> >> to connect." stage? Presumably the 0th client is the first in
> >> >> nodelist, and a process is supposed to start on that node, then
> >> >> "connect" to Charmrun on the host machine?
> >> >
> >> > Charmrun is just spawning the namd processes and is now waiting for
> >> > them to start talking.
> >> >
> >> >>
> >> >> Using the command top I see no evidence of anything new starting on
> >> >> the node, despite all the "starting node-program" and "rsh phase
> >> >> successful" messages that are output.
> >> >>
> >> >> Using "ps -u douglas" on the node shows a whole bunch of tcsh and sh
> >> >> shells and sleep processes starting then dying but nothing else.
> >> >>
> >> >> What does the line "Sending "0 129.215.237.187 57453 26737 0" to
> >> >> client 0" mean? How is this "sending" achieved? I see "port 57453"
> >> >> is mentioned in the output ...
> >> >
> >> > This seems to be part of the parallel startup, in which the spawned
> >> > processes get the information about each other.
> >> >
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> Quoting ramya narasimhan <ramya_jln_at_yahoo.co.in> on Wed, 9 Apr
> >> >> 2014 11:51:52 +0800 (SGT):
> >> >>
> >> >> > Just change the order of the hostnames [IPs of the systems] in the
> >> >> > nodefile, so that the 0-th client will be itioc5 instead of itioc1,
> >> >> > to find out whether the problem is with the nodes.
> >> >> >
> >> >> >
> >> >> > Dr. Ramya.L.
> >> >> > On Tuesday, 8 April 2014 7:23 PM, Douglas Houston
> >> >> > <DouglasR.Houston_at_ed.ac.uk> wrote:
> >> >> >
> >> >> > Yes, with ping all the nodes resolve to full hostnames and IP
> >> >> > addresses. I tried putting IP addresses into nodelist instead of
> >> >> > hostnames but it still times out at "Waiting for 0-th client to
> >> >> > connect"
> >> >> >
> >> >> >
> >> >> > Quoting Norman Geist <norman.geist_at_uni-greifswald.de> on Tue, 8
> >> >> > Apr 2014 14:30:15 +0200:
> >> >> >
> >> >> >> On all the nodes? Otherwise try a nodelist with IP addresses
> >> >> >> instead of hostnames. If that works, you have a problem with
> >> >> >> local DNS.
> >> >> >>
> >> >> >> Norman Geist.
> >> >> >>
> >> >> >>
> >> >> >>> -----Original Message-----
> >> >> >>> From: Douglas Houston [mailto:DouglasR.Houston_at_ed.ac.uk]
> >> >> >>> Sent: Tuesday, 8 April 2014 14:14
> >> >> >>> To: Norman Geist
> >> >> >>> Cc: Namd Mailing List
> >> >> >>> Subject: Re: AW: AW: namd-l: Using nodelist file causes namd to
> >> >> >>> hang
> >> >> >>>
> >> >> >>> Thanks Norman. I had found that thread after my searches but it
> >> >> >>> did not seem to apply to my problem.
> >> >> >>>
> >> >> >>> "You can check this while doing a ping to the hostname, while you
> >> >> >>> are logged in at a compute node "ping hostname". If this returns
> >> >> >>> an 127.x.x.x address, your local DNS configuration is not
> >> >> >>> suitable for charmrun"
> >> >> >>>
> >> >> >>> My ping returns the full name and IP address of the node, not
> >> >> >>> 127.x.x.x.
> >> >> >>>
> >> >> >>>
> >> >> >>>
> >> >> >>> Quoting Norman Geist <norman.geist_at_uni-greifswald.de> on Tue, 8
> >> >> >>> Apr 2014 13:22:41 +0200:
> >> >> >>>
> >> >> >>> > Now I remember that I already posted a solution for this some
> >> >> >>> > weeks ago, you could have found it by using google.de. Maybe
> >> >> >>> > this helps you.
> >> >> >>> >
> >> >> >>> > http://www.ks.uiuc.edu/Research/namd/mailing_list/namd-l.2012-2013/2645.html
> >> >> >>> >
> >> >> >>> > Norman Geist.
> >> >> >>> >
> >> >> >>> >
> >> >> >>> >> -----Original Message-----
> >> >> >>> >> From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu]
> >> >> >>> >> On Behalf Of Douglas Houston
> >> >> >>> >> Sent: Tuesday, 8 April 2014 12:53
> >> >> >>> >> To: Norman Geist
> >> >> >>> >> Cc: Namd Mailing List
> >> >> >>> >> Subject: Re: AW: namd-l: Using nodelist file causes namd to hang
> >> >> >>> >>
> >> >> >>> >> Thanks for the tip Norman, but if I change my command to the
> >> >> >>> >> following it still hangs at the same point:
> >> >> >>> >>
> >> >> >>> >> /usr/people/douglas/programs/NAMD_2.9_Linux-x86/charmrun +p12
> >> >> >>> >> ++remote-shell ssh
> >> >> >>> >> /usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2 ++verbose
> >> >> >>> >> mdrun.conf
> >> >> >>> >>
> >> >> >>> >>
> >> >> >>> >>
> >> >> >>> >> Quoting Norman Geist <norman.geist_at_uni-greifswald.de> on Tue, 8
> >> >> >>> >> Apr 2014 12:06:03 +0200:
> >> >> >>> >>
> >> >> >>> >> > Try the charmrun option "++remote-shell ssh".
> >> >> >>> >> >
> >> >> >>> >> > Norman Geist.
> >> >> >>> >> >
> >> >> >>> >> >> -----Original Message-----
> >> >> >>> >> >> From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu]
> >> >> >>> >> >> On Behalf Of Douglas Houston
> >> >> >>> >> >> Sent: Tuesday, 8 April 2014 11:30
> >> >> >>> >> >> To: namd-l_at_ks.uiuc.edu
> >> >> >>> >> >> Subject: namd-l: Using nodelist file causes namd to hang
> >> >> >>> >> >>
> >> >> >>> >> >> I have two nodes connected via ethernet: itioc5 and itioc1
> >> >> >>> >> >>
> >> >> >>> >> >> I have the following in my nodelist file:
> >> >> >>> >> >>
> >> >> >>> >> >> group main
> >> >> >>> >> >> host itioc1
> >> >> >>> >> >> host itioc5
> >> >> >>> >> >>
> >> >> >>> >> >> I am using the following command:
> >> >> >>> >> >>
> >> >> >>> >> >> /usr/people/douglas/programs/NAMD_2.9_Linux-x86/charmrun +p12
> >> >> >>> >> >> /usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2 ++verbose
> >> >> >>> >> >> mdrun.conf
> >> >> >>> >> >>
> >> >> >>> >> >> I get the following output:
> >> >> >>> >> >>
> >> >> >>> >> >> Charmrun> charmrun started...
> >> >> >>> >> >> Charmrun> using ./nodelist as nodesfile
> >> >> >>> >> >> Charmrun> adding client 0: "itioc1", IP:129.215.137.21
> >> >> >>> >> >> Charmrun> adding client 1: "itioc5", IP:129.215.237.186
> >> >> >>> >> >> Charmrun> adding client 2: "itioc1", IP:129.215.137.21
> >> >> >>> >> >> Charmrun> adding client 3: "itioc5", IP:129.215.237.186
> >> >> >>> >> >> Charmrun> adding client 4: "itioc1", IP:129.215.137.21
> >> >> >>> >> >> Charmrun> adding client 5: "itioc5", IP:129.215.237.186
> >> >> >>> >> >> Charmrun> adding client 6: "itioc1", IP:129.215.137.21
> >> >> >>> >> >> Charmrun> adding client 7: "itioc5", IP:129.215.237.186
> >> >> >>> >> >> Charmrun> adding client 8: "itioc1", IP:129.215.137.21
> >> >> >>> >> >> Charmrun> adding client 9: "itioc5", IP:129.215.237.186
> >> >> >>> >> >> Charmrun> adding client 10: "itioc1", IP:129.215.137.21
> >> >> >>> >> >> Charmrun> adding client 11: "itioc5", IP:129.215.237.186
> >> >> >>> >> >> Charmrun> Charmrun = 129.215.237.187, port = 58330
> >> >> >>> >> >> start_nodes_rsh
> >> >> >>> >> >> Charmrun> Sending "0 129.215.237.187 58330 19205 0" to client 0.
> >> >> >>> >> >> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 0.
> >> >> >>> >> >> Charmrun> Starting ssh itioc1 -l douglas /bin/sh -f
> >> >> >>> >> >> Charmrun> remote shell (itioc1:0) started
> >> >> >>> >> >> Charmrun> Sending "1 129.215.237.187 58330 19205 0" to client 1.
> >> >> >>> >> >> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 1.
> >> >> >>> >> >> Charmrun> Starting ssh itioc5 -l douglas /bin/sh -f
> >> >> >>> >> >> Charmrun> remote shell (itioc5:1) started
> >> >> >>> >> >> Charmrun> Sending "2 129.215.237.187 58330 19205 0" to client 2.
> >> >> >>> >> >> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 2.
> >> >> >>> >> >> Charmrun> Starting ssh itioc1 -l douglas /bin/sh -f
> >> >> >>> >> >> Charmrun> remote shell (itioc1:2) started
> >> >> >>> >> >> Charmrun> Sending "3 129.215.237.187 58330 19205 0" to client 3.
> >> >> >>> >> >> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 3.
> >> >> >>> >> >> Charmrun> Starting ssh itioc5 -l douglas /bin/sh -f
> >> >> >>> >> >> Charmrun> remote shell (itioc5:3) started
> >> >> >>> >> >> Charmrun> Sending "4 129.215.237.187 58330 19205 0" to client 4.
> >> >> >>> >> >> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 4.
> >> >> >>> >> >> Charmrun> Starting ssh itioc1 -l douglas /bin/sh -f
> >> >> >>> >> >> Charmrun> remote shell (itioc1:4) started
> >> >> >>> >> >> Charmrun> Sending "5 129.215.237.187 58330 19205 0" to client 5.
> >> >> >>> >> >> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 5.
> >> >> >>> >> >> Charmrun> Starting ssh itioc5 -l douglas /bin/sh -f
> >> >> >>> >> >> Charmrun> remote shell (itioc5:5) started
> >> >> >>> >> >> Charmrun> Sending "6 129.215.237.187 58330 19205 0" to client 6.
> >> >> >>> >> >> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 6.
> >> >> >>> >> >> Charmrun> Starting ssh itioc1 -l douglas /bin/sh -f
> >> >> >>> >> >> Charmrun> remote shell (itioc1:6) started
> >> >> >>> >> >> Charmrun> Sending "7 129.215.237.187 58330 19205 0" to client 7.
> >> >> >>> >> >> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 7.
> >> >> >>> >> >> Charmrun> Starting ssh itioc5 -l douglas /bin/sh -f
> >> >> >>> >> >> Charmrun> remote shell (itioc5:7) started
> >> >> >>> >> >> Charmrun> Sending "8 129.215.237.187 58330 19205 0" to client 8.
> >> >> >>> >> >> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 8.
> >> >> >>> >> >> Charmrun> Starting ssh itioc1 -l douglas /bin/sh -f
> >> >> >>> >> >> Charmrun> remote shell (itioc1:8) started
> >> >> >>> >> >> Charmrun> Sending "9 129.215.237.187 58330 19205 0" to client 9.
> >> >> >>> >> >> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 9.
> >> >> >>> >> >> Charmrun> Starting ssh itioc5 -l douglas /bin/sh -f
> >> >> >>> >> >> Charmrun> remote shell (itioc5:9) started
> >> >> >>> >> >> Charmrun> Sending "10 129.215.237.187 58330 19205 0" to client 10.
> >> >> >>> >> >> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 10.
> >> >> >>> >> >> Charmrun> Starting ssh itioc1 -l douglas /bin/sh -f
> >> >> >>> >> >> Charmrun> remote shell (itioc1:10) started
> >> >> >>> >> >> Charmrun> Sending "11 129.215.237.187 58330 19205 0" to client 11.
> >> >> >>> >> >> Charmrun> find the node program "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 11.
> >> >> >>> >> >> Charmrun> Starting ssh itioc5 -l douglas /bin/sh -f
> >> >> >>> >> >> Charmrun> remote shell (itioc5:11) started
> >> >> >>> >> >> Charmrun> node programs all started
> >> >> >>> >> >> Charmrun remote shell(itioc5.3)> remote responding...
> >> >> >>> >> >> Charmrun remote shell(itioc5.5)> remote responding...
> >> >> >>> >> >> Charmrun remote shell(itioc5.3)> starting node-program...
> >> >> >>> >> >> Charmrun remote shell(itioc5.5)> starting node-program...
> >> >> >>> >> >> Charmrun remote shell(itioc5.3)> rsh phase successful.
> >> >> >>> >> >> Charmrun remote shell(itioc5.5)> rsh phase successful.
> >> >> >>> >> >> Charmrun remote shell(itioc5.9)> remote responding...
> >> >> >>> >> >> Charmrun remote shell(itioc5.7)> remote responding...
> >> >> >>> >> >> Charmrun remote shell(itioc5.11)> remote responding...
> >> >> >>> >> >> Charmrun remote shell(itioc5.1)> remote responding...
> >> >> >>> >> >> Charmrun remote shell(itioc5.9)> starting node-program...
> >> >> >>> >> >> Charmrun remote shell(itioc5.7)> starting node-program...
> >> >> >>> >> >> Charmrun remote shell(itioc5.9)> rsh phase successful.
> >> >> >>> >> >> Charmrun remote shell(itioc5.7)> rsh phase successful.
> >> >> >>> >> >> Charmrun remote shell(itioc5.11)> starting node-program...
> >> >> >>> >> >> Charmrun remote shell(itioc5.1)> starting node-program...
> >> >> >>> >> >> Charmrun remote shell(itioc5.11)> rsh phase successful.
> >> >> >>> >> >> Charmrun remote shell(itioc5.1)> rsh phase successful.
> >> >> >>> >> >> Charmrun remote shell(itioc1.10)> remote responding...
> >> >> >>> >> >> Charmrun remote shell(itioc1.0)> remote responding...
> >> >> >>> >> >> Charmrun remote shell(itioc1.4)> remote responding...
> >> >> >>> >> >> Charmrun remote shell(itioc1.10)> starting node-program...
> >> >> >>> >> >> Charmrun remote shell(itioc1.10)> rsh phase successful.
> >> >> >>> >> >> Charmrun remote shell(itioc1.0)> starting node-program...
> >> >> >>> >> >> Charmrun remote shell(itioc1.0)> rsh phase successful.
> >> >> >>> >> >> Charmrun remote shell(itioc1.4)> starting node-program...
> >> >> >>> >> >> Charmrun remote shell(itioc1.4)> rsh phase successful.
> >> >> >>> >> >> Charmrun remote shell(itioc1.2)> remote responding...
> >> >> >>> >> >> Charmrun remote shell(itioc1.6)> remote responding...
> >> >> >>> >> >> Charmrun remote shell(itioc1.8)> remote responding...
> >> >> >>> >> >> Charmrun remote shell(itioc1.2)> starting node-program...
> >> >> >>> >> >> Charmrun remote shell(itioc1.2)> rsh phase successful.
> >> >> >>> >> >> Charmrun remote shell(itioc1.6)> starting node-program...
> >> >> >>> >> >> Charmrun remote shell(itioc1.6)> rsh phase successful.
> >> >> >>> >> >> Charmrun remote shell(itioc1.8)> starting node-program...
> >> >> >>> >> >> Charmrun remote shell(itioc1.8)> rsh phase successful.
> >> >> >>> >> >> Charmrun> Waiting for 0-th client to connect.
> >> >> >>> >> >> Charmrun> error 0 attaching to node:
> >> >> >>> >> >> Timeout waiting for node-program to connect
> >> >> >>> >> >>
> >> >> >>> >> >>
> >> >> >>> >> >> I'm not sure but I think the "Starting ssh itioc5 -l douglas
> >> >> >>> >> >> /bin/sh -f" lines have something to do with it. If I run the
> >> >> >>> >> >> command "ssh itioc5 -l douglas /bin/sh -f" it also hangs. If I
> >> >> >>> >> >> run "ssh itioc5 -l douglas" then it logs me in just fine
> >> >> >>> >> >> (without asking for a password). Similarly the command "ssh
> >> >> >>> >> >> itioc5 -l douglas -f pwd" works fine, with the expected
> >> >> >>> >> >> directory name returned.
> >> >> >>> >> >>
> >> >> >>> >> >> What exactly is happening at the "Waiting for 0-th client to
> >> >> >>> >> >> connect." stage?
> >> >> >>> >> >>
> >> >> >>> >> >> Many thanks in advance for your thoughts.
> >> >> >>> >> >>
> >> >> >>> >> >> cheers,
> >> >> >>> >> >>
> >> >> >>> >> >> Doug
> >> >> >>> >> >>
> >> >> >>> >> >> _____________________________________________________
> >> >> >>> >> >> Dr. Douglas R. Houston
> >> >> >>> >> >> Lecturer
> >> >> >>> >> >> Institute of Structural and Molecular Biology
> >> >> >>> >> >> Room 3.23, Michael Swann Building
> >> >> >>> >> >> King's Buildings
> >> >> >>> >> >> University of Edinburgh
> >> >> >>> >> >> Edinburgh, EH9 3JR, UK
> >> >> >>> >> >> Tel. 0131 650 7358
> >> >> >>> >> >> http://tinyurl.com/douglasrhouston
> >> >> >>> >> >>
> >> >> >>> >> >> --
> >> >> >>> >> >> The University of Edinburgh is a charitable body, registered in
> >> >> >>> >> >> Scotland, with registration number SC005336.
> >> >> >>> >> >
> >> >> >>> >> >
> >> >> >>> >> >
> >> >> >>> >> > ---
> >> >> >>> >> > This e-mail is free of viruses and malware because avast!
> >> >> >>> >> > Antivirus protection is active.
> >> >> >>> >> > http://www.avast.com
> >> >> >>> >> >
> >> >> >>> >> >
> >> >> >>> >> >
> >> >> >>> >>
> >> >> >>> >>
> >> >> >>> >>
> >> >> >>> >>
> >> >> >>> >
> >> >> >>> >
> >> >> >>> >
> >> >> >>> >
> >> >> >>> >
> >> >> >>> >
> >> >> >>>
> >> >> >>>
> >> >> >>>
> >> >> >>>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >
> >> >> >
> >> >> >
> >> >> >
> >> >>
> >> >>
> >> >>
> >> >>
> >> >
> >> >
> >> >
> >> >
> >> >
> >> >
> >>
> >>
> >>
> >>
> >
> >
> >
> >
> >
> >
>
>
>
>


This archive was generated by hypermail 2.1.6 : Thu Dec 31 2015 - 23:20:41 CST