From: Douglas Houston (DouglasR.Houston_at_ed.ac.uk)
Date: Wed Jun 18 2014 - 07:57:59 CDT
itioc1 and 2 are in a different physical location to 3, 4, 5 and 6, and
presumably this means they're on a different subnet. Does this mean
they can't be used?
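
A quick way to see which nodes share a network is to group the addresses charmrun reported by network prefix. The sketch below is only illustrative: the /24 prefix length is an assumption (the real netmask is not given anywhere in this thread), and group_by_subnet is a hypothetical helper, not part of NAMD or Charm++. Being on different subnets does not necessarily rule the nodes out; whether they can talk to each other depends on the routing and firewalls between the two networks.

```python
import ipaddress

# Addresses as reported by charmrun in the quoted output below.
NODES = {
    "itioc1": "129.215.137.21",
    "itioc2": "129.215.137.123",
    "itioc3": "129.215.237.179",
    "itioc4": "129.215.237.180",
    "itioc5": "129.215.237.186",
    "itioc6": "129.215.237.187",
}

# ASSUMPTION: a /24 netmask. Check the real one with "ip addr" on each node.
ASSUMED_PREFIX_LEN = 24


def group_by_subnet(nodes, prefix_len):
    """Map each network (as a string) to the list of node names inside it."""
    groups = {}
    for name, ip in nodes.items():
        # ip_interface accepts host bits; .network masks them off.
        network = ipaddress.ip_interface(f"{ip}/{prefix_len}").network
        groups.setdefault(str(network), []).append(name)
    return groups


if __name__ == "__main__":
    for network, names in group_by_subnet(NODES, ASSUMED_PREFIX_LEN).items():
        print(network, ", ".join(names))
```

With a wider netmask (for example /16) all six addresses would fall into a single network, so the grouping depends entirely on the prefix length that is actually configured.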
Quoting Norman Geist <norman.geist_at_uni-greifswald.de> on Wed, 18 Jun
2014 14:52:29 +0200:
> Look at the lines marked with <-----
>
>> >> Charmrun> adding client 0: "itioc3", IP:129.215.237.179
>> >> Charmrun> adding client 1: "itioc4", IP:129.215.237.180
>> >> Charmrun> adding client 2: "itioc5", IP:129.215.237.186
>> >> Charmrun> adding client 3: "itioc6", IP:129.215.237.187
>> >> Charmrun> adding client 4: "itioc1", IP:129.215.137.21 <-----
>> >> Charmrun> adding client 5: "itioc2", IP:129.215.137.123 <-----
>> >> Charmrun> adding client 6: "itioc3", IP:129.215.237.179
>> >> Charmrun> adding client 7: "itioc4", IP:129.215.237.180
>
> These nodes don't seem to be on the same network as the other nodes!
> Something is definitely odd about your network config.
>
> Also:
>
>> 129.215.137.123 itioc2.bch.ed.ac.uk itioc2
>> 129.215.137.21 n3
>
> This again differs from what can be seen above (137 vs. 237).
> It doesn't look consistent and might be an error.
>
>
> Norman Geist.
>
>> -----Original Message-----
>> From: Douglas Houston [mailto:DouglasR.Houston_at_ed.ac.uk]
>> Sent: Wednesday, 18 June 2014 14:24
>> To: Norman Geist
>> Cc: Namd Mailing List
>> Subject: Re: AW: AW: namd-l: Using nodelist file causes namd to hang
>>
>> Here is an example of what /etc/hosts contains:
>>
>> # Do not remove the following line, or various programs
>> # that require network functionality will fail.
>> 127.0.0.1 localhost.localdomain localhost
>> 129.215.137.123 itioc2.bch.ed.ac.uk itioc2
>> ::1 localhost6.localdomain6 localhost6
>> 129.215.137.21 n3
>>
>> I'm not sure I can see anything wrong with it?
>>
>>
>>
>> Quoting Norman Geist <norman.geist_at_uni-greifswald.de> on Mon, 16 Jun
>> 2014 09:19:28 +0200:
>>
>> > This may be related to an unsuitable local DNS setup. Please check that,
>> > in "/etc/hosts" on every node, the node's hostname does not point to a
>> > loopback address such as 127.0.0.1 but to the outgoing IP. I wrote about
>> > this in another thread some time ago.
>> >
>> > Norman Geist.
>> >
>> >> -----Original Message-----
>> >> From: Douglas Houston [mailto:DouglasR.Houston_at_ed.ac.uk]
>> >> Sent: Friday, 13 June 2014 19:17
>> >> To: Norman Geist
>> >> Cc: Namd Mailing List
>> >> Subject: Re: AW: namd-l: Using nodelist file causes namd to hang
>> >>
>> >> Hi Norman,
>> >>
>> >> I have made some progress, I now get:
>> >>
>> >> [douglas_at_itioc1 200ns]$
>> >> /usr/people/douglas/programs/NAMD_2.9_Linux-x86/charmrun +p8
>> >> /usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2 ++verbose
>> >> mdrun.conf
>> >> Charmrun> charmrun started...
>> >> Charmrun> using ./nodelist as nodesfile
>> >> Charmrun> adding client 0: "itioc3", IP:129.215.237.179
>> >> Charmrun> adding client 1: "itioc4", IP:129.215.237.180
>> >> Charmrun> adding client 2: "itioc5", IP:129.215.237.186
>> >> Charmrun> adding client 3: "itioc6", IP:129.215.237.187
>> >> Charmrun> adding client 4: "itioc1", IP:129.215.137.21
>> >> Charmrun> adding client 5: "itioc2", IP:129.215.137.123
>> >> Charmrun> adding client 6: "itioc3", IP:129.215.237.179
>> >> Charmrun> adding client 7: "itioc4", IP:129.215.237.180
>> >> Charmrun> Charmrun = 129.215.137.21, port = 54043
>> >> start_nodes_rsh
>> >> Charmrun> Sending "0 129.215.137.21 54043 24199 0" to client 0.
>> >> Charmrun> find the node program
>> >> "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at
>> >> "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/unfold_MD_Cks1pep_par36_Skp2complex/200ns" for 0.
>> >> Charmrun> Starting ssh itioc3 -l douglas /bin/sh -f
>> >> Charmrun> remote shell (itioc3:0) started
>> >> Charmrun> Sending "1 129.215.137.21 54043 24199 0" to client 1.
>> >> Charmrun> find the node program
>> >> "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at
>> >> "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/unfold_MD_Cks1pep_par36_Skp2complex/200ns" for 1.
>> >> Charmrun> Starting ssh itioc4 -l douglas /bin/sh -f
>> >> Charmrun> remote shell (itioc4:1) started
>> >> Charmrun> Sending "2 129.215.137.21 54043 24199 0" to client 2.
>> >> Charmrun> find the node program
>> >> "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at
>> >> "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/unfold_MD_Cks1pep_par36_Skp2complex/200ns" for 2.
>> >> Charmrun> Starting ssh itioc5 -l douglas /bin/sh -f
>> >> Charmrun> remote shell (itioc5:2) started
>> >> Charmrun> Sending "3 129.215.137.21 54043 24199 0" to client 3.
>> >> Charmrun> find the node program
>> >> "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at
>> >> "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/unfold_MD_Cks1pep_par36_Skp2complex/200ns" for 3.
>> >> Charmrun> Starting ssh itioc6 -l douglas /bin/sh -f
>> >> Charmrun> remote shell (itioc6:3) started
>> >> Charmrun> Sending "4 129.215.137.21 54043 24199 0" to client 4.
>> >> Charmrun> find the node program
>> >> "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at
>> >> "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/unfold_MD_Cks1pep_par36_Skp2complex/200ns" for 4.
>> >> Charmrun> Starting ssh itioc1 -l douglas /bin/sh -f
>> >> Charmrun> remote shell (itioc1:4) started
>> >> Charmrun> Sending "5 129.215.137.21 54043 24199 0" to client 5.
>> >> Charmrun> find the node program
>> >> "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at
>> >> "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/unfold_MD_Cks1pep_par36_Skp2complex/200ns" for 5.
>> >> Charmrun> Starting ssh itioc2 -l douglas /bin/sh -f
>> >> Charmrun> remote shell (itioc2:5) started
>> >> Charmrun> Sending "6 129.215.137.21 54043 24199 0" to client 6.
>> >> Charmrun> find the node program
>> >> "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at
>> >> "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/unfold_MD_Cks1pep_par36_Skp2complex/200ns" for 6.
>> >> Charmrun> Starting ssh itioc3 -l douglas /bin/sh -f
>> >> Charmrun> remote shell (itioc3:6) started
>> >> Charmrun> Sending "7 129.215.137.21 54043 24199 0" to client 7.
>> >> Charmrun> find the node program
>> >> "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at
>> >> "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/unfold_MD_Cks1pep_par36_Skp2complex/200ns" for 7.
>> >> Charmrun> Starting ssh itioc4 -l douglas /bin/sh -f
>> >> Charmrun> remote shell (itioc4:7) started
>> >> Charmrun> node programs all started
>> >> Charmrun remote shell(itioc3.6)> remote responding...
>> >> Charmrun remote shell(itioc3.0)> remote responding...
>> >> Charmrun remote shell(itioc3.6)> starting node-program...
>> >> Charmrun remote shell(itioc3.0)> starting node-program...
>> >> Charmrun remote shell(itioc3.6)> rsh phase successful.
>> >> Charmrun remote shell(itioc3.0)> rsh phase successful.
>> >> Charmrun remote shell(itioc4.1)> remote responding...
>> >> Charmrun remote shell(itioc4.1)> starting node-program...
>> >> Charmrun remote shell(itioc4.1)> rsh phase successful.
>> >> Charmrun remote shell(itioc4.7)> remote responding...
>> >> Charmrun remote shell(itioc4.7)> starting node-program...
>> >> Charmrun remote shell(itioc4.7)> rsh phase successful.
>> >> Charmrun remote shell(itioc1.4)> remote responding...
>> >> Charmrun remote shell(itioc1.4)> starting node-program...
>> >> Charmrun remote shell(itioc1.4)> rsh phase successful.
>> >> Charmrun remote shell(itioc6.3)> remote responding...
>> >> Charmrun remote shell(itioc6.3)> starting node-program...
>> >> Charmrun remote shell(itioc6.3)> rsh phase successful.
>> >> Charmrun remote shell(itioc5.2)> remote responding...
>> >> Charmrun remote shell(itioc5.2)> starting node-program...
>> >> Charmrun remote shell(itioc5.2)> rsh phase successful.
>> >> Charmrun remote shell(itioc2.5)> remote responding...
>> >> Charmrun remote shell(itioc2.5)> starting node-program...
>> >> Charmrun remote shell(itioc2.5)> rsh phase successful.
>> >> Charmrun> Waiting for 0-th client to connect.
>> >> Charmrun> Waiting for 1-th client to connect.
>> >> Charmrun> Waiting for 2-th client to connect.
>> >> Charmrun> Waiting for 3-th client to connect.
>> >> Charmrun> Waiting for 4-th client to connect.
>> >> Charmrun> Waiting for 5-th client to connect.
>> >> Charmrun> client 0 connected (IP=129.215.237.179 data_port=45304)
>> >> Charmrun> client 6 connected (IP=129.215.237.179 data_port=54685)
>> >> Charmrun> client 4 connected (IP=129.215.137.21 data_port=49908)
>> >> Charmrun> client 5 connected (IP=129.215.137.123 data_port=40205)
>> >> Charmrun> client 1 connected (IP=129.215.237.180 data_port=47847)
>> >> Charmrun> client 7 connected (IP=129.215.237.180 data_port=45521)
>> >> Charmrun> Waiting for 6-th client to connect.
>> >> Charmrun> client 2 connected (IP=129.215.237.186 data_port=52855)
>> >> Charmrun> Waiting for 7-th client to connect.
>> >> Charmrun> client 3 connected (IP=129.215.237.187 data_port=50052)
>> >> Charmrun> All clients connected.
>> >> Charmrun> IP tables sent.
>> >> Charmrun> node programs all connected
>> >> Charmrun> started all node programs in 1.805 seconds.
>> >> Converse/Charm++ Commit ID: v6.4.0-beta1-0-g5776d21
>> >> Charm++> scheduler running in netpoll mode.
>> >> CharmLB> Load balancer assumes all CPUs are same.
>> >>
>> >>
>> >> Output to terminal halts at this point. All node processors are
>> >> running but nothing is written to disk. I see others have had this
>> >> problem before:
>> >>
>> >> http://www.ks.uiuc.edu/Research/namd/mailing_list/namd-l.2011-2012/3776.html
>> >>
>> >> I tried your suggestion of running charmrun with the debug option.
>> >> This causes 8 xterm windows to open, each with the following:
>> >>
>> >> GNU gdb (GDB) Fedora (7.2-16.fc14)
>> >> Copyright (C) 2010 Free Software Foundation, Inc.
>> >> License GPLv3+: GNU GPL version 3 or later
>> >> <http://gnu.org/licenses/gpl.html>
>> >> This is free software: you are free to change and redistribute it.
>> >> There is NO WARRANTY, to the extent permitted by law. Type "show copying"
>> >> and "show warranty" for details.
>> >> This GDB was configured as "x86_64-redhat-linux-gnu".
>> >> For bug reporting instructions, please see:
>> >> <http://www.gnu.org/software/gdb/bugs/>...
>> >> Reading symbols from
>> >> /usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2...(no debugging
>> >> symbols found)...done.
>> >> (gdb)
>> >>
>> >>
>> >> What should I try next?
>> >>
>> >> cheers,
>> >> Doug
>> >>
>> >>
>> >> Quoting Norman Geist <norman.geist_at_uni-greifswald.de> on Mon, 2 Jun
>> >> 2014 13:31:08 +0200:
>> >>
>> >> > The "HOST IDENTIFICATION HAS CHANGED" error means that the entries in
>> >> > "known_hosts" are no longer valid. It would therefore be easiest to
>> >> > delete ALL the "known_hosts" entries on all nodes and recreate them by
>> >> > sshing from each node to every other node, to localhost and to
>> >> > 127.0.0.1. It might be simpler still if the nodes shared the same
>> >> > identification, which can be done by mirroring the "~/.ssh" folder to
>> >> > all nodes after the clean-up.
>> >> >
>> >> > Norman Geist.
>> >> >
>> >> >
>> >> >> -----Original Message-----
>> >> >> From: Douglas Houston [mailto:DouglasR.Houston_at_ed.ac.uk]
>> >> >> Sent: Monday, 2 June 2014 13:26
>> >> >> To: Norman Geist
>> >> >> Subject: Re: AW: AW: AW: namd-l: Using nodelist file causes namd to hang
>> >> >>
>> >> >> Hi Norman,
>> >> >>
>> >> >> My .ssh/known_hosts contains one line for each of the itioc nodes,
>> >> >> plus one line for 127.0.0.1 and one line for localhost.
>> >> >>
>> >> >> Could you clarify which entries exactly I should delete in this file,
>> >> >> and also what you mean by "start over"?
>> >> >>
>> >> >> cheers,
>> >> >> Doug
>> >> >>
>> >> >>
>> >> >> Quoting Norman Geist <norman.geist_at_uni-greifswald.de> on Mon, 2 Jun
>> >> >> 2014 13:06:30 +0200:
>> >> >>
>> >> >> > The easiest way is to delete all the nodes' "known_hosts" entries and
>> >> >> > start over.
>> >> >> >
>> >> >> >
>> >> >> > Norman Geist.
>> >> >> >
>> >> >> >> -----Original Message-----
>> >> >> >> From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu]
>> >> >> >> On behalf of Douglas Houston
>> >> >> >> Sent: Monday, 2 June 2014 11:51
>> >> >> >> To: Namd Mailing List
>> >> >> >> Subject: Re: AW: AW: namd-l: Using nodelist file causes namd to hang
>> >> >> >>
>> >> >> >> Sorry for the long delay but I ran out of time to continue
>> >> >> >> troubleshooting this, until now.
>> >> >> >>
>> >> >> >> To recap, I have 6 nodes. When I'm logged in to e.g. itioc6 I can ssh
>> >> >> >> to localhost, 127.0.0.1, and 129.215.237.187 (itioc6's IP address).
>> >> >> >> But if I log in to e.g. itioc1, I can't ssh to localhost (see error
>> >> >> >> message below). If I change the key in ~douglas/.ssh/known_hosts to
>> >> >> >> make this work on itioc1 it stops working on itioc6.
>> >> >> >>
>> >> >> >> It looks like I can only have one working "localhost" or "127.0.0.1"
>> >> >> >> key in the known_hosts file, but as I understand it I need all my
>> >> >> >> itioc nodes to each have one. How can I achieve this?
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> Quoting Norman Geist <norman.geist_at_uni-greifswald.de> on Wed, 9 Apr
>> >> >> >> 2014 12:53:50 +0200:
>> >> >> >>
>> >> >> >> > This may be a hint. Your nodes must not only be able to log on to all
>> >> >> >> > nodes without a password, but should also be able to log on to
>> >> >> >> > themselves via their own IP address, localhost and 127.0.0.1.
>> >> >> >> >
>> >> >> >> > You may want to delete the wrong entries in ~/.ssh/known_hosts on the
>> >> >> >> > nodes, and recreate them by sshing to the targets mentioned above.
>> >> >> >> >
>> >> >> >> > Norman Geist.
>> >> >> >> >
>> >> >> >> >
>> >> >> >> >> -----Original Message-----
>> >> >> >> >> From: Douglas Houston [mailto:DouglasR.Houston_at_ed.ac.uk]
>> >> >> >> >> Sent: Wednesday, 9 April 2014 12:42
>> >> >> >> >> To: Norman Geist
>> >> >> >> >> Subject: Re: AW: namd-l: Using nodelist file causes namd to hang
>> >> >> >> >>
>> >> >> >> >> The same command without the ++local causes the nodelist file to be
>> >> >> >> >> used; I have already posted the output from this.
>> >> >> >> >>
>> >> >> >> >> If I delete the nodelist file, the same command without the ++local
>> >> >> >> >> (which causes the file /usr/people/douglas/.nodelist to be used)
>> >> >> >> >> outputs:
>> >> >> >> >>
>> >> >> >> >>
>> >> >> >> >> Charmrun> charmrun started...
>> >> >> >> >> Charmrun> using /usr/people/douglas/.nodelist as nodesfile
>> >> >> >> >> Charmrun> adding client 0: "localhost", IP:127.0.0.1
>> >> >> >> >> Charmrun> Charmrun = 129.215.237.187, port = 35909
>> >> >> >> >> start_nodes_rsh
>> >> >> >> >> Charmrun> Sending "0 129.215.237.187 35909 27843 0" to client 0.
>> >> >> >> >> Charmrun> find the node program
>> >> >> >> >> "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at
>> >> >> >> >> "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 0.
>> >> >> >> >> Charmrun> Starting ssh localhost -l douglas /bin/sh -f
>> >> >> >> >> Charmrun> remote shell (localhost:0) started
>> >> >> >> >> Charmrun> node programs all started
>> >> >> >> >> @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
>> >> >> >> >> @ WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @
>> >> >> >> >> @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
>> >> >> >> >> IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
>> >> >> >> >> Someone could be eavesdropping on you right now (man-in-the-middle
>> >> >> >> >> attack)!
>> >> >> >> >> It is also possible that the RSA host key has just been changed.
>> >> >> >> >> The fingerprint for the RSA key sent by the remote host is
>> >> >> >> >> 99:cb:e0:0a:77:8b:61:fd:19:01:57:93:ec:93:99:63.
>> >> >> >> >> Please contact your system administrator.
>> >> >> >> >> Add correct host key in /usr/people/douglas/.ssh/known_hosts to get
>> >> >> >> >> rid of this message.
>> >> >> >> >> Offending key in /usr/people/douglas/.ssh/known_hosts:47
>> >> >> >> >> RSA host key for localhost has changed and you have requested strict
>> >> >> >> >> checking.
>> >> >> >> >> Host key verification failed.
>> >> >> >> >> Charmrun> Error 255 returned from rsh (localhost:0)
>> >> >> >> >>
>> >> >> >> >>
>> >> >> >> >> The file /usr/people/douglas/.nodelist contains:
>> >> >> >> >> group main
>> >> >> >> >> host localhost
>> >> >> >> >>
>> >> >> >> >>
>> >> >> >> >>
>> >> >> >> >>
>> >> >> >> >>
>> >> >> >> >> Quoting Norman Geist <norman.geist_at_uni-greifswald.de> on Wed, 9 Apr
>> >> >> >> >> 2014 12:28:51 +0200:
>> >> >> >> >>
>> >> >> >> >> > Please try the same command without ++local and see if it still
>> >> >> >> >> > works.
>> >> >> >> >> >
>> >> >> >> >> >> -----Original Message-----
>> >> >> >> >> >> From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu]
>> >> >> >> >> >> On behalf of Douglas Houston
>> >> >> >> >> >> Sent: Wednesday, 9 April 2014 11:49
>> >> >> >> >> >> To: ramya narasimhan
>> >> >> >> >> >> Cc: Namd Mailing List
>> >> >> >> >> >> Subject: Re: namd-l: Using nodelist file causes namd to hang
>> >> >> >> >> >>
>> >> >> >> >> >> The result is the same whichever order the nodes are present in the
>> >> >> >> >> >> list.
>> >> >> >> >> >>
>> >> >> >> >> >> What exactly is Charmrun waiting for at the "Waiting for 0-th client
>> >> >> >> >> >> to connect." stage? Presumably the 0th client is the first in
>> >> >> >> >> >> nodelist, and that a process is supposed to start on that node, then
>> >> >> >> >> >> "connect" to Charmrun on the host machine?
>> >> >> >> >> >
>> >> >> >> >> > Charmrun just spawns the namd processes and is now waiting for them
>> >> >> >> >> > to start talking.
>> >> >> >> >> >
>> >> >> >> >> >>
>> >> >> >> >> >> Using the command top I see no evidence of anything new starting on
>> >> >> >> >> >> the node, despite all the "starting node-program" and "rsh phase
>> >> >> >> >> >> successful" messages that are output.
>> >> >> >> >> >>
>> >> >> >> >> >> Using "ps -u douglas" on the node shows a whole bunch of tcsh and sh
>> >> >> >> >> >> shells and sleep processes starting then dying but nothing else.
>> >> >> >> >> >>
>> >> >> >> >> >> What does the line "Sending "0 129.215.237.187 57453 26737 0" to
>> >> >> >> >> >> client 0" mean? How is this "sending" achieved? I see "port 57453" is
>> >> >> >> >> >> mentioned in the output ...
>> >> >> >> >> >
>> >> >> >> >> > That seems to be part of the parallel startup, where the spawned
>> >> >> >> >> > processes get the information about each other.
>> >> >> >> >> >
>> >> >> >> >> >>
>> >> >> >> >> >>
>> >> >> >> >> >>
>> >> >> >> >> >>
>> >> >> >> >> >> Quoting ramya narasimhan <ramya_jln_at_yahoo.co.in> on Wed, 9 Apr 2014
>> >> >> >> >> >> 11:51:52 +0800 (SGT):
>> >> >> >> >> >>
>> >> >> >> >> >> > Just change the order of the hostnames [IPs of the systems] in the
>> >> >> >> >> >> > nodefile, so that the 0-th client will be itioc5 instead of itioc1,
>> >> >> >> >> >> > to find out whether the problem is with the nodes.
>> >> >> >> >> >> >
>> >> >> >> >> >> >
>> >> >> >> >> >> > Dr. Ramya.L.
>> >> >> >> >> >> > On Tuesday, 8 April 2014 7:23 PM, Douglas Houston
>> >> >> >> >> >> > <DouglasR.Houston_at_ed.ac.uk> wrote:
>> >> >> >> >> >> >
>> >> >> >> >> >> > Yes, with ping all the nodes resolve to full hostnames and IP
>> >> >> >> >> >> > addresses. I tried putting IP addresses into nodelist instead of
>> >> >> >> >> >> > hostnames but it still times out at "Waiting for 0-th client to
>> >> >> >> >> >> > connect"
>> >> >> >> >> >> >
>> >> >> >> >> >> >
>> >> >> >> >> >> > Quoting Norman Geist <norman.geist_at_uni-greifswald.de> on Tue, 8 Apr
>> >> >> >> >> >> > 2014 14:30:15 +0200:
>> >> >> >> >> >> >
>> >> >> >> >> >> >> On all the nodes? Otherwise try a nodelist with IP addresses instead
>> >> >> >> >> >> >> of hostnames. If that works, you have a problem with local DNS.
>> >> >> >> >> >> >>
>> >> >> >> >> >> >> Norman Geist.
>> >> >> >> >> >> >>
>> >> >> >> >> >> >>
>> >> >> >> >> >> >>> -----Original Message-----
>> >> >> >> >> >> >>> From: Douglas Houston [mailto:DouglasR.Houston_at_ed.ac.uk]
>> >> >> >> >> >> >>> Sent: Tuesday, 8 April 2014 14:14
>> >> >> >> >> >> >>> To: Norman Geist
>> >> >> >> >> >> >>> Cc: Namd Mailing List
>> >> >> >> >> >> >>> Subject: Re: AW: AW: namd-l: Using nodelist file causes namd to hang
>> >> >> >> >> >> >>>
>> >> >> >> >> >> >>> Thanks Norman. I had found that thread after my searches but it did
>> >> >> >> >> >> >>> not seem to apply to my problem.
>> >> >> >> >> >> >>>
>> >> >> >> >> >> >>> "You can check this while doing a ping to the hostname, while you are
>> >> >> >> >> >> >>> logged in at a compute node "ping hostname". If this returns an
>> >> >> >> >> >> >>> 127.x.x.x address, your local DNS configuration is not suitable for
>> >> >> >> >> >> >>> charmrun"
>> >> >> >> >> >> >>>
>> >> >> >> >> >> >>> My ping returns the full name and IP address of the node, not
>> >> >> >> >> >> >>> 127.x.x.x.
>> >> >> >> >> >> >>>
>> >> >> >> >> >> >>>
>> >> >> >> >> >> >>>
>> >> >> >> >> >> >>> Quoting Norman Geist <norman.geist_at_uni-greifswald.de> on Tue, 8 Apr
>> >> >> >> >> >> >>> 2014 13:22:41 +0200:
>> >> >> >> >> >> >>>
>> >> >> >> >> >> >>> > Now I remember that I already posted a solution for this some weeks
>> >> >> >> >> >> >>> > ago; you could have found it by using google.de. Maybe this helps you.
>> >> >> >> >> >> >>> >
>> >> >> >> >> >> >>> > http://www.ks.uiuc.edu/Research/namd/mailing_list/namd-l.2012-2013/2645.html
>> >> >> >> >> >> >>> >
>> >> >> >> >> >> >>> > Norman Geist.
>> >> >> >> >> >> >>> >
>> >> >> >> >> >> >>> >
>> >> >> >> >> >> >>> >> -----Original Message-----
>> >> >> >> >> >> >>> >> From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu]
>> >> >> >> >> >> >>> >> On behalf of Douglas Houston
>> >> >> >> >> >> >>> >> Sent: Tuesday, 8 April 2014 12:53
>> >> >> >> >> >> >>> >> To: Norman Geist
>> >> >> >> >> >> >>> >> Cc: Namd Mailing List
>> >> >> >> >> >> >>> >> Subject: Re: AW: namd-l: Using nodelist file causes namd to hang
>> >> >> >> >> >> >>> >>
>> >> >> >> >> >> >>> >> Thanks for the tip Norman, but if I change my command to the following
>> >> >> >> >> >> >>> >> it still hangs at the same point:
>> >> >> >> >> >> >>> >>
>> >> >> >> >> >> >>> >> /usr/people/douglas/programs/NAMD_2.9_Linux-x86/charmrun +p12
>> >> >> >> >> >> >>> >> ++remote-shell ssh
>> >> >> >> >> >> >>> >> /usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2 ++verbose
>> >> >> >> >> >> >>> >> mdrun.conf
>> >> >> >> >> >> >>> >>
>> >> >> >> >> >> >>> >>
>> >> >> >> >> >> >>> >>
>> >> >> >> >> >> >>> >> Quoting Norman Geist <norman.geist_at_uni-greifswald.de> on Tue, 8 Apr
>> >> >> >> >> >> >>> >> 2014 12:06:03 +0200:
>> >> >> >> >> >> >>> >>
>> >> >> >> >> >> >>> >> > Try the charmrun option "++remote-shell ssh".
>> >> >> >> >> >> >>> >> >
>> >> >> >> >> >> >>> >> > Norman Geist.
>> >> >> >> >> >> >>> >> >
>> >> >> >> >> >> >>> >> >> -----Original Message-----
>> >> >> >> >> >> >>> >> >> From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu]
>> >> >> >> >> >> >>> >> >> On behalf of Douglas Houston
>> >> >> >> >> >> >>> >> >> Sent: Tuesday, 8 April 2014 11:30
>> >> >> >> >> >> >>> >> >> To: namd-l_at_ks.uiuc.edu
>> >> >> >> >> >> >>> >> >> Subject: namd-l: Using nodelist file causes namd to hang
>> >> >> >> >> >> >>> >> >>
>> >> >> >> >> >> >>> >> >> I have two nodes connected via ethernet: itioc5 and itioc1
>> >> >> >> >> >> >>> >> >>
>> >> >> >> >> >> >>> >> >> I have the following in my nodelist file:
>> >> >> >> >> >> >>> >> >>
>> >> >> >> >> >> >>> >> >> group main
>> >> >> >> >> >> >>> >> >> host itioc1
>> >> >> >> >> >> >>> >> >> host itioc5
>> >> >> >> >> >> >>> >> >>
>> >> >> >> >> >> >>> >> >> I am using the following command:
>> >> >> >> >> >> >>> >> >>
>> >> >> >> >> >> >>> >> >> /usr/people/douglas/programs/NAMD_2.9_Linux-x86/charmrun +p12
>> >> >> >> >> >> >>> >> >> /usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2 ++verbose
>> >> >> >> >> >> >>> >> >> mdrun.conf
>> >> >> >> >> >> >>> >> >>
>> >> >> >> >> >> >>> >> >> I get the following output:
>> >> >> >> >> >> >>> >> >>
>> >> >> >> >> >> >>> >> >> Charmrun> charmrun started...
>> >> >> >> >> >> >>> >> >> Charmrun> using ./nodelist as nodesfile
>> >> >> >> >> >> >>> >> >> Charmrun> adding client 0: "itioc1", IP:129.215.137.21
>> >> >> >> >> >> >>> >> >> Charmrun> adding client 1: "itioc5", IP:129.215.237.186
>> >> >> >> >> >> >>> >> >> Charmrun> adding client 2: "itioc1", IP:129.215.137.21
>> >> >> >> >> >> >>> >> >> Charmrun> adding client 3: "itioc5", IP:129.215.237.186
>> >> >> >> >> >> >>> >> >> Charmrun> adding client 4: "itioc1", IP:129.215.137.21
>> >> >> >> >> >> >>> >> >> Charmrun> adding client 5: "itioc5", IP:129.215.237.186
>> >> >> >> >> >> >>> >> >> Charmrun> adding client 6: "itioc1", IP:129.215.137.21
>> >> >> >> >> >> >>> >> >> Charmrun> adding client 7: "itioc5", IP:129.215.237.186
>> >> >> >> >> >> >>> >> >> Charmrun> adding client 8: "itioc1", IP:129.215.137.21
>> >> >> >> >> >> >>> >> >> Charmrun> adding client 9: "itioc5", IP:129.215.237.186
>> >> >> >> >> >> >>> >> >> Charmrun> adding client 10: "itioc1", IP:129.215.137.21
>> >> >> >> >> >> >>> >> >> Charmrun> adding client 11: "itioc5", IP:129.215.237.186
>> >> >> >> >> >> >>> >> >> Charmrun> Charmrun = 129.215.237.187, port = 58330
>> >> >> >> >> >> >>> >> >> start_nodes_rsh
>> >> >> >> >> >> >>> >> >> Charmrun> Sending "0 129.215.237.187 58330 19205 0" to client 0.
>> >> >> >> >> >> >>> >> >> Charmrun> find the node program
>> >> >> >> >> >> >>> >> >> "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at
>> >> >> >> >> >> >>> >> >> "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 0.
>> >> >> >> >> >> >>> >> >> Charmrun> Starting ssh itioc1 -l douglas /bin/sh -f
>> >> >> >> >> >> >>> >> >> Charmrun> remote shell (itioc1:0) started
>> >> >> >> >> >> >>> >> >> Charmrun> Sending "1 129.215.237.187 58330 19205 0" to client 1.
>> >> >> >> >> >> >>> >> >> Charmrun> find the node program
>> >> >> >> >> >> >>> >> >> "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at
>> >> >> >> >> >> >>> >> >> "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 1.
>> >> >> >> >> >> >>> >> >> Charmrun> Starting ssh itioc5 -l douglas /bin/sh -f
>> >> >> >> >> >> >>> >> >> Charmrun> remote shell (itioc5:1) started
>> >> >> >> >> >> >>> >> >> Charmrun> Sending "2 129.215.237.187 58330 19205 0" to client 2.
>> >> >> >> >> >> >>> >> >> Charmrun> find the node program
>> >> >> >> >> >> >>> >> >> "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at
>> >> >> >> >> >> >>> >> >> "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 2.
>> >> >> >> >> >> >>> >> >> Charmrun> Starting ssh itioc1 -l douglas /bin/sh -f
>> >> >> >> >> >> >>> >> >> Charmrun> remote shell (itioc1:2) started
>> >> >> >> >> >> >>> >> >> Charmrun> Sending "3 129.215.237.187 58330 19205 0" to client 3.
>> >> >> >> >> >> >>> >> >> Charmrun> find the node program
>> >> >> >> >> >> >>> >> >> "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at
>> >> >> >> >> >> >>> >> >> "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 3.
>> >> >> >> >> >> >>> >> >> Charmrun> Starting ssh itioc5 -l douglas /bin/sh -f
>> >> >> >> >> >> >>> >> >> Charmrun> remote shell (itioc5:3) started
>> >> >> >> >> >> >>> >> >> Charmrun> Sending "4 129.215.237.187 58330 19205 0" to client 4.
>> >> >> >> >> >> >>> >> >> Charmrun> find the node program
>> >> >> >> >> >> >>> >> >> "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at
>> >> >> >> >> >> >>> >> >> "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 4.
>> >> >> >> >> >> >>> >> >> Charmrun> Starting ssh itioc1 -l douglas /bin/sh -f
>> >> >> >> >> >> >>> >> >> Charmrun> remote shell (itioc1:4) started
>> >> >> >> >> >> >>> >> >> Charmrun> Sending "5 129.215.237.187 58330 19205 0" to client 5.
>> >> >> >> >> >> >>> >> >> Charmrun> find the node program
>> >> >> >> >> >> >>> >> >> "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at
>> >> >> >> >> >> >>> >> >> "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 5.
>> >> >> >> >> >> >>> >> >> Charmrun> Starting ssh itioc5 -l douglas /bin/sh -f
>> >> >> >> >> >> >>> >> >> Charmrun> remote shell (itioc5:5) started
>> >> >> >> >> >> >>> >> >> Charmrun> Sending "6 129.215.237.187 58330
>> 19205
>> >> 0"
>> >> >> to
>> >> >> >> >> client
>> >> >> >> >> >> 6.
>> >> >> >> >> >> >>> >> >> Charmrun> find the node program
>> >> >> >> >> >> >>> >> >> "/usr/people/douglas/programs/NAMD_2.9_Linux-
>> >> >> x86/namd2"
>> >> >> >> at
>> >> >> >> >> >> >>> >> >>
>> >> >> >> >> >> >>> >>
>> >> >> >> >> >> >>>
>> >> >> >> >> >>
>> >> >> >> >>
>> >> >> >>
>> >> >>
>> >>
>> "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc
>> >> >> >> >> >> >>> >> >> " for
>> >> >> >> >> >> >>> >> >> 6.
>> >> >> >> >> >> >>> >> >> Charmrun> Starting ssh itioc1 -l douglas /bin/sh -f
>> >> >> >> >> >> >>> >> >> Charmrun> remote shell (itioc1:6) started
>> >> >> >> >> >> >>> >> >> Charmrun> Sending "7 129.215.237.187 58330 19205 0" to client 7.
>> >> >> >> >> >> >>> >> >> Charmrun> find the node program
>> >> >> >> >> >> >>> >> >> "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at
>> >> >> >> >> >> >>> >> >> "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 7.
>> >> >> >> >> >> >>> >> >> Charmrun> Starting ssh itioc5 -l douglas /bin/sh -f
>> >> >> >> >> >> >>> >> >> Charmrun> remote shell (itioc5:7) started
>> >> >> >> >> >> >>> >> >> Charmrun> Sending "8 129.215.237.187 58330 19205 0" to client 8.
>> >> >> >> >> >> >>> >> >> Charmrun> find the node program
>> >> >> >> >> >> >>> >> >> "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at
>> >> >> >> >> >> >>> >> >> "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 8.
>> >> >> >> >> >> >>> >> >> Charmrun> Starting ssh itioc1 -l douglas /bin/sh -f
>> >> >> >> >> >> >>> >> >> Charmrun> remote shell (itioc1:8) started
>> >> >> >> >> >> >>> >> >> Charmrun> Sending "9 129.215.237.187 58330 19205 0" to client 9.
>> >> >> >> >> >> >>> >> >> Charmrun> find the node program
>> >> >> >> >> >> >>> >> >> "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at
>> >> >> >> >> >> >>> >> >> "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 9.
>> >> >> >> >> >> >>> >> >> Charmrun> Starting ssh itioc5 -l douglas /bin/sh -f
>> >> >> >> >> >> >>> >> >> Charmrun> remote shell (itioc5:9) started
>> >> >> >> >> >> >>> >> >> Charmrun> Sending "10 129.215.237.187 58330 19205 0" to client 10.
>> >> >> >> >> >> >>> >> >> Charmrun> find the node program
>> >> >> >> >> >> >>> >> >> "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at
>> >> >> >> >> >> >>> >> >> "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 10.
>> >> >> >> >> >> >>> >> >> Charmrun> Starting ssh itioc1 -l douglas /bin/sh -f
>> >> >> >> >> >> >>> >> >> Charmrun> remote shell (itioc1:10) started
>> >> >> >> >> >> >>> >> >> Charmrun> Sending "11 129.215.237.187 58330 19205 0" to client 11.
>> >> >> >> >> >> >>> >> >> Charmrun> find the node program
>> >> >> >> >> >> >>> >> >> "/usr/people/douglas/programs/NAMD_2.9_Linux-x86/namd2" at
>> >> >> >> >> >> >>> >> >> "/usr/people/douglas/projects/UPS/targets/SCF/2AST/MD/parallelise_itioc" for 11.
>> >> >> >> >> >> >>> >> >> Charmrun> Starting ssh itioc5 -l douglas /bin/sh -f
>> >> >> >> >> >> >>> >> >> Charmrun> remote shell (itioc5:11) started
>> >> >> >> >> >> >>> >> >> Charmrun> node programs all started
>> >> >> >> >> >> >>> >> >> Charmrun remote shell(itioc5.3)> remote responding...
>> >> >> >> >> >> >>> >> >> Charmrun remote shell(itioc5.5)> remote responding...
>> >> >> >> >> >> >>> >> >> Charmrun remote shell(itioc5.3)> starting node-program...
>> >> >> >> >> >> >>> >> >> Charmrun remote shell(itioc5.5)> starting node-program...
>> >> >> >> >> >> >>> >> >> Charmrun remote shell(itioc5.3)> rsh phase successful.
>> >> >> >> >> >> >>> >> >> Charmrun remote shell(itioc5.5)> rsh phase successful.
>> >> >> >> >> >> >>> >> >> Charmrun remote shell(itioc5.9)> remote responding...
>> >> >> >> >> >> >>> >> >> Charmrun remote shell(itioc5.7)> remote responding...
>> >> >> >> >> >> >>> >> >> Charmrun remote shell(itioc5.11)> remote responding...
>> >> >> >> >> >> >>> >> >> Charmrun remote shell(itioc5.1)> remote responding...
>> >> >> >> >> >> >>> >> >> Charmrun remote shell(itioc5.9)> starting node-program...
>> >> >> >> >> >> >>> >> >> Charmrun remote shell(itioc5.7)> starting node-program...
>> >> >> >> >> >> >>> >> >> Charmrun remote shell(itioc5.9)> rsh phase successful.
>> >> >> >> >> >> >>> >> >> Charmrun remote shell(itioc5.7)> rsh phase successful.
>> >> >> >> >> >> >>> >> >> Charmrun remote shell(itioc5.11)> starting node-program...
>> >> >> >> >> >> >>> >> >> Charmrun remote shell(itioc5.1)> starting node-program...
>> >> >> >> >> >> >>> >> >> Charmrun remote shell(itioc5.11)> rsh phase successful.
>> >> >> >> >> >> >>> >> >> Charmrun remote shell(itioc5.1)> rsh phase successful.
>> >> >> >> >> >> >>> >> >> Charmrun remote shell(itioc1.10)> remote responding...
>> >> >> >> >> >> >>> >> >> Charmrun remote shell(itioc1.0)> remote responding...
>> >> >> >> >> >> >>> >> >> Charmrun remote shell(itioc1.4)> remote responding...
>> >> >> >> >> >> >>> >> >> Charmrun remote shell(itioc1.10)> starting node-program...
>> >> >> >> >> >> >>> >> >> Charmrun remote shell(itioc1.10)> rsh phase successful.
>> >> >> >> >> >> >>> >> >> Charmrun remote shell(itioc1.0)> starting node-program...
>> >> >> >> >> >> >>> >> >> Charmrun remote shell(itioc1.0)> rsh phase successful.
>> >> >> >> >> >> >>> >> >> Charmrun remote shell(itioc1.4)> starting node-program...
>> >> >> >> >> >> >>> >> >> Charmrun remote shell(itioc1.4)> rsh phase successful.
>> >> >> >> >> >> >>> >> >> Charmrun remote shell(itioc1.2)> remote responding...
>> >> >> >> >> >> >>> >> >> Charmrun remote shell(itioc1.6)> remote responding...
>> >> >> >> >> >> >>> >> >> Charmrun remote shell(itioc1.8)> remote responding...
>> >> >> >> >> >> >>> >> >> Charmrun remote shell(itioc1.2)> starting node-program...
>> >> >> >> >> >> >>> >> >> Charmrun remote shell(itioc1.2)> rsh phase successful.
>> >> >> >> >> >> >>> >> >> Charmrun remote shell(itioc1.6)> starting node-program...
>> >> >> >> >> >> >>> >> >> Charmrun remote shell(itioc1.6)> rsh phase successful.
>> >> >> >> >> >> >>> >> >> Charmrun remote shell(itioc1.8)> starting node-program...
>> >> >> >> >> >> >>> >> >> Charmrun remote shell(itioc1.8)> rsh phase successful.
>> >> >> >> >> >> >>> >> >> Charmrun> Waiting for 0-th client to connect.
>> >> >> >> >> >> >>> >> >> Charmrun> error 0 attaching to node:
>> >> >> >> >> >> >>> >> >> Timeout waiting for node-program to connect
>> >> >> >> >> >> >>> >> >>
>> >> >> >> >> >> >>> >> >>
>> >> >> >> >> >> >>> >> >> I'm not sure, but I think the
>> >> >> >> >> >> >>> >> >> "Starting ssh itioc5 -l douglas /bin/sh -f"
>> >> >> >> >> >> >>> >> >> lines have something to do with it. If I run the command
>> >> >> >> >> >> >>> >> >> "ssh itioc5 -l douglas /bin/sh -f" it also hangs. If I run
>> >> >> >> >> >> >>> >> >> "ssh itioc5 -l douglas" then it logs me in just fine
>> >> >> >> >> >> >>> >> >> (without asking for a password). Similarly, the command
>> >> >> >> >> >> >>> >> >> "ssh itioc5 -l douglas -f pwd" works fine, with the
>> >> >> >> >> >> >>> >> >> expected directory name returned.
>> >> >> >> >> >> >>> >> >>
>> >> >> >> >> >> >>> >> >> What exactly is happening at the
>> >> >> >> >> >> >>> >> >> "Waiting for 0-th client to connect." stage?
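
One way to narrow this down is to test, from each compute node, whether a plain TCP connection back to the machine running charmrun succeeds. The sketch below assumes that the second and third fields of the string in the 'Charmrun> Sending "..."' lines are the address and port charmrun listens on. That is a reading of the log above, not something confirmed in this thread. The names parse_sending_line and can_connect are made up for this illustration.

```python
import socket


def parse_sending_line(payload):
    """Split the quoted string from a 'Charmrun> Sending "..."' log line.

    The field meanings are ASSUMED from the log in this thread,
    not taken from the Charm++ documentation.
    """
    fields = payload.split()
    return {
        "client": int(fields[0]),   # client number shown in the log
        "host": fields[1],          # assumed: address charmrun listens on
        "port": int(fields[2]),     # assumed: port charmrun listens on
    }


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable networks and timeouts.
        return False
```

For example, while charmrun is still waiting, running can_connect("129.215.237.187", 58330) on itioc1 and on itioc5 would show whether either machine can reach that address at all. The port changes on every run, so it has to be read from the current log.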
>> >> >> >> >> >> >>> >> >>
>> >> >> >> >> >> >>> >> >> Many thanks in advance for your thoughts.
>> >> >> >> >> >> >>> >> >>
>> >> >> >> >> >> >>> >> >> cheers,
>> >> >> >> >> >> >>> >> >>
>> >> >> >> >> >> >>> >> >> Doug
>> >> >> >> >> >> >>> >> >>
>> >> >> >> >> >> >>> >> >>
>> >> >> >> >> >> >>> >> >> _____________________________________________________
>> >> >> >> >> >> >>> >> >> Dr. Douglas R. Houston
>> >> >> >> >> >> >>> >> >> Lecturer
>> >> >> >> >> >> >>> >> >> Institute of Structural and Molecular Biology
>> >> >> >> >> >> >>> >> >> Room 3.23, Michael Swann Building
>> >> >> >> >> >> >>> >> >> King's Buildings
>> >> >> >> >> >> >>> >> >> University of Edinburgh
>> >> >> >> >> >> >>> >> >> Edinburgh, EH9 3JR, UK
>> >> >> >> >> >> >>> >> >> Tel. 0131 650 7358
>> >> >> >> >> >> >>> >> >> http://tinyurl.com/douglasrhouston
>> >> >> >> >> >> >>> >> >>
>> >> >> >> >> >> >>> >> >> --
>> >> >> >> >> >> >>> >> >> The University of Edinburgh is a charitable body, registered in
>> >> >> >> >> >> >>> >> >> Scotland, with registration number SC005336.
_____________________________________________________
Dr. Douglas R. Houston
Lecturer
Institute of Structural and Molecular Biology
Room 3.23, Michael Swann Building
King's Buildings
University of Edinburgh
Edinburgh, EH9 3JR, UK
Tel. 0131 650 7358
http://tinyurl.com/douglasrhouston
--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
This archive was generated by hypermail 2.1.6 : Thu Dec 31 2015 - 23:20:52 CST