From: Scott Brozell (srb_at_osc.edu)
Date: Thu May 18 2017 - 13:28:52 CDT
Hi,
Presumably your cluster is on a trusted network; nevertheless:
1. I would not use an automatic workaround. Instead apply the
scientific method - keep a record of these instances and report
them to your cluster support staff. These are unusual events
in my experience. Even in the most likely case that there is
nothing suspicious going on, your cluster should have a policy
and notification mechanism for the underlying issue (which is
possibly merely cluster maintenance).
2. If you use this automatic workaround then make the pattern
more specific to your cluster's hostname, i.e., replace the asterisk
with yourhost.org
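
For item 2, a minimal sketch of a narrower rule, written as a small shell
helper that appends a host-scoped stanza to the ssh config file. The function
name add_host_stanza and the pattern *.yourhost.org are placeholders and do
not come from this thread; ssh_config Host patterns accept * wildcards, so
substitute whatever matches only your cluster's nodes. This still disables
host key checking for the matched hosts; it only limits the scope.

```shell
# Hypothetical helper: append a host-scoped stanza to an ssh config file.
# $1 = Host pattern (placeholder example: '*.yourhost.org')
# $2 = config file path (defaults to ~/.ssh/config)
add_host_stanza() {
    pattern=$1
    config=${2:-$HOME/.ssh/config}
    mkdir -p "$(dirname "$config")"
    # Do nothing if a stanza for exactly this pattern already exists.
    if [ -f "$config" ] && grep -qxF "Host $pattern" "$config"; then
        return 0
    fi
    printf 'Host %s\n    StrictHostKeyChecking no\n' "$pattern" >> "$config"
    # Keep the config file private to the user.
    chmod 600 "$config"
}
```

Calling add_host_stanza '*.yourhost.org' once would add the stanza to
~/.ssh/config; calling it again with the same pattern leaves the file
unchanged.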
scott
On Thu, May 18, 2017 at 07:27:27AM +0000, Zeki Zeybek wrote:
> I figured out a cruder way of handling the problem. Create a new file named exactly "config" in the .ssh directory under your home directory (not scratch), i.e. clustername/home/accountName/.ssh.
>
>
> Add the following lines to the config file:
>
>
> Host *
> StrictHostKeyChecking no
>
>
> ________________________________
> From: Zeki Zeybek
> Sent: 12 May 2017 10:05:13
> To: Boonstra, S.; namd-l_at_ks.uiuc.edu
> Subject: Re: namd-l: CHARMRUN ERROR
>
> Thank you for your help and for explaining the cause of the problem. Interestingly, the problem somehow resolved itself: I tried to start the simulation again after an hour or so and it worked like a charm. Once again, thank you for the insight into the issue.
>
> ________________________________
> From: Boonstra, S. <s.boonstra_at_rug.nl>
> Sent: Thursday, May 11, 2017 11:03:38 PM
> To: namd-l_at_ks.uiuc.edu; Zeki Zeybek
> Subject: Re: namd-l: CHARMRUN ERROR
>
> Hi Zeki,
>
> I dealt with the same problem on our cluster just yesterday.
>
> Possibly, the RSA fingerprint of the node(s) has changed.
> See also http://www.ks.uiuc.edu/Research/namd/mailing_list/namd-l.2013-2014/2465.html
> and
> https://askubuntu.com/questions/45679/ssh-connection-problem-with-host-key-verification-failed-error
>
> You can renew the fingerprints (they end up in .ssh/known_hosts) of all the nodes (or nodes in $server_list)
> with a (bash) script like
>
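> # List unique node names (Slurm specific); tcn1[67] matches this cluster's node names
> server_list=$(sinfo -N --format="%N" | sort -u | grep 'tcn1[67]')
> for h in $server_list; do
>     printf '%s ' "$h"   # verbose
>     ip=$(dig +search +short "$h")
>     # Remove stale entries for both the hostname and its IP
>     ssh-keygen -R "$h"
>     ssh-keygen -R "$ip"
>     # Fetch the current keys and store them hashed (-H)
>     ssh-keyscan -H "$ip" >> ~/.ssh/known_hosts
>     ssh-keyscan -H "$h" >> ~/.ssh/known_hosts
> done
> echo   # verbose: end the line of printed hostnames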
>
>
> On Thu, May 11, 2017 at 9:39 AM, Zeki Zeybek <zeki.zeybek_at_bilgiedu.net> wrote:
>
> Hi!
>
>
> Everything had been running smoothly until today. I did not change anything in the script or in the config file. The error output is:
>
> (sardalya is the name of the partition whose nodes I am trying to use.)
>
>
> ssh_askpass: exec(/usr/libexec/openssh/ssh-askpass): No such file or directory^M
> Host key verification failed.^M
> Charmrun> Error 255 returned from remote shell (sardalya78:0)
> Charmrun> Reconnection attempt 1 of 3
> ssh_askpass: exec(/usr/libexec/openssh/ssh-askpass): No such file or directory^M
> Host key verification failed.^M
> ssh_askpass: exec(/usr/libexec/openssh/ssh-askpass): No such file or directory^M
> Host key verification failed.^M
> ssh_askpass: exec(/usr/libexec/openssh/ssh-askpass): No such file or directory^M
> Host key verification failed.^M
> Charmrun> Error 255 returned from remote shell (sardalya79:1)
> Charmrun> Reconnection attempt 1 of 3
> Charmrun> Error 255 returned from remote shell (sardalya80:2)
> Charmrun> Reconnection attempt 1 of 3
> Charmrun> Error 255 returned from remote shell (sardalya81:3)
> Charmrun> Reconnection attempt 1 of 3
> ssh_askpass: exec(/usr/libexec/openssh/ssh-askpass): No such file or directory^M
> Host key verification failed.^M
> Charmrun> Error 255 returned from remote shell (sardalya78:0)
> Charmrun> Reconnection attempt 2 of 3
> ssh_askpass: exec(/usr/libexec/openssh/ssh-askpass): No such file or directory^M
> Host key verification failed.^M
> ssh_askpass: exec(/usr/libexec/openssh/ssh-askpass): No such file or directory^M
> Host key verification failed.^M
> Charmrun> Error 255 returned from remote shell (sardalya79:1)
> Charmrun> Reconnection attempt 2 of 3
> Charmrun> Error 255 returned from remote shell (sardalya80:2)
> Charmrun> Reconnection attempt 2 of 3
> ssh_askpass: exec(/usr/libexec/openssh/ssh-askpass): No such file or directory^M
> ssh_askpass: exec(/usr/libexec/openssh/ssh-askpass): No such file or directory^M
> Host key verification failed.^M
> Host key verification failed.^M
> Charmrun> Error 255 returned from remote shell (sardalya81:3)
> Charmrun> Reconnection attempt 2 of 3
> Charmrun> Error 255 returned from remote shell (sardalya78:0)
> Charmrun> Reconnection attempt 3 of 3
> ssh_askpass: exec(/usr/libexec/openssh/ssh-askpass): No such file or directory^M
> Host key verification failed
> Charmrun> Error 255 returned from remote shell (sardalya81:3)
> Charmrun> Reconnection attempt 3 of 3
> Charmrun> Error 255 returned from remote shell (sardalya78:0)
> Charmrun> Too many reconnection attempts; bailing out
This archive was generated by hypermail 2.1.6 : Sun Dec 31 2017 - 23:21:18 CST