Re: Charm++ supports RoCE ?

From: Hyun (biophysics1_at_gmail.com)
Date: Mon Mar 20 2017 - 22:22:40 CDT

Dear NAMD users and developers

I want to add more information to my previous email about compiling the GPU
version of NAMD on the ACCRE cluster.

After building the Charm++/Converse library (InfiniBand version) on the
ACCRE cluster, I tested it but got the error messages shown below.
I am wondering whether this error can be fixed.

However, compiling the single-node, multicore version of NAMD went fine, and
it works well.

Could you give some comments?

Thanks

Hyun.

Build and test the Charm++/Converse library (InfiniBand version):

./build charm++ verbs-linux-x86_64 gcc smp --with-production

cd verbs-linux-x86_64-smp-gcc
make pgm
./charmrun ++remote-shell ssh +p4 ./pgm

************************************ error *************************************
Charmrun> scalable start enabled.
Charmrun> IBVERBS version of charmrun
Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
Charmrun> Error 255 returned from remote shell (localhost:0)
Charmrun> Reconnection attempt 1 of 3
Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
Charmrun> Error 255 returned from remote shell (localhost:0)
Charmrun> Reconnection attempt 2 of 3
Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
Charmrun> Error 255 returned from remote shell (localhost:0)
Charmrun> Reconnection attempt 3 of 3
Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
Charmrun> Error 255 returned from remote shell (localhost:0)
Charmrun> Too many reconnection attempts; bailing out
********************************************************************************
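
The "Permission denied" lines in the log above come from ssh itself, and 255 is the exit status ssh returns on its own errors, so the remote-shell login to localhost seems to fail before any Charm++ code runs. A minimal sketch for checking this outside charmrun is given below. It assumes OpenSSH is the ssh client; the function name check_ssh and the SSH_CMD override variable are hypothetical helpers introduced only for this illustration, not part of Charm++ or NAMD.

```shell
#!/bin/sh
# Sketch: check whether non-interactive ssh to a host works.
# charmrun with "++remote-shell ssh" starts processes through ssh and cannot
# answer a password prompt, so the login has to succeed without one.
# check_ssh and SSH_CMD are hypothetical names used only in this example.

check_ssh() {
    host="$1"
    # BatchMode=yes makes ssh fail instead of prompting for a password;
    # ConnectTimeout=5 keeps the check from hanging on an unreachable host.
    if ${SSH_CMD:-ssh} -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
        echo "ok: non-interactive ssh to $host works"
    else
        echo "fail: non-interactive ssh to $host is denied or unreachable"
        return 1
    fi
}
```

Running "check_ssh localhost" on the node used for the test would show whether the failure can be reproduced without charmrun. If it reports a failure, the cause is probably the ssh key or authentication setup for the account on that node rather than the verbs build itself, although this log alone cannot confirm that.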

On Thu, Mar 16, 2017 at 6:06 PM, Hyun <biophysics1_at_gmail.com> wrote:

> Dear NAMD users and developers
>
> I am trying to compile the GPU version of NAMD on the ACCRE cluster (
> http://www.accre.vanderbilt.edu/ ).
>
> The ACCRE cluster uses RoCE (RDMA over Converged Ethernet) as its network
> protocol.
>
> I have a question.
>
> Does Charm++ support RoCE?
>
> I compiled the GPU version of NAMD, but the test failed.
> So I am wondering whether Charm++ supports RoCE or not.
>
> Any comment will be appreciated.
>
> Thanks
>
> Hyun
>
