Re: Crashing ibverbs binary of NAMD_2.7b2

From: Philipp M. (philipp.coloc_at_gmx.de)
Date: Thu Feb 11 2010 - 05:30:32 CST

Moreover, if I use 'export CONV_RSH=ssh', charmrun apparently does use ssh. (At first I did not have the RSA key fingerprint for the machine, so ssh asked me to confirm it, which I did; ssh is configured not to use passwords.) However, I then get the same error message as in my first mail.

>I get:
>
>Charmrun does not recognize the flag '-rsh=ssh'.
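
For reference, the environment-variable approach mentioned above (as opposed to the unrecognized '-rsh=ssh' flag) can be sketched roughly as follows. This is only an illustration, not the exact command line used in this thread: the install directory is taken from the verbose log quoted below, while "+p2" (chosen to match the two clients in that log) and the config file name "run.conf" are assumptions. The command is printed rather than executed so it can be checked first.

```shell
# Tell charmrun to use ssh instead of rsh via the environment
# (this is the setting described in the message above).
export CONV_RSH=ssh

# Install directory as it appears in the verbose log of this thread.
NAMD_DIR=/usr/local/testit/NAMD_2.7b2_Linux-x86_64-ibverbs

# Assumed launch line: "+p2" and "run.conf" are hypothetical placeholders,
# not values confirmed anywhere in the thread.
CMD="$NAMD_DIR/charmrun ++verbose ++nodelist ./nodelist +p2 $NAMD_DIR/namd2 run.conf"

# Dry run: show what would be executed, including the environment setting.
echo "CONV_RSH=$CONV_RSH $CMD"
```

With CONV_RSH set this way, the "Starting rsh ..." lines in the ++verbose output should show ssh being started instead, which gives a quick check that the setting was picked up.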

-------- Original Message --------
> Date: Wed, 10 Feb 2010 17:44:24 -0600
> From: snoze pa <snoze.pa_at_gmail.com>
> To: namd-l_at_ks.uiuc.edu
> Subject: Re: namd-l: Crashing ibverbs binary of NAMD_2.7b2

> Add one more option to the charmrun line: -rsh=ssh
>
> On Wed, Feb 10, 2010 at 10:58 AM, Philipp M. <philipp.coloc_at_gmx.de> wrote:
> > Hi,
> >
> > besides trying to compile the source code (please reply to my previous
> > post), I am also trying to run the ibverbs binary.
> > This is the ++verbose output:
> >
> > Charmrun> charmrun started...
> > Charmrun> using ./nodelist as nodesfile
> > Charmrun> adding client 0: "master0", IP:127.0.0.1
> > Charmrun> adding client 1: "master0", IP:127.0.0.1
> > Charmrun> Charmrun = master0, port = 43809
> > Charmrun> IBVERBS version of charmrun
> > Charmrun> Sending "0 master0 43809 32619 0" to client 0.
> > Charmrun> find the node program "/usr/local/testit/NAMD_2.7b2_Linux-x86_64-ibverbs/namd2" at "/data/nvt_run8a" for 0.
> > Charmrun> Starting rsh master0 -l user /bin/sh -f
> > Charmrun> remote shell (master0:0) started
> > Charmrun> Sending "1 master0 43809 32619 0" to client 1.
> > Charmrun> find the node program "/usr/local/testit/NAMD_2.7b2_Linux-x86_64-ibverbs/namd2" at "/data/nvt_run8a" for 1.
> > Charmrun> Starting rsh master0 -l user /bin/sh -f
> > Charmrun> remote shell (master0:1) started
> > Charmrun> node programs all started
> > Charmrun remote shell(master0.1)> remote responding...
> > Charmrun remote shell(master0.0)> remote responding...
> > Charmrun remote shell(master0.1)> starting node-program...
> > Charmrun remote shell(master0.1)> rsh phase successful.
> > Charmrun remote shell(master0.0)> starting node-program...
> > Charmrun remote shell(master0.0)> rsh phase successful.
> > Charmrun> Waiting for 0-th client to connect.
> > Charmrun> error 93620 attaching to node:
> > Socket closed before recv.
> >
> > In the charm++ FAQ you find:
> > "typically means a segmentation fault"
> >
> > So is this really a bug? Let me repeat that our mpi-compiled version of
> > NAMD_2.7b1 works fine with our infiniband implementation.
> >
> > Best,
> > Philipp


This archive was generated by hypermail 2.1.6 : Wed Feb 29 2012 - 05:22:43 CST