Re: charmrun constantly hanging

From: jani vinod (genomejani_at_gmail.com)
Date: Sat Nov 26 2011 - 01:06:22 CST

Hello,
I am using ssh and can log in to all the nodes without a password.
I have tried charmrun in verbose mode and the other troubleshooting steps
given on the NAMD wiki.
By my comment on that link I meant that the options and suggestions provided
there didn't work for me.
I ran the Charm++ tests and they run fine.
We have an InfiniBand interconnect on our cluster.
I tried the precompiled ibverbs binary and it runs fine, but when I
compiled the source code using the net-linux option, it does not work on more
than one node.
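
For anyone retracing the checks described above (passwordless ssh to every node, then charmrun in verbose mode with an explicit nodelist), here is a minimal Python sketch. It is an illustration, not my exact setup: the host names, the file name `nodelist`, the binary path `./namd2` and the config path `apoa1/apoa1.namd` are hypothetical placeholders, and the charmrun options used (`++verbose`, `++remote-shell`, `++nodelist`, `+p`) should be verified against the documentation for the Charm++ version you built.

```python
import shlex
import subprocess


def format_nodelist(hosts):
    """Return nodelist file text: a 'group main' line, then one 'host' line per node."""
    lines = ["group main"]
    lines += [f"host {name}" for name in hosts]
    return "\n".join(lines) + "\n"


def can_ssh_without_password(host, timeout=10):
    """Return True if 'ssh host true' succeeds with no interactive prompt.

    BatchMode=yes makes ssh fail instead of asking for a password, so a
    False result points at key setup or connectivity, not at charmrun.
    """
    cmd = ["ssh", "-o", "BatchMode=yes",
           "-o", f"ConnectTimeout={timeout}", host, "true"]
    try:
        result = subprocess.run(cmd, capture_output=True, timeout=timeout + 5)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def build_charmrun_command(nodelist_path, num_procs, namd_binary, config_file):
    """Assemble a verbose charmrun invocation as an argument list."""
    return ["./charmrun", "++verbose",
            "++remote-shell", "ssh",
            "++nodelist", nodelist_path,
            f"+p{num_procs}", namd_binary, config_file]


if __name__ == "__main__":
    hosts = ["node01", "node02"]  # hypothetical host names
    print(format_nodelist(hosts), end="")
    # Hypothetical paths; adjust to the local build and benchmark directory.
    cmd = build_charmrun_command("nodelist", 16, "./namd2", "apoa1/apoa1.namd")
    print(shlex.join(cmd))
```

Calling `can_ssh_without_password(host)` for each entry before launching separates ssh problems from charmrun problems; the main section only prints the nodelist text and the command line, so it can be run safely on any machine.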

Thanks
vinod

On Fri, Nov 25, 2011 at 7:57 PM, Axel Kohlmeyer <akohlmey_at_gmail.com> wrote:

> On Fri, Nov 25, 2011 at 1:59 AM, jani vinod <genomejani_at_gmail.com> wrote:
> > Dear All,
> > I am trying to run the apoa1 benchmark using NAMD 2.8 on an HP cluster
> > with Intel processors.
> > I have compiled charmrun using Linux-x86_64-g++.
> > It works fine on 1 node, but when I try to run on more than one node
> > it hangs with the following message in the log file:
> > Charm++> scheduler running in netpoll mode.
>
> have you tried running charmrun in verbose mode?
> can you connect to the other nodes without a password?
> do you have to use rsh or ssh?
> can you run any of the charm++ tests?
> have you tried a precompiled namd package?
> what kind of interconnect does your cluster have?
>
> > I also saw the following post, but it didn't seem to be helpful:
> > http://lists.cs.uiuc.edu/pipermail/charm/2011-April/000587.html
>
> what do you mean by "seems not helpful"? did you follow
> the suggestion and examples as given there?
>
> as you can see from the questions above, there is a *ton*
> of things that you can do, that are described in the
> documentation and that common sense will tell you.
>
> axel.
>
> > Thanks
> > vinod
> >
>
>
>
> --
> Dr. Axel Kohlmeyer
> akohlmey_at_gmail.com http://goo.gl/1wk0
>
> College of Science and Technology
> Temple University, Philadelphia PA, USA.
>
