Re: running precompiled NAMD with multiple processors

From: Axel Kohlmeyer (akohlmey_at_cmm.chem.upenn.edu)
Date: Tue Aug 26 2008 - 09:41:49 CDT

On Tue, 26 Aug 2008, Neelanjana Sengupta wrote:

NS> Dear NAMD community,

dear neelanjana,

NS> I have installed a precompiled version of NAMD (Linux i-686 version) into
NS> our 32 node Intel cluster. The head node is type 'DL 380 GS', and the

this machine description is not very helpful. it doesn't say anything
about the cpu or the hardware in it. model numbers are like house
numbers: they don't say anything about whether there is a palace or
a rotten shack at that location either. ;)

NS> computational nodes are type 'BL 460 C'. The cluster has Intel MPI. We use

please note that if you are using the precompiled NAMD binary,
you are _not_ using MPI. the intel MPI installed on the cluster
plays no role here.

NS> PBS for submitting jobs, and this is the submit file:
NS>
NS> ****************************
[...]

NS> These are the errors I am getting:
NS>
NS> connect to address 127.0.0.1: Connection refused
NS> connect to address 127.0.0.1: Connection refused
NS> trying normal rsh (/usr/bin/rsh)

this is a charm++ error, not a NAMD error. it comes from
charmrun trying to set up its communication links via
(kerberized) rsh, which is either not configured or blocked by
a firewall on the machine. have a look at the NAMD release notes
and search for CONV_RSH to learn how to switch this to ssh, or
to other mechanisms for starting and running parallel NAMD over
TCP/IP networks with charm++.
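
as a rough sketch of what that can look like inside a PBS job script
(assumptions, not taken from your setup: password-less ssh works between
the nodes, charmrun and namd2 sit in the job's working directory, and the
names namd.nodelist and run.namd are placeholders):

```shell
#!/bin/sh
# sketch only: paths and file names below are placeholders.

# tell charmrun to start remote processes via ssh instead of rsh.
export CONV_RSH=ssh

# convert a PBS-style node file (one hostname per line) into a
# charmrun nodelist: a "group main" header followed by "host" lines.
make_nodelist() {
    echo "group main"
    while read -r node; do
        if [ -n "$node" ]; then
            echo "host $node"
        fi
    done < "$1"
}

# PBS sets PBS_NODEFILE inside a running job; skip the launch otherwise.
if [ -n "${PBS_NODEFILE:-}" ]; then
    make_nodelist "$PBS_NODEFILE" > namd.nodelist
    nprocs=$(grep -c . "$PBS_NODEFILE")   # one process per listed slot
    ./charmrun ++nodelist namd.nodelist +p"$nprocs" ./namd2 run.namd
fi
```

the nodelist is generated from $PBS_NODEFILE so that charmrun contacts the
nodes the batch system actually assigned, rather than falling back to
localhost. the release notes remain the authoritative reference for the
exact options.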

NS> connect to address 127.0.0.1: Connection refused
NS> connect to address 127.0.0.1: Connection refused
NS> connect to address 127.0.0.1: Connection refused
NS> trying normal rsh (/usr/bin/rsh)
NS> connect to address 127.0.0.1: Connection refused
NS> trying normal rsh (/usr/bin/rsh)
NS> connect to address 127.0.0.1: Connection refused
NS> connect to address 127.0.0.1: Connection refused
NS> trying normal rsh (/usr/bin/rsh)
NS> localhost.localdomain: Connection refused
NS> localhost.localdomain: Connection refused
NS> localhost.localdomain: Connection refused
NS> localhost.localdomain: Connection refused
NS> Charmrun> Error 1 returned from rsh (localhost:0)
NS> ****************************
NS>
NS> I was wondering if anybody has faced something like this, and would like to
NS> share their experience in solving the issue. Please note that I have been

this comes up regularly...

NS> able to run NAMD with a single processor on the command line (hence the
NS> precompiled version should be compatible with our cluster).

cheers,
   axel.

NS>
NS> Thanks,
NS> Neelanjana
NS>
NS> ~~~~~~~~~~~~~~~~~~~~
NS> Dr. Neelanjana Sengupta
NS> Physical and Materials Chemistry Division
NS> National Chemical Laboratory
NS> Dr. Homi Bhaba Road
NS> Pune 411008, India
NS> Phone: +91-20-2590 2087
NS> ~~~~~~~~~~~~~~~~~~~~
NS>

-- 
=======================================================================
Axel Kohlmeyer   akohlmey_at_cmm.chem.upenn.edu   http://www.cmm.upenn.edu
   Center for Molecular Modeling   --   University of Pennsylvania
Department of Chemistry, 231 S.34th Street, Philadelphia, PA 19104-6323
tel: 1-215-898-1582,  fax: 1-215-573-6233,  office-tel: 1-215-898-5425
=======================================================================
If you make something idiot-proof, the universe creates a better idiot.
