From: Robert Wohlhueter (bobwohlhueter_at_earthlink.net)
Date: Sun Feb 14 2010 - 13:13:16 CST

I'm trying to move from a dual-core Linux machine to a 32-node x 2 cpu
Mac PPC cluster (running namd2-2.7b2). The config files I'm using work
on the Linux computer. It would be too verbose to relate all the
permutations I've tried (none of which have been successful), so I cite
just the most recent failure:

Ultimately I submit the job to the cluster with the command: `qsub -cwd
-pe openmpi 8 runNAMD`, where 8 is the number of processors called for
in this particular run. The "runNAMD" script is:

*********************************************************************************************

#!/bin/csh

setenv NSLOTS 8

setenv MachineFile "/Volumes/RAID/common/hostfile"

setenv NAMDpath "/common/Applications/NAMD_2.7b2_MacOSX-PPC"

setenv PATH ${PATH}:$NAMDpath

setenv NAMDcmd "$NAMDpath/charmrun +p$NSLOTS ++nodelist ./mach_nodes $NAMDpath/namd2 2htq_box_test.config"

# setenv NAMDcmd "$NAMDpath/namd2 +p$NSLOTS 2htq_box_test.config" gives error, must use charmrun

# setenv NAMDcmd "$NAMDpath/namd2 2htq_box_test.config"

echo " "
echo "Running NAMD2 ( via $NAMDpath/charmrun ) ... "
echo " "
date

# $NAMDpath/charmrun $NAMDpath/namd2 -machinefile $MachineFile -np $NSLOTS \
# $NAMDpath/namd2 -machinefile $MachineFile -np $NSLOTS \

/common/ompi11/bin/mpiexec -machinefile $MachineFile -np $NSLOTS $NAMDcmd

echo " "
echo "Done with NAMD"
date
echo " "

**************************************************************************

An excerpt (several repetitions of such messages) from the
"runNAMD.e1095" redirected standard error file is:

Charmrun> Error 128 returned from rsh (node001.cluster.private:0)
/common/sge/util/arch: fork: Resource temporarily unavailable
/bin/sh: fork: Resource temporarily unavailable
/common/sge/util/arch: fork: Resource temporarily unavailable
bash: fork: Resource temporarily unavailable
Charmrun> Error 128 returned from rsh (node001.cluster.private:0)
ModuleCmd_Load.c(199):ERROR:105: Unable to locate a modulefile for 'defaults'
/bin/sh: fork: Resource temporarily unavailable
bash: fork: Resource temporarily unavailable
bash: line 1: =/common/sge/lib/darwin:$: No such file or directory
Received disconnect from 192.168.2.1: 2: fork failed: Resource temporarily unavailable
bash: fork: Resource temporarily unavailable
bash: fork: Resource temporarily unavailable
ModuleCmd_Load.c(199):ERROR:105: Unable to locate a modulefile for 'defaults'
bash: fork: Resource temporarily unavailable
/common/sge/util/arch: fork: Resource temporarily unavailable
bash: line 1: =/common/sge/lib/darwin:$: No such file or directory
bash: fork: Resource temporarily unavailable
bash: fork: Resource temporarily unavailable
bash: fork: Resource temporarily unavailable
/common/sge/util/arch: fork: Resource temporarily unavailable
bash: fork: Resource temporarily unavailable
/common/sge/util/arch: fork: Resource temporarily unavailable
ModuleCmd_Load.c(199):ERROR:105: Unable to locate a modulefile for 'defaults'
bash: fork: Resource temporarily unavailable
/bin/sh: fork: Resource temporarily unavailable

**********************************************************
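
One thing that may be worth checking, given the repeated fork failures: the launch line in the script wraps charmrun inside mpiexec, so mpiexec would start $NSLOTS copies of charmrun, and each copy is itself given +p$NSLOTS. The sketch below only does that arithmetic. It is written in plain sh rather than csh, the helper name total_procs is hypothetical, and it assumes each charmrun copy really does try to spawn its full +p count; it is not a diagnosis.

```sh
#!/bin/sh
# Hypothetical helper (not part of NAMD, Charm++ or Open MPI):
# number of namd2 processes requested if mpiexec starts $1 copies
# of charmrun and each copy is launched with +p$2.
total_procs() {
    copies=$1
    per_copy=$2
    echo $((copies * per_copy))
}

NSLOTS=8
# Assumption: mpiexec -np $NSLOTS runs $NSLOTS charmrun instances,
# and every instance asks for $NSLOTS worker processes.
echo "namd2 processes requested: $(total_procs "$NSLOTS" "$NSLOTS")"
```

With NSLOTS set to 8 as in the script, this reports 64 requested processes rather than 8, which might be related to the "Resource temporarily unavailable" messages.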

Looking at section 15.2 of the "NAMD Users Guide" and the
"CHARM++/Converse Installation and Usage", and trying several
suggestions in them, hasn't helped. I am concerned by a note on the
website (../namd/2.6/ug/node43.html), which states that "..a parallel
program depends on a platform-specific library such as MPI to launch.."
and "..you will likely need to recompile NAMD and its underlying Charm++
libraries to use these machines in parallel.."

Succinctly put: Is there a hint or "recipe" for running namd2 on such
a Mac cluster?

Thanks,

Bob Wohlhueter,
Georgia State University, Dept. of Chemistry