From: Norman Geist (norman.geist_at_uni-greifswald.de)
Date: Tue Nov 27 2012 - 02:09:01 CST
This problem has been posted a lot. The error you see is due to an
incompatibility between the precompiled ibverbs binary and your InfiniBand
installation. There are two ways to solve this:
1. Use a non-ibverbs binary with IPoIB.
2. Compile NAMD with ibverbs yourself.
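
For option 1, charmrun has to be given the hostnames that resolve to the
IPoIB interfaces rather than the Ethernet ones. A minimal sketch of building
such a nodelist from the PBS node file follows. The function name
make_nodelist and the "-ib" hostname suffix are assumptions for illustration
only; use whatever names your site assigns to the IPoIB addresses.

```shell
#!/bin/sh
# Sketch: turn a PBS-style node file (one hostname per line, usually
# repeated once per core) into a charmrun nodelist on stdout.
# make_nodelist is a hypothetical helper, not part of NAMD or Charm++.
# $1 = path to the node file
# $2 = optional suffix appended to each hostname. "-ib" below is an
#      ASSUMPTION about how the IPoIB hostnames are named at your site.
make_nodelist() {
    nodefile=$1
    suffix=${2-}
    # charmrun nodelists start with a group line, then one host per line.
    echo "group main"
    while read -r host; do
        # Skip blank lines in the node file.
        if [ -n "$host" ]; then
            echo "host ${host}${suffix}"
        fi
    done < "$nodefile"
}
```

Inside a PBS job this could be called as
make_nodelist "$PBS_NODEFILE" -ib > nodelist
and the resulting file passed to charmrun with ++nodelist.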
From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] On Behalf
Of Thomas Evangelidis
Sent: Monday, 26 November 2012 14:57
Cc: Norman Geist
Subject: namd-l: how to run NAMD-CUDA on multiple nodes
Although I can run the ibverbs binary with CUDA on a single node, on
multiple nodes I get:
Charmrun> error 0 attaching to node:
Timeout waiting for node-program to connect
Charmrun> IBVERBS version of charmrun
I use this command line in my PBS script for the ibverbs binary with CUDA:
$NAMD_BIN/charmrun ++runscript ./runscript.csh ++verbose ++remote-shell ssh
++nodelist $nodefile +p24 $NAMD_BIN/namd2 +setcpuaffinity +idlepoll
runscript.csh contents are:
setenv LD_LIBRARY_PATH "$NAMD_BIN:$LD_LIBRARY_PATH"
Is this the right way to run NAMD-ibverbs-CUDA on multiple nodes? If not,
could you please give me the right command line?
This archive was generated by hypermail 2.1.6 : Tue Dec 31 2013 - 23:22:47 CST