Error: transport retry exceeded error

From: Alexandre A. Vakhrouchev (makaveli.lcf_at_gmail.com)
Date: Thu May 15 2008 - 13:41:21 CDT

Hi all!

While simulating a 2000K-atom system, I got the following error message:

namd2: Rank 0:29: MPI_Iprobe: ibv_poll_cq(): bad status 12
namd2: Rank 0:28: MPI_Iprobe: ibv_poll_cq(): bad status 12
namd2: Rank 0:28: MPI_Iprobe: self node9.eth0.mvs50k.jscc.ru peer
node26.eth0.mvs50k.jscc.ru (rank: 115)
namd2: Rank 0:28: MPI_Iprobe: error message: transport retry exceeded error
namd2: Rank 0:29: MPI_Iprobe: self node9.eth0.mvs50k.jscc.ru peer
node26.eth0.mvs50k.jscc.ru (rank: 117)
namd2: Rank 0:29: MPI_Iprobe: error message: transport retry exceeded error
namd2: Rank 0:28: MPI_Iprobe: Internal MPI error
namd2: Rank 0:29: MPI_Iprobe: Internal MPI error
MPI Application rank 29 exited before MPI_Finalize() with status 16

NAMD runs for a different number of steps each time before hitting this
error and failing. Sometimes it hangs during the startup phase. Simulations
with fewer atoms work fine (I tried up to 200K atoms). The cluster consists
of dual quad-core Xeon nodes on InfiniBand. I built NAMD for Linux-amd64-MPI
according to the Wiki:
http://www.ks.uiuc.edu/Research/namd/wiki/index.cgi?NamdOnInfiniBand
Frankly speaking, the only thing I didn't use was "-thread pthreads"
for CHARMOPTS, because the linker exited with an error saying that no
pthreads library was found. Could that be the cause?
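
In case it helps with diagnosis, here is a minimal sketch of how the
failing node pairs could be tallied from output like the above, to see
whether the same peer keeps showing up. It is plain Python and not part of
NAMD or the MPI library. The sample text is copied from the error message
above; the function name count_peers and the regular expression are my own
assumptions about the log format.

```python
import re
from collections import Counter

# Sample copied from the error output above. The "self ... peer ..." lines
# were wrapped by the mail client, so the pattern below tolerates newlines.
LOG = """\
namd2: Rank 0:29: MPI_Iprobe: ibv_poll_cq(): bad status 12
namd2: Rank 0:28: MPI_Iprobe: ibv_poll_cq(): bad status 12
namd2: Rank 0:28: MPI_Iprobe: self node9.eth0.mvs50k.jscc.ru peer
node26.eth0.mvs50k.jscc.ru (rank: 115)
namd2: Rank 0:28: MPI_Iprobe: error message: transport retry exceeded error
namd2: Rank 0:29: MPI_Iprobe: self node9.eth0.mvs50k.jscc.ru peer
node26.eth0.mvs50k.jscc.ru (rank: 117)
namd2: Rank 0:29: MPI_Iprobe: error message: transport retry exceeded error
"""

# Assumed format: "self <host> peer <host> (rank: <n>)", possibly wrapped.
PAIR = re.compile(r"self\s+(\S+)\s+peer\s+(\S+)\s+\(rank:\s*(\d+)\)")


def count_peers(log_text):
    """Count how often each (self_host, peer_host) pair appears."""
    return Counter((m.group(1), m.group(2)) for m in PAIR.finditer(log_text))


pairs = count_peers(LOG)
for (src, dst), n in pairs.most_common():
    print(f"{src} -> {dst}: {n} failure(s)")
```

If one host dominates the tally across several failed runs, that would
point at a particular node or link rather than at the NAMD build.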

-- 
Best regards,
Dr. Alexander Vakhrushev
Institute of Applied Mechanics
Dep. of Mech. and Phys.-Chem.
of heterogeneous mediums
UB of Russian Academy of Sciences
34 T. Baramzinoy St.
Izhevsk, Russia 426067
