Re: NAMD slows at startup phase 1 smp problem

From: Alexander Tzanov (Alexander.Tzanov_at_csi.cuny.edu)
Date: Thu Jan 15 2015 - 10:05:16 CST

Hi Jim/Guys

Sorry to bother you with this problem. Recently I have been seeing an intermittent problem with NAMD 10.
Sometimes it gives the following error:

Processor 0 Exiting: Called CmiAbort ------------
Reason: FATAL ERROR: Unable to open binary file vararea.out.restart.coor: File exists
Charm++ fatal error:
FATAL ERROR: Unable to open binary file vararea.out.restart.coor: File exist

The permissions are correct, however.

I googled the error, but only two old messages from the old mailing list came up. Has anyone seen the same
problem? Or should I direct the question to PPL? I am running on 128 cores on a large IB cluster.

Thanks

Alex
Alexander Tzanov, PhD
CUNY-CSI

On Jan 14, 2015, at 6:16 PM, Jim Phillips <jim_at_ks.uiuc.edu> wrote:

Hi Ryan,

First, if at all possible, avoid MPI-smp in favor of ibverbs-smp, assuming you do have InfiniBand. Then "charmrun ++mpiexec" will use mpiexec to launch across nodes. This works with most MPI versions, and you can specify a runscript to fix the rest. If you don't have InfiniBand (or a Cray, or just maybe 10Gbit ethernet) then multi-node runs are going to be slow, period.
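
For readers following the launch advice above, here is a rough sketch of how such a command line might be put together. It is only an illustration under assumptions, not a command taken from this thread: the helper name build_charmrun_cmd, the config file name vararea.conf, and the PE counts are hypothetical, and the use of "++remote-shell" to pass the runscript and of "+p" / "++ppn" for the thread layout should be checked against the notes shipped with your NAMD build.

```python
import shlex


def build_charmrun_cmd(namd_binary, config_file, total_pes, pes_per_process,
                       runscript=None):
    """Assemble a charmrun command line for an ibverbs-smp build (sketch).

    All argument names are hypothetical; only the charmrun options
    themselves come from the Charm++/NAMD launcher.
    """
    if total_pes % pes_per_process != 0:
        raise ValueError("total_pes must be a multiple of pes_per_process")

    # "++mpiexec" asks charmrun to start the remote processes through mpiexec.
    cmd = ["charmrun", "++mpiexec"]

    # Assumption: a site-specific wrapper script is handed over with
    # "++remote-shell" when plain mpiexec does not work as-is.
    if runscript is not None:
        cmd += ["++remote-shell", runscript]

    # "+p" is the total number of worker threads, "++ppn" the number
    # of worker threads per process.
    cmd += ["+p%d" % total_pes, "++ppn", str(pes_per_process),
            namd_binary, config_file]
    return cmd


# Hypothetical layout: 8 nodes, 15 worker threads per node.
cmd = build_charmrun_cmd("namd2", "vararea.conf",
                         total_pes=120, pes_per_process=15)
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Leaving one core per node free for the communication thread is a common choice for smp builds, which is why the sketch uses 15 rather than 16 worker threads per process; treat those numbers as placeholders for your own cluster.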
