Re: RE: FATAL ERROR with "File exists"

From: Seibold, Steve Allan (stevesei_at_ku.edu)
Date: Fri Jan 18 2019 - 09:37:24 CST

I checked my version also. I am running Linux-x86_64 (64-bit Intel/AMD with ethernet). Like Yan, I am confused about which version I should download for my purposes and which launcher I should use (charmrun or mpirun). From what I understood, my administrator said he believed that with "openmpi" I did not need to "list nodes" for Slurm or use charmrun.

I need a version of NAMD that utilizes ethernet (as opposed to InfiniBand) for running across machines on CPUs and NOT on GPUs.

I did try to run "mpirun -np 32 /directory/namd2 file.cong" but this gave the same error.
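For what it's worth, my understanding from the release notes is that the launcher depends on the build: a netlrts/ethernet download is started through charmrun with a nodelist file, while mpirun only applies to a namd2 binary that was actually compiled against MPI. A rough sketch of the charmrun form, where node1/node2, the core count, and the file names are only placeholders for my setup:

  # nodelist file (e.g. ./nodelist) listing the hosts to run on
  group main
  host node1
  host node2

  # launch the netlrts namd2 on 32 processes spread across those hosts
  ./charmrun ./namd2 +p32 ++nodelist ./nodelist file.conf > run.log

A multicore build, by contrast, runs on a single machine with neither charmrun nor mpirun.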

Also, you (Joshua) are correct: I am not seeing the same lines you describe in my NAMD output. I am seeing the following:

"Based on Charm++/Converse 60800 for net-linux-x86_64-iccstatic

Info: 1 NAMD 2.13 for Linux-x86_64"

While looking for these lines, I did see one that says "net-* deprecated (Charm >= 6.8.0), please use netlrts".

Is this latter version the one to correct my problem?

Steve

________________________________
From: Zhang Yan <yanzhang_at_moon.ibp.ac.cn>
Sent: Thursday, January 17, 2019 9:16 PM
To: namd-l_at_ks.uiuc.edu; Vermaas, Joshua
Cc: Seibold, Steve Allan
Subject: Re: namd-l: RE: FATAL ERROR with "File exists"

I checked the version I was using; it was NAMD_2.10_Linux-x86_64-multicore. I have now changed it to NAMD_2.13_Linux-x86_64-netlrts, but the same problem occurs. Would you please tell me which is the right MPI version?
On the other hand, I'm not clear on the difference between charmrun and mpirun, or when I need to use charmrun and under what conditions I need to use mpirun. Sorry about the naive question. I hope I can solve my problem with your help and explanation.

Regards,
Yan

On Jan 18, 2019, at 2:36 AM, Vermaas, Joshua <Joshua.Vermaas_at_nrel.gov> wrote:

Whenever I see MPI being invoked and things taking longer than they should, my first question is always: what version of NAMD are you running? See my response below. It's also considered bad form to hijack threads. Electrons are free(ish)! Please make your own thread.

-Josh

On 2019-01-16 20:10:47-07:00 owner-namd-l_at_ks.uiuc.edu wrote:

Hi,
I'm using MDFF to do flexible fitting of an X-ray structure into my EM map. The sample has helical symmetry, so I set up the NAMD configuration with helical symmetry restraints. But after I submit my job, the computation appears to stay at the "Info: CREATING 48136 COMPUTE OBJECTS" stage for several days with no further output. Why can't the processors move on to the next computation stage; is my system too big? Any suggestions are appreciated.
Regards,
Yan
The following are the last several lines of the log file after I submit my NAMD job; it always stays at this stage and no more lines are output:
Info: Startup phase 5 took 4.29153e-05 s, 289.473 MB of memory in use
Info: PATCH GRID IS 28 BY 25 BY 4
Info: PATCH GRID IS 1-AWAY BY 1-AWAY BY 1-AWAY
Info: REMOVING COM VELOCITY 0.00757615 0.0131683 0.0256697
Info: LARGEST PATCH (10) HAS 1641 ATOMS
Info: TORUS A SIZE 1 USING 0
Info: TORUS B SIZE 1 USING 0
Info: TORUS C SIZE 1 USING 0
Info: TORUS MINIMAL MESH SIZE IS 1 BY 1 BY 1
Info: Placed 100% of base nodes on same physical node as patch
Info: Startup phase 6 took 0.162006 s, 323.578 MB of memory in use
Info: Startup phase 7 took 0.000121117 s, 323.578 MB of memory in use
Info: Startup phase 8 took 7.70092e-05 s, 323.578 MB of memory in use
LDB: Central LB being created...
Info: Startup phase 9 took 0.00112295 s, 323.578 MB of memory in use
Info: CREATING 48136 COMPUTE OBJECTS
Info: PATCH GRID IS 28 BY 25 BY 4
Info: PATCH GRID IS 1-AWAY BY 1-AWAY BY 1-AWAY
Info: REMOVING COM VELOCITY 0.00757615 0.0131683 0.0256697
Info: LARGEST PATCH (10) HAS 1641 ATOMS
Info: TORUS A SIZE 1 USING 0
Info: TORUS B SIZE 1 USING 0
Info: TORUS C SIZE 1 USING 0
Info: TORUS MINIMAL MESH SIZE IS 1 BY 1 BY 1
Info: Placed 100% of base nodes on same physical node as patch
Info: Startup phase 6 took 0.162988 s, 323.578 MB of memory in use
Info: Startup phase 7 took 0.000114918 s, 323.578 MB of memory in use
Info: Startup phase 8 took 8.60691e-05 s, 323.578 MB of memory in use
LDB: Central LB being created...
Info: Startup phase 9 took 0.00116205 s, 323.578 MB of memory in use
Info: CREATING 48136 COMPUTE OBJECTS
 The following is my job script:
#!/bin/bash
### Inherit all current environment variables
#PBS -V
### Job name
#PBS -N 43nm
### Keep Output and Error
#PBS -k eo
### Queue name
#PBS -q quick
### Specify the number of nodes and threads per node (ppn) for your job.
#PBS -l nodes=14:ppn=20
#################################
### Switch to the working directory;
cd $PBS_O_WORKDIR
#Environment
source ~/.bashrc
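### Total number of processes = nodes x ppn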
NP=$((14*20))
###Run:
echo "starting mdff..."
#charmrun namd2 +p $NP 9-step1.namd > test.log
mpirun --bynode -np $NP namd2 9-step1.namd > test.log
echo "done"
 The following is my namd2 configuration file:
### Docking -- Step 1
set PSFFILE 9_autopsf.psf
set PDBFILE 9_autopsf.pdb
set GRIDPDB 9-grid.pdb
set GBISON 0
set DIEL 80
set SCALING_1_4 1.0
set ITEMP 300
set FTEMP 300
set GRIDFILE 43nm-grid.dx
set GSCALE 6
set EXTRAB {9-extrabonds.txt 9-extrabonds-cispeptide.txt 9-extrabonds-chirality.txt}
set CONSPDB 0
set FIXPDB 0
set GRIDON 1
set OUTPUTNAME 9-step1
set TS 100000
set MS 2000
set MARGIN 0
####################################
symmetryRestraints on
symmetryfile helix-symmetry.pdb
symmetryk 20
symmetryMatrixFile matrix.txt
symmetryfirststep 2001
symmetryfirstfullstep 102000
####################################
structure $PSFFILE
coordinates $PDBFILE
paraTypeCharmm on
parameters par_all36_prot.prm
if {[info exists INPUTNAME]} {
  BinVelocities $INPUTNAME.restart.vel
  BinCoordinates $INPUTNAME.restart.coor
} else {
  temperature $ITEMP
}

Yan Zhang,
Associate Professor,
Institute of Biophysics,
Chinese Academy of Sciences

