RE: MPIRUN SLURM SCRIPT

From: Bennion, Brian (bennion1_at_llnl.gov)
Date: Tue May 09 2017 - 13:07:08 CDT

Ntasks here is the number of MPI tasks per node (the --ntasks-per-node setting in the script below). If OMP_NUM_THREADS=1, then you have X MPI tasks per node with 1 thread per task.
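
For illustration, a minimal sketch of how the two settings combine, assuming the 24-core nodes used in the example script below (the file names are placeholders):

#SBATCH --nodes=4
#SBATCH --ntasks-per-node=24   # 24 MPI tasks on each node
export OMP_NUM_THREADS=1       # 1 thread per task, so 24 x 1 = 24 cores used per node
srun -n 96 namd2 input.conf > output.log   # 4 nodes x 24 tasks/node = 96 tasks in total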

Brian

From: owner-namd-l_at_ks.uiuc.edu On Behalf Of Zeki Zeybek
Sent: Tuesday, May 09, 2017 04:02
To: Susmita Ghosh <g.susmita6_at_gmail.com>; namd-l_at_ks.uiuc.edu
Subject: Re: namd-l: MPIRUN SLURM SCRIPT

What does ntasks stand for?

________________________________
From: Susmita Ghosh <g.susmita6_at_gmail.com>
Sent: 09 May 2017 13:39:23
To: namd-l_at_ks.uiuc.edu; Zeki Zeybek
Subject: Re: namd-l: MPIRUN SLURM SCRIPT

Dear Zeki,
If you are running a single job on multiple nodes, then I think you should use "export OMP_NUM_THREADS=1". I have given an example in the following SBATCH script:

#!/bin/sh
#SBATCH -A Name
#SBATCH -N 4
#SBATCH --ntasks-per-node=24
#SBATCH --error=job.%J.err
#SBATCH --output=job.%J.out
#SBATCH --time=1-02:00:00
module load slurm
### To launch an MPI job through srun, the user has to export the PMI library file below ###
export I_MPI_PMI_LIBRARY=/cm/shared/apps/slurm/15.08.13/lib64/libpmi.so
export LD_LIBRARY_PATH=/home/cdac/jpeg/lib:$LD_LIBRARY_PATH
#export PATH=/home/cdac/lammps_10oct2014/bin:$PATH
export OMP_NUM_THREADS=1
time srun -n 96 namd2 eq_NVT_run1.conf > eq_NVT_run1.log

Susmita Ghosh,
Research Scholar,
Department of Physics,
IIT Guwahati, India.

On Tue, May 9, 2017 at 3:38 PM, Zeki Zeybek <zeki.zeybek_at_bilgiedu.net> wrote:

Hi!

I am trying to come up with a SLURM script file for my simulation, but I have failed miserably. The point is that on the university supercomputer each node has 20 cores. For my single job, let's say aaaa.conf, I want to use 80 cores. However, to allocate that many cores I need to use 4 nodes (4*20=80), and SLURM gives an error if I try to run one task on multiple nodes. How can I overcome this situation?

#!/bin/bash
#SBATCH --clusters=AA
#SBATCH --account=AA
#SBATCH --partition=AA
#SBATCH --job-name=AA
#SBATCH --nodes=4
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=80
#SBATCH --time=120:00:00

source /truba/sw/centos6.4/comp/intel/bin/compilervars.sh intel6i4
source /truba/sw/centos6.4/lib/impi/4.1.1.036/bin64/mpivars.sh
module load centos6.4/app/namd/2.9-multicore
module load centos6.4/lib/impi/4.1.1

export OMP_NUM_THREADS=20
echo "SLURM_NODELIST $SLURM_NODELIST"
echo "NUMBER OF CORES $SLURM_NTASKS"

# For a single node: $NAMD_DIR/namd2 +p$OMP_NUM_THREADS namd_input.conf > namd_multinode_output.log

# For multiple nodes: mpirun $NAMD_DIR/namd2 +p$OMP_NUM_THREADS namd_input.conf > namd_multinode_output.log

The above script gives an error saying that --ntasks=1 is not valid. However, if I set --ntasks=4 and --cpus-per-task=20 it works, but it does not enhance the run speed. (Note: each user can use at most 80 cores on the supercomputer.)
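
Following the replies above, a minimal sketch of a resource request that matches this case (4 nodes x 20 cores = 80 MPI tasks); the cluster/account/partition, source, and module lines from the script above would stay as they are. Note that spanning nodes this way assumes an MPI (or charmrun-launched) build of namd2, since a multicore build can only use the cores of a single node:

#SBATCH --nodes=4
#SBATCH --ntasks-per-node=20   # 20 MPI tasks per node, one per core
#SBATCH --cpus-per-task=1
export OMP_NUM_THREADS=1       # 1 thread per task
mpirun -np 80 namd2 aaaa.conf > namd_multinode_output.log   # 4 x 20 = 80 tasks in total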

This archive was generated by hypermail 2.1.6 : Mon Dec 31 2018 - 23:20:17 CST