Re: NAMD jobs in SLURM environment, not entering queueing system

From: René Hafner TUK (hamburge_at_physik.uni-kl.de)
Date: Mon Jun 28 2021 - 04:03:46 CDT

Hi

     Did you actually use a GPU version of NAMD?

     You should see this in the logfile.
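
     A quick way to check (assuming the output-file naming used in your
script below):

         grep -i cuda step6.1_equilibration.out

     A CUDA-enabled build reports the GPU devices it binds to in the
logfile; a CPU-only build does not.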

     If you only rely on single-node GPU runs, the precompiled CUDA
binaries should be sufficient.

     And do add `+p${SLURM_NTASKS_PER_NODE} +idlepoll` to the namd exec
line below for faster execution.
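
     For example, the namd2 line inside the loop might become (a sketch
reusing the binary path from your script; note that
SLURM_NTASKS_PER_NODE is only set if you request tasks via
--ntasks-per-node in the #SBATCH header):

         /home2/Prathit/apps/NAMD_PACE_Source/Linux-x86_64-g++/namd2 +p${SLURM_NTASKS_PER_NODE} +idlepoll ${step}.inp > ${step}.out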

Kind regards

René

On 6/28/2021 10:54 AM, Prathit Chatterjee wrote:
> Dear Experts,
>
> This is regarding GPU job submission with NAMD, compiled specifically
> for PACE CG force field, with CHARMM-GUI, in SLURM environment.
>
> Kindly see my submit script below:
>
> #!/bin/csh
>
> #
>
> #SBATCH -J PCCG2000
>
> #SBATCH -N 1
>
> #SBATCH -n 1
>
> #SBATCH -p g3090 # Using a 3090 node
>
> #SBATCH --gres=gpu:1 # Number of GPUs (per node)
>
> #SBATCH -o output.log
>
> #SBATCH -e output.err
>
>
> # Generated by CHARMM-GUI (http://www.charmm-gui.org) v3.5
>
> #
>
> # The following shell script assumes your NAMD executable is namd2 and that
>
> # the NAMD inputs are located in the current directory.
>
> #
>
> # Only one processor is used below. To parallelize NAMD, use this scheme:
>
> # charmrun namd2 +p4 input_file.inp > output_file.out
>
> # where the "4" in "+p4" is replaced with the actual number of processors you
>
> # intend to use.
>
> module load compiler/gcc-7.5.0 cuda/11.2 mpi/openmpi-4.0.2-gcc-7
>
>
> echo"SLURM_NODELIST $SLURM_NODELIST"
>
> echo"NUMBER OF CORES $SLURM_NTASKS"
>
>
> set equi_prefix = step6.%d_equilibration
>
> set prod_prefix = step7.1_production
>
> set prod_step = step7
>
>
>
> # Running equilibration steps
>
> set cnt = 1
>
> set cntmax = 6
>
>
> while ( ${cnt} <= ${cntmax} )
>
> set step = `printf ${equi_prefix} ${cnt}`
>
> ##/home2/Prathit/apps/NAMD_PACE_Source/Linux-x86_64-g++/charmrun /home2/Prathit/apps/NAMD_PACE_Source/Linux-x86_64-g++/namd2 ${step}.inp > ${step}.out
>
> /home2/Prathit/apps/NAMD_PACE_Source/Linux-x86_64-g++/namd2 ${step}.inp > ${step}.out
>
>
> @ cnt += 1
>
> end
>
>
> ================
>
> While the jobs get submitted, they do not seem to enter the queueing
> system: the job PIDs are not visible with the "*nvidia-smi*" command,
> but they do show up with the "*top*" command inside the GPU node.
>
> Any suggestions for rectifying this discrepancy would be greatly
> appreciated.
>
> Thank you and Regards,
> Prathit
>
>

-- 
Dipl.-Phys. René Hafner
TU Kaiserslautern
Germany

This archive was generated by hypermail 2.1.6 : Fri Dec 31 2021 - 23:17:11 CST