Job submission error for NAMD-2.13 (version netlrts with cuda-10.0) using Torque job scheduler

From: Daipayan Sarkar (sdaipayan_at_gmail.com)
Date: Tue Nov 19 2019 - 14:49:51 CST

Hello Users,

I have compiled the netlrts version of NAMD-2.13 with cuda-10.0. The job
scheduler is Torque, and I have been having trouble submitting a job. My
NAMD job submission script (submit.sh) is at the end of this message. I
submit it with the Torque job scheduler:

qsub -l nodes=1:ppn=20:gpus=1,naccesspolicy=singleuser submit.sh

Using "$charm ++local +p20 $namd +idlepoll ++ppn 19 equil.0.inp >
equil.0.log" gives the error:

---
++ppn not recognized
---
Removing ++ppn gives the following error:
---
Reason: FATAL ERROR: Number of devices (1) is not a multiple of number of
processes (20).  Sharing devices between processes is inefficient.  Specify
+ignoresharing (each process uses all visible devices) if not all devices
are visible to each process, otherwise adjust number of processes to evenly
divide number of devices, specify subset of devices with +devices argument
(e.g., +devices 0,2), or multiply list shared devices (e.g., +devices
0,1,2,0).
---
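
The error text above names two possible workarounds: +ignoresharing, or
listing the shared device once per process with +devices. The following is
only a dry-run sketch of what those launch lines might look like, assuming
the same paths as submit.sh and a non-SMP binary (which the "++ppn not
recognized" message seems to suggest). NPROCS and the variable names are
illustrative assumptions, the flag placement after $namd follows the first
command above, and nothing here has been verified against this build. It
prints the commands instead of starting NAMD.

```shell
#!/usr/bin/env bash
# Dry run: build and print candidate launch lines; NAMD is never started.
# Paths mirror submit.sh. NPROCS and the variable names are assumptions.
namd="$HOME/Software/NAMD_2.13_Source_netlrts_cuda/Linux-x86_64-g++/namd2"
charm="$HOME/Software/NAMD_2.13_Source_netlrts_cuda/Linux-x86_64-g++/charmrun"
NPROCS=20   # matches ppn=20 in the qsub request

# Option 1: every process uses the single visible GPU (+ignoresharing,
# as suggested by the error message).
cmd_ignoresharing="$charm ++local +p$NPROCS $namd +idlepoll +ignoresharing equil.0.inp"

# Option 2: list device 0 once per process ("multiply list shared devices").
devices=$(printf '0,%.0s' $(seq 1 "$NPROCS"))   # "0," repeated NPROCS times
devices=${devices%,}                            # drop the trailing comma
cmd_devices="$charm ++local +p$NPROCS $namd +idlepoll +devices $devices equil.0.inp"

echo "$cmd_ignoresharing > equil.0.log"
echo "$cmd_devices > equil.0.log"
```

Either printed line could be pasted into submit.sh in place of the current
launch line. The error message itself warns that sharing one device between
many processes is inefficient, so these may run without being the best
choice for a single-GPU node.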
Please advise on how to proceed.
==== submit.sh ======
#!/usr/local/bin/bash
namd=$HOME/Software/NAMD_2.13_Source_netlrts_cuda/Linux-x86_64-g++/namd2
charm=$HOME/Software/NAMD_2.13_Source_netlrts_cuda/Linux-x86_64-g++/charmrun
#$charm ++local +p20 $namd +idlepoll ++ppn 19 equil.0.inp > equil.0.log
$charm ++local +p20 +idlepoll $namd equil.0.inp > equil.0.log
===================
Many thanks,
Daipayan
