Re: Error while simulating on NAMD

From: René Hafner TUK (hamburge_at_physik.uni-kl.de)
Date: Mon Mar 21 2022 - 17:28:55 CDT

Hi Anirvinya,

Your Slurm error and NAMD output file tell you everything: "no
CUDA-capable device is detected".

Your SLURM_JOB_GPUS environment variable is also empty, so no GPU is
visible to the job.

How do you specifically request a GPU for the job?
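
One way to check this from inside the batch job is sketched below. It is
only a minimal sketch under a few assumptions: the '--gres=gpu:1' line is
the typical request quoted further down in this thread, while the job
name, the time limit and the helper name report_gpu_allocation are
hypothetical and not taken from your script. Your cluster may also need a
partition or a different GPU request syntax.

```shell
#!/bin/bash
#SBATCH --job-name=namd-gpu-check   # hypothetical job name
#SBATCH --gres=gpu:1                # typical GPU request; site syntax may differ
#SBATCH --time=00:05:00             # hypothetical time limit

# Print the GPU-related variables the job sees.
# Returns 1 if neither variable names a device, 0 otherwise.
report_gpu_allocation() {
    echo "SLURM_JOB_GPUS=${SLURM_JOB_GPUS:-<empty>}"
    echo "CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES:-<empty>}"
    if [ -z "${SLURM_JOB_GPUS:-}" ] && [ -z "${CUDA_VISIBLE_DEVICES:-}" ]; then
        echo "No GPU assigned to this job" >&2
        return 1
    fi
    # If the NVIDIA tools are installed, list the devices the driver can see.
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi -L
    fi
    return 0
}

# Run the check first; only start NAMD once it reports a device.
report_gpu_allocation || echo "Fix the GPU request before launching NAMD" >&2
```

If both variables come back empty in the batch job but not in your
interactive session, the difference lies in how the two jobs request
resources, not in NAMD or in the size of the system.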

Kind regards

René

On 3/21/2022 2:53 PM, Anirvinya Gururajan wrote:
> Hey Josh!
>
> Thanks for the reply. I do ask for GPUs in my batch script (PFA).
> Error dumped by slurm and the NAMD stdout output is attached in the
> previous message.
>
> Regards,
> Anirvinya G
> CCNSB, IIITH
> ------------------------------------------------------------------------
> *From:* Josh Vermaas <vermaasj_at_msu.edu>
> *Sent:* 21 March 2022 18:52
> *To:* namd-l_at_ks.uiuc.edu <namd-l_at_ks.uiuc.edu>; Anirvinya Gururajan
> <anirvinya.gururajan_at_research.iiit.ac.in>
> *Subject:* Re: namd-l: Error while simulating on NAMD
>
> Hi Anirvinya,
>
>
> In your slurm script, are you asking for any GPUs on the nodes? It
> looks like you are using a GPU-accelerated executable, which requires
> a GPU to be present in order to run. With slurm, the typical way to
> ask for GPUs to be allocated to the job is something like '#SBATCH
> --gres=gpu:1'. Do you have a line like that in your submission script?
>
>
> -Josh
>
>
> On 3/20/22 5:16 PM, Anirvinya Gururajan wrote:
>>
>> Hi all,
>>
>> Recently, I have been facing trouble with a very specific system that
>> I am trying to simulate using NAMD/2.13. PFA the slurm output file
>> generated. When I try to simulate it over an interactive job on the
>> cluster node, it seems to work fine. But if it is submitted as a
>> batch job on the same cluster node, it throws the following error. I
>> am not very sure as to where the source of the problem is. The system
>> is large and has about 800k atoms (I don't know how relevant that is
>> to this issue).
>>
>> Regards,
>> Anirvinya G
>> CCNSB, IIITH
>>
>>
> --
> Josh Vermaas
> Assistant Professor
> MSU-DOE Plant Research Laboratory, Department of Biochemistry and Molecular Biology
> Michigan State University
> https://vermaaslab.github.io/

-- 
Dipl.-Phys. René Hafner
TU Kaiserslautern
Germany

This archive was generated by hypermail 2.1.6 : Tue Dec 13 2022 - 14:32:44 CST