Re: [External] running NAMD with Slurm

From: Jim Phillips (jim_at_ks.uiuc.edu)
Date: Mon Jan 14 2019 - 14:29:55 CST

Since 2010, the charmrun binary included with non-MPI, non-multicore builds
of NAMD has been able to use the system's existing mpiexec or other job
launch facilities directly. Details are in notes.txt, and the exact
invocation depends on how you would normally launch MPI jobs, but it can be
as simple as:

   /path/to/charmrun +p<procs> ++mpiexec /path/to/namd2 ...

or maybe:

   charmrun +p<procs> ++mpiexec ++remote-shell srun namd2 ...

This has the advantages of not needing to ssh between nodes and of
integrating with the queueing system, so processes are reliably killed when
the job ends.
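
For example, using the second form above, a complete Slurm batch script
could be as small as the sketch below (the partition name, task counts,
and paths are placeholders, not taken from any particular site):

   #!/bin/bash
   #SBATCH --partition=compute    # placeholder partition name
   #SBATCH --nodes=4
   #SBATCH --ntasks=96
   # charmrun hands process launch to Slurm (srun), so no ssh access
   # between nodes and no hand-built nodelist file is needed
   /path/to/charmrun +p$SLURM_NTASKS ++mpiexec ++remote-shell srun \
       /path/to/namd2 md.namd > md.log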

Jim

On Wed, 9 Jan 2019, Bennion, Brian wrote:

> hello
>
> If you use the MPI-aware NAMD executable, all the searching for hostnames can be handled by srun; just use:
>
> srun -N 4 -n 96 namd2 blah.....
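>
> That line would normally sit inside a batch script along these lines
> (the partition name and input/output file names here are just
> placeholders):
>
>    #!/bin/bash
>    #SBATCH --partition=compute   # placeholder partition name
>    #SBATCH --nodes=4
>    #SBATCH --ntasks=96
>    # MPI build of NAMD: srun starts the MPI ranks directly, so no
>    # charmrun or nodelist file is required; assumes the MPI-aware
>    # namd2 is on your PATH (e.g. via module load)
>    srun namd2 md.namd > md.log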
>
>
> Brian
>
>
> ________________________________
> From: owner-namd-l_at_ks.uiuc.edu <owner-namd-l_at_ks.uiuc.edu> on behalf of Sharp, Kim <sharpk_at_pennmedicine.upenn.edu>
> Sent: Wednesday, January 9, 2019 9:15:19 AM
> To: namd-l_at_ks.uiuc.edu; Seibold, Steve Allan
> Subject: Re: [External] namd-l: running NAMD with Slurm
>
> Steve,
> Details might differ, as I am running a different version of NAMD and
> the hardware is obviously different.
>
> Here is a Slurm script we use on our cluster:
>
> -----------------------------------------------
> #!/bin/csh
> #SBATCH --mail-type=ALL
> #SBATCH --partition=namd
> #SBATCH --nodes=4
> #SBATCH --ntasks=96
> echo 'nodes: ' $SLURM_NNODES 'tasks/node: ' $SLURM_TASKS_PER_NODE 'total tasks: ' $SLURM_NTASKS
> set WORKDIR=/home/sharp/work/c166/
> cd $WORKDIR
> module load namd
> make_namd_nodelist
> charmrun ++nodelist nodelist.$SLURM_JOBID ++p $SLURM_NTASKS `which namd2` +setcpuaffinity c166_s2000.conf > sharp_3Dec2018_.log
> ---------------------------------------------------------
>
> NAMD is launched via charmrun with the ++nodelist hostnamefile option.
>
> This hostnamefile contains lines like:
>
> host node023 ++cpus 2
> host node024 ++cpus 2
>
> for however many nodes you requested. It is generated at the time Slurm
> starts your job, because only at that time does Slurm know the names of
> the nodes it is allocating to the job. You get this node list by
> executing the Slurm command
>
> scontrol show hostnames
>
> in your job script and capturing/reformatting the output into the
> nodelist file. Here we have a script, make_namd_nodelist, that does
> this; you can see it is executed right before the charmrun command.
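>
> Roughly, such a script only needs to do something like this (a sketch,
> not our actual make_namd_nodelist; it assumes 2 cpus per host as in the
> example above):
>
> #!/bin/bash
> # write one "host" line per node allocated to the current job into the
> # charmrun nodelist file used by the job script above
> outfile=nodelist.$SLURM_JOBID
> rm -f "$outfile"
> for h in $(scontrol show hostnames); do
>     echo "host $h ++cpus 2" >> "$outfile"
> done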
>
> best
> Kim
>
> On 1/9/19 10:54 AM, Seibold, Steve Allan wrote:
>> Thanks, Kim, for your response. Here is my Slurm script:
>>
>>
>> =====================================================
>>
>>
>> #!/bin/bash
>>
>> #SBATCH --job-name=Seibold # Job name
>> #SBATCH --partition=mridata # Partition Name (Required)
>> #SBATCH --mail-type=END,FAIL # Mail events (NONE, BEGIN, END, FAIL, ALL)
>> #SBATCH --mail-user=stevesei_at_ku.edu # Where to send mail
>> #SBATCH --nodes=2 --ntasks-per-node=15 --mem-per-cpu=250M --time=12:00:00
>> #SBATCH --output=md7_3BP_%j.log # Standard output and error log
>>
>> pwd; hostname; date
>>
>> #module load namd/2.12_multicore
>>
>> echo "Running on $SLURM_CPUS_ON_NODE cores"
>>
>>
>> ~/NAMD2/NAMD_2.13_Linux-x86_64/namd2 md7_3BP.namd
>>
>> ===========================================================
>>
>>
>> Thanks, Steve
>>
>
> --
> Kim Sharp, Ph.D,
> Dept. of Biochemistry & Biophysics
> Chair, Biochemistry & Molecular Biophysics Graduate Group
> Perelman School of Medicine at the University of Pennsylvania
> 805A Stellar Chance
> Philadelphia, PA 19104
> webpage: crystal.med.upenn.edu
> 215-573-3506
>
>
