Re: Running QM-MM MOPAC on a cluster

From: Francesco Pietra (chiendarret_at_gmail.com)
Date: Fri Dec 21 2018 - 13:17:13 CST

I finally learned how to ssh into a given node. The results for
#SBATCH --nodes=1
#SBATCH --ntasks=10
#SBATCH --cpus-per-task=1
/galileo/home/userexternal/fpietra0/NAMD_Git-2018-11-22_Linux-x86_64-multicore/namd2 \
  namd-01.conf +p10 > namd-01.log

qmConfigLine "PM7 XYZ T=2M 1SCF MOZYME CUTOFF=9.0 AUX LET GRAD QMMM GEO-OK THREADS=24"

are

ssh node181
namd %cpu 720-750
mopac %cpu 1-30
per-core load (top, after hitting "1"):
%Cpu0-4: 90-100
%Cpu18-22: 60-100
%Cpu5-17: 0.0
%Cpu23-34: 0.0

namd-01.log: ~0.5 s/step (1346 steps executed in 11 min)
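
For reference, the full job script for this first run, assembled from the pieces above, would look roughly like this (a sketch only: the walltime and partition directives are placeholders I have added, not values from the actual submission):

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=10
#SBATCH --cpus-per-task=1
#SBATCH --time=02:00:00      # placeholder walltime
#SBATCH --partition=mypart   # placeholder partition name

# NAMD (multicore build) runs on 10 cores via +p10; MOPAC starts its own
# 24 threads because of THREADS=24 in the qmConfigLine of namd-01.conf.
/galileo/home/userexternal/fpietra0/NAMD_Git-2018-11-22_Linux-x86_64-multicore/namd2 \
  namd-01.conf +p10 > namd-01.log
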
______________________
As above, only changing

#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=34

ssh node181
namd %cpu 900
mopac %cpu 0-34
per-core load (top, after hitting "1"):
%Cpu0-34: 0.3-100.0

namd-01.log: ~0.3 s/step (2080 steps executed in 11 min)
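
(Assuming the 11 min figure is wall-clock time in both cases, this works out to roughly 660 s / 1346 steps ≈ 0.49 s/step for the first run versus 660 s / 2080 steps ≈ 0.32 s/step for the second, i.e. only about a 1.5x speedup.)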

Despite all CPUs being used, the performance is still disappointing. I cannot
say whether NAMD and MOPAC compete, at least in part, for the same cores; one
way to check is sketched below.
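
As a rough sketch (assuming the two programs show up in the process table as
namd2 and MOPAC2016.exe, matching the executables above), one could run the
following on the node:

# Which core (PSR column) is each thread of namd2 and MOPAC running on?
# Note: MOPAC is only alive while a QM step is actually being evaluated.
ps -L -o pid,lwp,psr,pcpu,comm -p $(pgrep -d, -x namd2)
ps -L -o pid,lwp,psr,pcpu,comm -p $(pgrep -d, -x MOPAC2016.exe)

# Which cores has Slurm actually allowed the NAMD process to use?
taskset -cp $(pgrep -x namd2)

If the two sets of cores overlap, the Charm++ options +setcpuaffinity +pemap
(e.g. +p10 +setcpuaffinity +pemap 0-9) should pin the NAMD worker threads to a
fixed set of cores, leaving the rest free for MOPAC.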

francesco

On Mon, Dec 17, 2018 at 4:12 PM Jim Phillips <jim_at_ks.uiuc.edu> wrote:

>
> Since you are asking Slurm for 10 tasks with 1 cpu-per-task it is possible
> that all 34 threads are running on a single core. You can check this with
> top (hit "1" to see per-core load) if you can ssh to the execution host.
>
> You should probably request --ntasks=1 --cpus-per-task=34 (or 36) so that
> Slurm will allocate all of the cores you wish to use. The number of cores
> used by NAMD is controlled by +p10 and you will need THREADS=24 for MOPAC.
>
> It is a good idea to use top to confirm that all cores are being used.
>
> Jim
>
>
> On Sun, 16 Dec 2018, Francesco Pietra wrote:
>
> > I had earlier taken the relative number of threads into consideration,
> > also imposing a thread count on MOPAC.
> > Out of the many such trials, namd.config:
> >
> > qmConfigLine "PM7 XYZ T=2M 1SCF MOZYME CUTOFF=9.0 AUX LET GRAD QMMM
> > GEO-OK THREADS=24"
> >
> > qmExecPath "/galileo/home/userexternal/fpietra0/mopac/MOPAC2016.exe"
> >
> > corresponding SLURM:
> > #SBATCH --nodes=1
> > #SBATCH --ntasks=10
> > #SBATCH --cpus-per-task=1
> >
> /galileo/home/userexternal/fpietra0/NAMD_Git-2018-11-22_Linux-x86_64-multicore/namd2
> > namd-01.conf +p10 > namd-01.log
> >
> > Thus, 24+10=34, while the number of cores on the node was 36. Again,
> > execution took nearly two hours, slower than on my vintage VAIO with two
> > cores (an hour and a half).
> >
> > As to MKL_NUM_THREADS, I am lost; there is no such environment variable
> > in MOPAC's list. On the other hand, the NAMD nightly build I used performs
> > as effectively as it should with classical MD simulations on one node of
> > the same cluster.
> >
> > thanks
> > fp
> >
> >
> >
> >
> >
> > On Fri, Dec 14, 2018 at 4:29 PM Jim Phillips <jim_at_ks.uiuc.edu> wrote:
> >
> >>
> >> The performance of a QM/MM simulation is typically limited by the QM
> >> program, not the MD program. Do you know how many threads MOPAC is
> >> launching? Do you need to set the MKL_NUM_THREADS environment variable?
> >> You want the number of NAMD threads (+p#) plus the number of MOPAC
> threads
> >> to be less than the number of cores on your machine.
> >>
> >> Jim
> >>
> >>
> >> On Fri, 14 Dec 2018, Francesco Pietra wrote:
> >>
> >>> Hi all
> >>> I resumed my attempts at finding the best settings for running NAMD
> >>> QM/MM on a cluster. I used Example 1 (polyala).
> >>>
> >>> In order to use the NAMD 2.13 multicore nightly build, I was limited to
> >>> a single multicore node: 2 x 18-core Intel(R) Xeon(R) E5-2697 v4 @
> >>> 2.30GHz with 128 GB RAM (Broadwell).
> >>>
> >>> Settings
> >>> qmConfigLine "PM7 XYZ T=2M 1SCF MOZYME CUTOFF=9.0 AUX LET GRAD QMMM
> >>> GEO-OK"
> >>>
> >>> qmExecPath
> "/galileo/home/userexternal/fpietra0/mopac/MOPAC2016.exe"
> >>>
> >>> of course, on the cluster the simulation can't be run on shm
> >>>
> >>> execution line
> >>>
> >>
> /galileo/home/userexternal/fpietra0/NAMD_Git-2018-11-22_Linux-x86_64-multicore/namd2
> >>> namd-01.conf +p# > namd-01.log
> >>>
> >>> where # was either 4, 10, 15, 36
> >>>
> >>> With either 36 or 15 cores: segmentation fault.
> >>>
> >>> With either 4 or 10 cores, execution of the 20,000 steps of Example 1
> >>> took nearly two hours. From the .ou file in folder /0, the execution
> >>> took 0.18 seconds.
> >>>
> >>> My question is what is wrong in my setup, as I am unable to rationalize
> >>> such disappointing performance.
> >>>
> >>> Thanks for advice
> >>>
> >>> francesco pietra
> >>>
> >>
> >
>
