Re: Running QM-MM MOPAC on a cluster

From: Jim Phillips (jim_at_ks.uiuc.edu)
Date: Tue Dec 18 2018 - 16:03:22 CST

I suggest you ask your cluster admin for help in checking on what your
NAMD and MOPAC processes are doing during the run. If you do not see them
in top then there is a problem, or you are connected to the wrong host.
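
For example (a sketch only; the node name is a placeholder and the grep
pattern assumes the executables are called namd2 and MOPAC2016.exe),
something along these lines run on the execution host should show whether
the threads exist and which cores they sit on:

  squeue -u $USER        # find which node the job is running on
  ssh <nodename>         # log into that execution host, not the login node
  top -H                 # show individual threads; press "1" for per-core load
  ps -eLo pid,psr,pcpu,comm | grep -iE 'namd|mopac'   # one row per thread; psr = core it last ran on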

Jim

On Mon, 17 Dec 2018, Francesco Pietra wrote:

> With
> #SBATCH --ntasks=1
> #SBATCH --cpus-per-task=34
>
> 0.3 s/step (2080 steps executed in 11 min), i.e. somewhat faster than with
> the previously used
>
> #SBATCH --ntasks=10
> #SBATCH --cpus-per-task=1
>
> 0.5 s/step (1346 steps executed in 11 min).
>
> However, this is still far from the expected performance of a 36-core node.
>
> I can ssh to the cluster as I have access, but I was unable to do better
> than issuing the "top" command, either from my home or after logging into my
> running job. In both cases, the last line (COMMAND) shows the "top" process
> itself at ca 1% CPU usage. Even looking at the other lines, for other users
> or root, I never saw either namd or mopac under COMMAND, only python or
> other tools.
>
> I forgot to look into MOPAC's .arc file in the /0/ folder, but I could repeat
> the simulations to that end if you think it useful.
>
> Thanks
>
> fp
>
>
> On Mon, Dec 17, 2018 at 4:12 PM Jim Phillips <jim_at_ks.uiuc.edu> wrote:
>
>>
>> Since you are asking Slurm for 10 tasks with 1 cpu per task, it is possible
>> that all 34 threads are running on a single core. You can check this with
>> top (hit "1" to see per-core load) if you can ssh to the execution host.
>>
>> You should probably request --ntasks=1 --cpus-per-task=34 (or 36) so that
>> Slurm will allocate all of the cores you wish to use. The number of cores
>> used by NAMD is controlled by +p10 and you will need THREADS=24 for MOPAC.
>>
>> It is a good idea to use top to confirm that all cores are being used.
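>>
>> For example, a minimal batch-script sketch putting those numbers together
>> (based only on the paths and values already in this thread; site-specific
>> directives such as partition or walltime still need to be added):
>>
>> #!/bin/bash
>> #SBATCH --nodes=1
>> #SBATCH --ntasks=1
>> #SBATCH --cpus-per-task=36
>>
>> # 10 cores for NAMD (+p10) plus 24 for MOPAC (THREADS=24 in qmConfigLine)
>> # keeps the total at 34 of the 36 allocated cores.
>> /galileo/home/userexternal/fpietra0/NAMD_Git-2018-11-22_Linux-x86_64-multicore/namd2 \
>>   namd-01.conf +p10 > namd-01.log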
>>
>> Jim
>>
>>
>> On Sun, 16 Dec 2018, Francesco Pietra wrote:
>>
>>> I had earlier taken the relative number of threads into consideration,
>>> setting them explicitly for MOPAC as well.
>>> From one of the many such trials, the NAMD config:
>>>
>>> qmConfigLine "PM7 XYZ T=2M 1SCF MOZYME CUTOFF=9.0 AUX LET GRAD QMMM
>>> GEO-OK THREADS=24"
>>>
>>> qmExecPath "/galileo/home/userexternal/fpietra0/mopac/MOPAC2016.exe"
>>>
>>> corresponding SLURM:
>>> #SBATCH --nodes=1
>>> #SBATCH --ntasks=10
>>> #SBATCH --cpus-per-task=1
>>>
>>> /galileo/home/userexternal/fpietra0/NAMD_Git-2018-11-22_Linux-x86_64-multicore/namd2
>>> namd-01.conf +p10 > namd-01.log
>>>
>>> Thus 24+10=34 threads, while the number of cores on the node was 36. Again,
>>> execution took nearly two hours, slower than on my vintage VAIO with two
>>> cores (an hour and a half).
>>>
>>> As to MKL_NUM_THREADS, I am lost; there is no such environment variable
>>> in MOPAC's list. On the other hand, the NAMD nightly build I used performs
>>> as effectively as it should with classical MD simulations on one node of
>>> the same cluster.
>>>
>>> thanks
>>> fp
>>>
>>>
>>>
>>>
>>>
>>> On Fri, Dec 14, 2018 at 4:29 PM Jim Phillips <jim_at_ks.uiuc.edu> wrote:
>>>
>>>>
>>>> The performance of a QM/MM simulation is typically limited by the QM
>>>> program, not the MD program. Do you know how many threads MOPAC is
>>>> launching? Do you need to set the MKL_NUM_THREADS environment variable?
>>>> You want the number of NAMD threads (+p#) plus the number of MOPAC threads
>>>> to be less than the number of cores on your machine.
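>>>>
>>>> If MOPAC happens to be linked against MKL (I have not checked your build),
>>>> the variable would be exported in the batch script before namd2 is
>>>> launched, for example:
>>>>
>>>> export MKL_NUM_THREADS=<n>   # <n> is a placeholder; +p# plus <n> should stay below the core count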
>>>>
>>>> Jim
>>>>
>>>>
>>>> On Fri, 14 Dec 2018, Francesco Pietra wrote:
>>>>
>>>>> Hi all
>>>>> I resumed my attempts at finding the best settings for running NAMD QM/MM
>>>>> on a cluster. I used Example 1 (PolyAla).
>>>>>
>>>>> In order to use the namd 2.13 multicore nightly build, I was limited to a
>>>>> single multicore node: 2*18-core Intel(R) Xeon(R) E5-2697 v4 @ 2.30GHz with
>>>>> 128 GB RAM (Broadwell).
>>>>>
>>>>> Settings
>>>>> qmConfigLine "PM7 XYZ T=2M 1SCF MOZYME CUTOFF=9.0 AUX LET GRAD QMMM
>>>>> GEO-OK"
>>>>>
>>>>> qmExecPath "/galileo/home/userexternal/fpietra0/mopac/MOPAC2016.exe"
>>>>>
>>>>> Of course, on the cluster the simulation can't be run in shm.
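>>>>>
>>>>> (What I mean is the qmBaseDir setting; the lines below are only
>>>>> illustrative, with a hypothetical scratch path:
>>>>>
>>>>> # qmBaseDir pointing into /dev/shm, as in the example, is not usable here
>>>>> qmBaseDir "/some/scratch/dir/qm_tmp"   # a node-accessible scratch directory instead
>>>>> )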
>>>>>
>>>>> execution line
>>>>>
>>>>> /galileo/home/userexternal/fpietra0/NAMD_Git-2018-11-22_Linux-x86_64-multicore/namd2
>>>>> namd-01.conf +p# > namd-01.log
>>>>>
>>>>> where # was either 4, 10, 15, or 36.
>>>>>
>>>>> With either 36 or 15 cores: segmentation fault.
>>>>>
>>>>> With either 4 or 10 cores, execution of the 20,000 steps of Example 1 took
>>>>> nearly two hours. From the .ou file in folder /0, the execution took 0.18
>>>>> seconds.
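>>>>>
>>>>> (If that 0.18 s is the time of a single MOPAC call, then as a rough
>>>>> estimate 20,000 steps x 0.18 s/step = 3,600 s, i.e. about one hour of pure
>>>>> QM time, which would leave roughly another hour unaccounted for.)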
>>>>>
>>>>> My question is what is wrong here, as I am unable to rationalize such
>>>>> disappointing performance.
>>>>>
>>>>> Thanks for advice
>>>>>
>>>>> francesco pietra
>>>>>
>>>>
>>>
>>
>
