Re: Running QM-MM MOPAC on a cluster

From: Jim Phillips (jim_at_ks.uiuc.edu)
Date: Wed Jan 02 2019 - 16:40:46 CST

For starters, use the faster settings from the previous emails:

> #SBATCH --ntasks=1
> #SBATCH --cpus-per-task=34

For a little more information add +showcpuaffinity.

I suspect that +setcpuaffinity isn't looking at the limits on affinity
that are enforced by the queueing system, so it's trying to use a
forbidden cpu. If you request all cores on the node with
--cpus-per-task=36, that might make the problem go away.
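
For example (just an untested sketch, reusing the paths, partition, and
account from your script below, with the job-name/output/error lines
omitted), the batch script would become something like:

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=36
#SBATCH --time=00:30:00
#SBATCH --partition=gll_usr_prod
#SBATCH --mem=115GB
#SBATCH --account=IscrC_QMMM-FER_1

cd $SLURM_SUBMIT_DIR
/galileo/home/userexternal/fpietra0/NAMD_Git-2018-11-22_Linux-x86_64-multicore/namd2 \
  namd-01.conf +p10 +setcpuaffinity +showcpuaffinity > namd-01.log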

Jim

On Tue, 1 Jan 2019, Francesco Pietra wrote:

> Thanks a lot for these suggestions. There must be some restriction
> hindering the suggested settings. Slurm, namd-01.conf, and error are shown
> below in the given order:
>
> #!/bin/bash
> #SBATCH --nodes=1
> #SBATCH --ntasks=10
> #SBATCH --cpus-per-task=1
> #SBATCH --time=00:30:00
> #SBATCH --job-name=namd-01
> #SBATCH --output namd-01.out
> #SBATCH --error namd-01.err
> #SBATCH --partition=gll_usr_prod
> #SBATCH --mem=115GB
> #SBATCH --account=IscrC_QMMM-FER_1
> # goto launch directory
> cd $SLURM_SUBMIT_DIR
> /galileo/home/userexternal/fpietra0/NAMD_Git-2018-11-22_Linux-x86_64-multicore/namd2
> namd-01.conf +p10 +setcpuaffinity > namd-01.log
>
> qmExecPath "numactl -C +10-33
> /galileo/home/userexternal/fpietra0/mopac/MOPAC2016.exe"
>
> $ cat *err
> pthread_setaffinity: Invalid argument
> pthread_setaffinity: Invalid argument
> pthread_setaffinity: Invalid argument
> ------------- Processor 7 Exiting: Called CmiAbort ------------
> Reason: set cpu affinity abort!
>
> Charm++ fatal error:
> set cpu affinity abort!
>
> /var/spool/slurmd/job540826/slurm_script: line 14: 21114 Segmentation
> fault
> /galileo/home/userexternal/fpietra0/NAMD_Git-2018-11-22_Linux-x86_64-multicore/namd2
> namd-01.conf +p10 +setcpuaffinity > namd-01.log
>
> fp
>
> On Mon, Dec 31, 2018 at 4:42 PM Jim Phillips <jim_at_ks.uiuc.edu> wrote:
>
>>
>> Well, that's progress at least. I have one other idea to ensure that NAMD
>> and MOPAC aren't competing with each other for the same cores:
>>
>> 1) Add "+setcpuaffinity" to the NAMD command line before ">".
>>
>> 2) Add "numactl -C +10-33" to the beginning of qmExecPath in namd-01.conf
>> (quote the string, e.g., "numactl -C +10-33 /path/to/MOPAC.exe")
>>
>> This should keep NAMD on your first ten cores and MOPAC on the next 24.
>>
>> What is qmBaseDir set to? Something in /dev/shm is the best choice. If
>> qmBaseDir is on a network filesystem that could slow things down.
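>>
>> For example, the relevant lines of namd-01.conf might then look roughly
>> like this (the /dev/shm directory name is only an illustration):
>>
>>   qmBaseDir  "/dev/shm/NAMD_qmmm"
>>   qmExecPath "numactl -C +10-33 /galileo/home/userexternal/fpietra0/mopac/MOPAC2016.exe"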
>>
>> Jim
>>
>>
>> On Fri, 21 Dec 2018, Francesco Pietra wrote:
>>
>>> I finally learned how to ssh to a given node. The results for
>>> #SBATCH --nodes=1
>>> #SBATCH --ntasks=10
>>> #SBATCH --cpus-per-task=1
>>>
>>> /galileo/home/userexternal/fpietra0/NAMD_Git-2018-11-22_Linux-x86_64-multicore/namd2
>>> namd-01.conf +p10 > namd-01.log
>>>
>>> qmConfigLine "PM7 XYZ T=2M 1SCF MOZYME CUTOFF=9.0 AUX LET GRAD QMMM
>>> GEO-OK THREADS=24"
>>>
>>> are
>>>
>>> ssh node181
>>> namd %cpu 720-750
>>> mopac %cpu 1-30
>>> 1 (per-core load):
>>> %Cpu0-4: 90-100
>>> %Cpu18-22: 60-100
>>> %Cpu5-17: 0.0
>>> %Cpu23-34: 0.0
>>>
>>> namd.log: 0.5 s/step (1346 steps executed after 11 min)
>>> ______________________
>>> As above, only changing
>>>
>>> #SBATCH --nodes=1
>>> #SBATCH --ntasks=1
>>> #SBATCH --cpus-per-task=34
>>>
>>> ssh node181
>>> namd %cpu 900
>>> mopac %cpu 0-34
>>> 1 (per-core load):
>>> %Cpu0-34: 0.3-100.0
>>>
>>> namd.log: 0.3 s/step (2080 steps executed after 11 min)
>>>
>>> Despite all the CPUs being used, the performance is disappointing. I can't
>>> say whether namd and mopac compete, at least in part, for the same cores.
>>>
>>> francesco
>>>
>>>
>>> On Mon, Dec 17, 2018 at 4:12 PM Jim Phillips <jim_at_ks.uiuc.edu> wrote:
>>>
>>>>
>>>> Since you are asking Slurm for 10 tasks with 1 cpu-per-task it is possible
>>>> that all 34 threads are running on a single core. You can check this with
>>>> top (hit "1" to see per-core load) if you can ssh to the execution host.
>>>>
>>>> You should probably request --ntasks=1 --cpus-per-task=34 (or 36) so that
>>>> Slurm will allocate all of the cores you wish to use. The number of cores
>>>> used by NAMD is controlled by +p10 and you will need THREADS=24 for MOPAC.
>>>>
>>>> It is a good idea to use top to confirm that all cores are being used.
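>>>>
>>>> To sketch the core budget on a 36-core node:
>>>>
>>>>   #SBATCH --ntasks=1
>>>>   #SBATCH --cpus-per-task=36
>>>>   namd2 namd-01.conf +p10 > namd-01.log     <- NAMD: 10 threads
>>>>   qmConfigLine "... THREADS=24 ..."         <- MOPAC: 24 threads
>>>>
>>>> so 10 + 24 = 34 of the 36 cores are busy, leaving a couple of cores free.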
>>>>
>>>> Jim
>>>>
>>>>
>>>> On Sun, 16 Dec 2018, Francesco Pietra wrote:
>>>>
>>>>> I had early on taken the relative number of threads into consideration, by
>>>>> setting the thread count for MOPAC as well.
>>>>> Out of the many such trials, one namd config was:
>>>>>
>>>>> qmConfigLine "PM7 XYZ T=2M 1SCF MOZYME CUTOFF=9.0 AUX LET GRAD QMMM
>>>>> GEO-OK THREADS=24"
>>>>>
>>>>> qmExecPath "/galileo/home/userexternal/fpietra0/mopac/MOPAC2016.exe"
>>>>>
>>>>> corresponding SLURM:
>>>>> #SBATCH --nodes=1
>>>>> #SBATCH --ntasks=10
>>>>> #SBATCH --cpus-per-task=1
>>>>>
>>>>> /galileo/home/userexternal/fpietra0/NAMD_Git-2018-11-22_Linux-x86_64-multicore/namd2
>>>>> namd-01.conf +p10 > namd-01.log
>>>>>
>>>>> Thus, 24+10=34, while the number of cores on the node was 36. Again,
>>>>> execution took nearly two hours, slower than on my vintage VAIO with two
>>>>> cores (one and a half hours).
>>>>>
>>>>> As to MKL_NUM_THREADS, I am lost; there is no such environment variable
>>>>> in MOPAC's list. On the other hand, the NAMD nightly build I used performs
>>>>> as effectively as it should with classical MD simulations on one node of
>>>>> the same cluster.
>>>>>
>>>>> thanks
>>>>> fp
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Dec 14, 2018 at 4:29 PM Jim Phillips <jim_at_ks.uiuc.edu> wrote:
>>>>>
>>>>>>
>>>>>> The performance of a QM/MM simulation is typically limited by the QM
>>>>>> program, not the MD program. Do you know how many threads MOPAC is
>>>>>> launching? Do you need to set the MKL_NUM_THREADS environment variable?
>>>>>> You want the number of NAMD threads (+p#) plus the number of MOPAC threads
>>>>>> to be less than the number of cores on your machine.
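>>>>>>
>>>>>> For example, if MOPAC's threading is controlled by MKL (worth checking
>>>>>> for your build), setting something like
>>>>>>
>>>>>>   export MKL_NUM_THREADS=24
>>>>>>
>>>>>> in the job script before launching NAMD would cap MOPAC at 24 threads,
>>>>>> which together with, say, +p10 for NAMD stays below the 36 cores on the
>>>>>> node.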
>>>>>>
>>>>>> Jim
>>>>>>
>>>>>>
>>>>>> On Fri, 14 Dec 2018, Francesco Pietra wrote:
>>>>>>
>>>>>>> Hi all,
>>>>>>> I resumed my attempts at finding the best settings for running NAMD QM/MM
>>>>>>> on a cluster. I used Example 1 (Polyala).
>>>>>>>
>>>>>>> In order to use the NAMD 2.13 multicore nightly build, I was limited to a
>>>>>>> single multicore node: 2*18-core Intel(R) Xeon(R) E5-2697 v4 @ 2.30GHz and
>>>>>>> 128 GB RAM (Broadwell).
>>>>>>>
>>>>>>> Settings
>>>>>>> qmConfigLine "PM7 XYZ T=2M 1SCF MOZYME CUTOFF=9.0 AUX LET GRAD QMMM
>>>>>>> GEO-OK"
>>>>>>>
>>>>>>> qmExecPath "/galileo/home/userexternal/fpietra0/mopac/MOPAC2016.exe"
>>>>>>>
>>>>>>> of course, on the cluster the simulation can't be run on shm
>>>>>>>
>>>>>>> execution line
>>>>>>>
>>>>>>> /galileo/home/userexternal/fpietra0/NAMD_Git-2018-11-22_Linux-x86_64-multicore/namd2
>>>>>>> namd-01.conf +p# > namd-01.log
>>>>>>>
>>>>>>> where # was 4, 10, 15, or 36.
>>>>>>>
>>>>>>> With either 36 or 15 cores: segmentation fault.
>>>>>>>
>>>>>>> With either 4 or 10 cores, execution of the 20,000 steps of Example 1 took
>>>>>>> nearly two hours. From the .ou file in folder /0, the execution took 0.18
>>>>>>> seconds.
>>>>>>>
>>>>>>> My question is what is wrong here; I cannot rationalize such
>>>>>>> disappointing performance.
>>>>>>>
>>>>>>> Thanks for advice
>>>>>>>
>>>>>>> francesco pietra
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
