Re: Issue regarding to the speed of QM/MM

From: Alex Balaeff (abalaeff_at_polarisqb.com)
Date: Mon Sep 07 2020 - 13:36:29 CDT

Thanks a lot for your comments, Marcelo. Throwing in my 2 cents (in
the hope of being corrected if these are the wrong cents :) ): there
are situations where using every thread makes sense.

For example, say I need to run 112 similar jobs on the CPU cores in
question. And let's say that running 1 job per thread is 50% slower
than running 1 job per core, i.e. each job takes 1.5*T instead of T.

In that case, option 1 is to run two successive batches of 56 jobs
each. If a job takes time T, my whole simulation takes 2T.

Option 2 is to run all 112 jobs simultaneously, one per thread. They
will finish in 1.5*T, which is still better than the 2T of option 1,
isn't it?
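
A minimal sketch of that arithmetic in Python, in case it helps to
check other numbers. It assumes the jobs are independent and identical,
and that the 50% slowdown is the hypothetical figure from the example
above, not a measurement. The function and variable names are purely
illustrative.

```python
import math

def batch_wall_time(n_jobs, n_slots, time_per_job):
    """Wall time when n_jobs run in successive batches of n_slots jobs."""
    batches = math.ceil(n_jobs / n_slots)
    return batches * time_per_job

T = 1.0          # time of one job running alone on a physical core
slowdown = 1.5   # ASSUMED: a job sharing a core takes 50% longer
cores = 56       # two 28-core CPUs
threads = 112    # with hyperthreading
jobs = 112

# Option 1: one job per core, two successive batches of 56 jobs.
option1 = batch_wall_time(jobs, cores, T)
# Option 2: one job per thread, a single batch of 112 slower jobs.
option2 = batch_wall_time(jobs, threads, slowdown * T)

print(f"option 1: {option1:.2f} T, option 2: {option2:.2f} T")
```

With these numbers, option 2 comes out ahead as long as the per-job
slowdown factor stays below 2.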

Best,

Alexander.

On Mon, Sep 7, 2020 at 2:13 PM Marcelo C. R. Melo <melomcr_at_gmail.com> wrote:
>
> Hi Zhihong,
>
> The performance of a QM/MM simulation will (almost always) be determined by the performance of the QM calculation itself. In this case, you are using ORCA to run DFT using 4 CPU cores (by asking for "PAL4").
>
> In QM calculations, it is important to know the size of the QM region, that is, how many atoms it contains: 10 atoms? 100 atoms? This will make a gigantic difference in performance.
>
> The best bet for you is to balance the number of cores dedicated to NAMD with the number of cores dedicated to ORCA, and absolutely never overlap the CPU cores for both.
> Something else that has been discussed extensively on this list is the use of hyperthreading. In your example, since you have two 28-core CPUs, you should only allocate a total of 56 processes between NAMD and ORCA, no more than that. Using all 112 threads will probably lead to terrible performance.
>
> I would suggest starting with 10 cores for NAMD and 46 for ORCA. (I am assuming based on your performance that you have many atoms in your QM region, which will benefit from more CPU cores).
> You will need to use ORCA's long format for parallelism instead of using "PAL4", and I see you already have a line like that in your NAMD config file (currently commented out) asking for 10 cores.
> Try benchmarking different ratios of NAMD/ORCA CPU cores, and do not exceed 56 in total (or maybe 54, to leave a couple of cores for the OS, since you are running on a workstation).
>
> Best,
> Marcelo
>
> On Mon, 7 Sep 2020 at 04:42, 辛志宏 <xzhfood_at_njau.edu.cn> wrote:
>>
>> Dear all,
>>
>> I am running a QM/MM molecular dynamics simulation of an enzyme complex (298 amino acids, 1 ligand, and 90 thousand water molecules) with NAMD, but it is very slow: only 25 steps are completed per day (24 hours) in a minimization run (minimize 100, run 2000). I wonder whether there are issues with the parameters in my config file. Any suggestions for improving the speed of the QM/MM run would be much appreciated.
>>
>>
>> The hardware of my computer (8173M workstation) should be adequate: 384 GB of memory and two physical CPUs (28 cores per CPU, 112 threads in total). The command is as follows:
>>
>>
>> charmrun ++local +p20 +isomalloc_sync namd2 YZZ-config.ORCA-1.namd | tee YZZ-config.ORCA-1.namd.log
>>
>>
>> Thank you in advance.
>>
>>
>> Zhihong Xin,
>>
>>
>>
>> The config file is as follows:
>>
>> ## Single QM region with MM water box
>>
>> structure ionized.psf
>>
>> coordinates ionized.pdb
>>
>> #Continuing a job from the restart files
>>
>> if {1} {
>>
>> set inputname YZZ_equil_MM
>>
>> binCoordinates $inputname.coor
>>
>> extendedSystem $inputname.xsc
>>
>> }
>>
>> cellBasisVector1 64.945 0 0
>>
>> cellBasisVector2 0 65.353 0
>>
>> cellBasisVector3 0 0 67.919
>>
>> cellOrigin 55.318 57.874 55.561
>>
>> seed 7910881
>>
>> # Output Parameters
>>
>> binaryoutput no
>>
>> outputname YZZ-QM-min-out
>>
>> outputenergies 1
>>
>> outputtiming 1
>>
>> outputpressure 1
>>
>> binaryrestart yes
>>
>> dcdfile YZZ-QM-min-out.dcd
>>
>> dcdfreq 1
>>
>> XSTFreq 1
>>
>> restartfreq 100
>>
>> restartname YZZ-QM-min-out.restart
>>
>> # mobile atom selection:
>>
>> constraints on
>>
>> consexp 2
>>
>> consref YZZ-restraint.pdb
>>
>> conskfile YZZ-restraint.pdb
>>
>> conskcol B
>>
>> constraintScaling 2.0
>>
>> # PME Parameters
>>
>> PME on
>>
>> PMEGridspacing 1
>>
>> set temperature 300
>>
>> temperature $temperature
>>
>> # Thermostat Parameters
>>
>> langevin on
>>
>> langevintemp $temperature
>>
>> langevinHydrogen on
>>
>> langevindamping 50
>>
>> # Barostat Parameters
>>
>> usegrouppressure yes
>>
>> useflexiblecell no
>>
>> useConstantArea no
>>
>> langevinpiston on
>>
>> langevinpistontarget 1.01325
>>
>> langevinpistonperiod 200
>>
>> langevinpistondecay 100
>>
>> langevinpistontemp $temperature
>>
>> surfacetensiontarget 0.0
>>
>> strainrate 0. 0. 0.
>>
>> wrapAll on
>>
>> wrapWater on
>>
>> # Integrator Parameters
>>
>> timestep 0.5
>>
>> firstTimestep 0
>>
>> fullElectFrequency 1
>>
>> nonbondedfreq 1
>>
>> # Force Field Parameters
>>
>> paratypecharmm on
>>
>> parameters ../CHARMpars/toppar_all36_carb_glycopeptide.str
>>
>> parameters ../CHARMpars/toppar_water_ions_namd.str
>>
>> parameters ../CHARMpars/toppar_all36_na_nad_ppi_gdp_gtp.str
>>
>> parameters ../CHARMpars/par_all36_carb.prm
>>
>> parameters ../CHARMpars/par_all36_cgenff.prm
>>
>> parameters ../CHARMpars/par_all36_lipid.prm
>>
>> parameters ../CHARMpars/par_all36_na.prm
>>
>> parameters ../CHARMpars/par_all36_prot.prm
>>
>> parameters ../common/DMP_ABD769.prm
>>
>> #printExclusions on
>>
>> exclude scaled1-4
>>
>> 1-4scaling 1.0
>>
>> rigidbonds none
>>
>> cutoff 12.0
>>
>> pairlistdist 14.0
>>
>> switching on
>>
>> switchdist 10.0
>>
>> stepspercycle 1
>>
>> # Turns ON or OFF the QM calculations
>>
>> qmForces on
>>
>> qmParamPDB "YZZ-namd-QM-0.pdb"
>>
>> qmColumn "beta"
>>
>> qmBondColumn "occ"
>>
>> #Link Atoms
>>
>> qmBondDist on
>>
>> # Number of simultaneous QM simulations per node
>>
>> QMSimsPerNode 20
>>
>> QMElecEmbed on
>>
>> QMSwitching on
>>
>> QMSwitchingType shift
>>
>> QMPointChargeScheme none
>>
>> QMBondScheme "cs"
>>
>> #qmBaseDir "/dev/shm/YZZ-NAMD_MIN"
>>
>> # Directory where QM calculations will be run.
>>
>> qmBaseDir "/dev/shm/NAMD_Example1"
>>
>> ## ORCA
>>
>> qmConfigLine "! B3LYP 6-31G Grid4 PAL4 EnGrad TightSCF"
>>
>> qmConfigLine "%%output PrintLevel Mini Print\[ P_Mulliken \] 1 Print\[P_AtCharges_M\] 1 end"
>>
>> #qmConfigLine "%%pal nprocs 10 end"
>>
>> # construction of ORCA's input file.
>>
>> qmMult "1 2"
>>
>> qmCharge "1 -1"
>>
>> qmSoftware "orca"
>>
>> qmExecPath "/home/xzhfood/software/orca_4_1_2_linux_x86-64_openmpi313/orca"
>>
>> QMOutStride 1
>>
>> QMPositionOutStride 1
>>
>> # Number of steps in the QM/MM simulation.
>>
>> minimize 100
>>
>> run 2000
>>
>>

-- 
 -----
  Dr. Alexander Balaeff
  Polaris Quantum Biotech
  www.PolarisQB.com
  (919)-270-5772
