Fwd: Running QM-MM tutorial on a cluster

From: Francesco Pietra (chiendarret_at_gmail.com)
Date: Sun Nov 18 2018 - 00:44:44 CST

After one day on 1 node (36 tasks, 1 CPU per task), the QM calculations were
still running. I am posting the complete NAMD log. It seems to me that ORCA,
like NAMD, also runs under MPI, so that, in principle, I could raise the number
of nodes. The systems in my project are heavier than Example1.
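For reference, ORCA's own parallelism is set by the PAL keyword inside the
qmConfigLine (PAL4 in the log below), not by NAMD's rank count, so adding NAMD
nodes alone may not speed up the QM part. A minimal sketch of the relevant
options, assuming the standard NAMD QM/MM keywords; the paths come from the
log and the PAL8 value is an illustrative change, not the original setting:

  # Hypothetical excerpt in the style of namd_ORCA-01.conf; keyword names
  # are the documented NAMD QM/MM options, values are illustrative.
  qmForces       on
  qmSoftware     "orca"
  qmExecPath     "/cineca/prod/opt/applications/orca/4.0.1/binary/bin/orca"
  qmBaseDir      "/dev/shm/NAMD_QMMM"  ;# node-local scratch, if the site allows it
  # PAL8 asks ORCA for 8 parallel processes per QM calculation, independently
  # of how many MPI ranks NAMD itself was started with.
  qmConfigLine   "! B3LYP 6-31G Grid4 PAL8 EnGrad TightSCF"
  qmConfigLine   "%%output PrintLevel Mini Print[ P_Mulliken ] 1 Print[P_AtCharges_M] 1 end"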

Charm++> Running on MPI version: 3.0
Charm++> level of thread support used: MPI_THREAD_SINGLE (desired: MPI_THREAD_SINGLE)
Charm++> Running in non-SMP mode: numPes 36
Charm++> Using recursive bisection (scheme 3) for topology aware partitions
Converse/Charm++ Commit ID: v6.7.1-0-gbdf6a1b-namd-charm-6.7.1-build-2016-Nov-07-136676
Warning> Randomization of stack pointer is turned on in kernel, thread migration may not work! Run 'echo 0 > /proc/sys/kernel/randomize_va_space' as root to disable it, or try run with '+isomalloc_sync'.
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 1 unique compute nodes (36-way SMP).
Charm++> cpu topology info is gathered in 0.048 seconds.
Info: NAMD 2.12 for Linux-x86_64-MPI
Info:
Info: Please visit http://www.ks.uiuc.edu/Research/namd/
Info: for updates, documentation, and support information.
Info:
Info: Please cite Phillips et al., J. Comp. Chem. 26:1781-1802 (2005)
Info: in all publications reporting results obtained with NAMD.
Info:
Info: Based on Charm++/Converse 60701 for mpi-linux-x86_64
Info: Built Tue 7 Mar 2017, 17:38:45 CET by propro01 on node165
Info: 1 NAMD 2.12 Linux-x86_64-MPI 36 node134 fpietra0
Info: Running on 36 processors, 36 nodes, 1 physical nodes.
Info: CPU topology information available.
Info: Charm++/Converse parallel runtime startup completed at 0.0843871 s
Info: 694.914 MB of memory in use based on /proc/self/stat
Info: Configuration file is namd_ORCA-01.conf
Info: Working in the current directory
/gpfs/scratch/userexternal/fpietra0/QM-MM/NAMD_Example1_ORCA_24h_100GB_1node
TCL: Suspending until startup complete.
Info: SIMULATION PARAMETERS:
Info: TIMESTEP 0.5
Info: NUMBER OF STEPS 0
Info: STEPS PER CYCLE 1
Info: PERIODIC CELL BASIS 1 29 0 0
Info: PERIODIC CELL BASIS 2 0 34 0
Info: PERIODIC CELL BASIS 3 0 0 28
Info: PERIODIC CELL CENTER -0.021 0.008 0.108
Info: WRAPPING WATERS AROUND PERIODIC BOUNDARIES ON OUTPUT.
Info: WRAPPING ALL CLUSTERS AROUND PERIODIC BOUNDARIES ON OUTPUT.
Info: LOAD BALANCER Centralized
Info: LOAD BALANCING STRATEGY New Load Balancers -- DEFAULT
Info: LDB PERIOD 200 steps
Info: FIRST LDB TIMESTEP 5
Info: LAST LDB TIMESTEP -1
Info: LDB BACKGROUND SCALING 1
Info: HOM BACKGROUND SCALING 1
Info: PME BACKGROUND SCALING 1
Info: MIN ATOMS PER PATCH 40
Info: INITIAL TEMPERATURE 300
Info: CENTER OF MASS MOVING INITIALLY? NO
Info: DIELECTRIC 1
Info: EXCLUDE SCALED ONE-FOUR
Info: 1-4 ELECTROSTATICS SCALED BY 1
Info: MODIFIED 1-4 VDW PARAMETERS WILL BE USED
Info: DCD FILENAME PolyAla_out.dcd
Info: DCD FREQUENCY 1
Info: DCD FIRST STEP 1
Info: DCD FILE WILL CONTAIN UNIT CELL DATA
Info: XST FILENAME PolyAla_out.xst
Info: XST FREQUENCY 1
Info: NO VELOCITY DCD OUTPUT
Info: NO FORCE DCD OUTPUT
Info: OUTPUT FILENAME PolyAla_out
Info: RESTART FILENAME PolyAla_out.restart
Info: RESTART FREQUENCY 100
Info: BINARY RESTART FILES WILL BE USED
Info: SWITCHING ACTIVE
Info: SWITCHING ON 10
Info: SWITCHING OFF 12
Info: PAIRLIST DISTANCE 14
Info: PAIRLIST SHRINK RATE 0.01
Info: PAIRLIST GROW RATE 0.01
Info: PAIRLIST TRIGGER 0.3
Info: PAIRLISTS PER CYCLE 2
Info: PAIRLISTS ENABLED
Info: MARGIN 0.495
Info: HYDROGEN GROUP CUTOFF 2.5
Info: PATCH DIMENSION 16.995
Info: CROSSTERM ENERGY INCLUDED IN DIHEDRAL
Info: TIMING OUTPUT STEPS 1
Info: PRESSURE OUTPUT STEPS 1
Info: QM FORCES ACTIVE
Info: QM PDB PARAMETER FILE: PolyAla-qm.pdb
Info: QM SOFTWARE: orca
Info: QM ATOM CHARGES FROM QM SOFTWARE: MULLIKEN
Info: QM EXECUTABLE PATH: /cineca/prod/opt/applications/orca/4.0.1/binary/bin/orca
Info: QM COLUMN: beta
Info: QM BOND COLUMN: occ
Info: QM WILL DETECT BONDS BETWEEN QM AND MM ATOMS.
Info: QM-MM BOND SCHEME: Charge Shift.
Info: QM BASE DIRECTORY: /gpfs/scratch/userexternal/fpietra0/QM-MM/NAMD_Example1_ORCA_24h_100GB_1node
Info: QM CONFIG LINE: ! B3LYP 6-31G Grid4 PAL4 EnGrad TightSCF
Info: QM CONFIG LINE: %%output PrintLevel Mini Print[ P_Mulliken ] 1 Print[P_AtCharges_M] 1 end
Info: QM POINT CHARGES WILL BE SELECTED EVERY 1 STEPS.
Info: QM Point Charge Switching: ON.
Info: QM Point Charge SCHEME: none.
Info: QM executions per node: 1
Info: LANGEVIN DYNAMICS ACTIVE
Info: LANGEVIN TEMPERATURE 300
Info: LANGEVIN USING BBK INTEGRATOR
Info: LANGEVIN DAMPING COEFFICIENT IS 50 INVERSE PS
Info: LANGEVIN DYNAMICS APPLIED TO HYDROGENS
Info: LANGEVIN PISTON PRESSURE CONTROL ACTIVE
Info: TARGET PRESSURE IS 1.01325 BAR
Info: OSCILLATION PERIOD IS 200 FS
Info: DECAY TIME IS 100 FS
Info: PISTON TEMPERATURE IS 300 K
Info: PRESSURE CONTROL IS GROUP-BASED
Info: INITIAL STRAIN RATE IS 0 0 0
Info: CELL FLUCTUATION IS ISOTROPIC
Info: PARTICLE MESH EWALD (PME) ACTIVE
Info: PME TOLERANCE 1e-06
Info: PME EWALD COEFFICIENT 0.257952
Info: PME INTERPOLATION ORDER 4
Info: PME GRID DIMENSIONS 32 36 28
Info: PME MAXIMUM GRID SPACING 1
Info: Attempting to read FFTW data from system
Info: Attempting to read FFTW data from FFTW_NAMD_2.12_Linux-x86_64-MPI_FFTW3.txt
Info: Optimizing 6 FFT steps. 1... 2... 3... 4... 5... 6... Done.
Info: Writing FFTW data to FFTW_NAMD_2.12_Linux-x86_64-MPI_FFTW3.txt
Info: FULL ELECTROSTATIC EVALUATION FREQUENCY 1
Info: USING VERLET I (r-RESPA) MTS SCHEME.
Info: C1 SPLITTING OF LONG RANGE ELECTROSTATICS
Info: PLACING ATOMS IN PATCHES BY HYDROGEN GROUPS
Info: RANDOM NUMBER SEED 7910881
Info: USE HYDROGEN BONDS? NO
Info: COORDINATE PDB PolyAla.pdb
Info: STRUCTURE FILE PolyAla.psf
Info: PARAMETER file: CHARMM format!
Info: PARAMETERS CHARMpars/toppar_all36_carb_glycopeptide.str
Info: PARAMETERS CHARMpars/toppar_water_ions_namd.str
Info: PARAMETERS CHARMpars/toppar_all36_na_nad_ppi_gdp_gtp.str
Info: PARAMETERS CHARMpars/par_all36_carb.prm
Info: PARAMETERS CHARMpars/par_all36_cgenff.prm
Info: PARAMETERS CHARMpars/par_all36_lipid.prm
Info: PARAMETERS CHARMpars/par_all36_na.prm
Info: PARAMETERS CHARMpars/par_all36_prot.prm
Info: USING ARITHMETIC MEAN TO COMBINE L-J SIGMA PARAMETERS
Info: SKIPPING rtf SECTION IN STREAM FILE
Info: SKIPPING rtf SECTION IN STREAM FILE
Info: SKIPPING rtf SECTION IN STREAM FILE
Info: SUMMARY OF PARAMETERS:
Info: 937 BONDS
Info: 2734 ANGLES
Info: 6671 DIHEDRAL
Info: 203 IMPROPER
Info: 6 CROSSTERM
Info: 357 VDW
Info: 6 VDW_PAIRS
Info: 0 NBTHOLE_PAIRS
Info: TIME FOR READING PSF FILE: 0.016865
Info: Reading pdb file PolyAla.pdb
Info: TIME FOR READING PDB FILE: 0.00363708
Info:
Info: Using the following PDB file for QM parameters: PolyAla-qm.pdb
Info: Number of QM atoms (excluding Dummy atoms): 20
Info: We found 2 QM-MM bonds.
Info: Applying user defined multiplicity 1 to QM group ID 1
Info: 1) Group ID: 1 ; Group size: 20 atoms ; Total charge: 0
Info: MM-QM pair: 24:30 -> Value (distance or ratio): 1.09 (QM Group 0 ID 1)
Info: MM-QM pair: 50:44 -> Value (distance or ratio): 1.09 (QM Group 0 ID 1)
Info: ****************************
Info: STRUCTURE SUMMARY:
Info: 2279 ATOMS
Info: 1546 BONDS
Info: 879 ANGLES
Info: 199 DIHEDRALS
Info: 15 IMPROPERS
Info: 6 CROSSTERMS
Info: 0 EXCLUSIONS
Info: 6837 DEGREES OF FREEDOM
Info: 773 HYDROGEN GROUPS
Info: 4 ATOMS IN LARGEST HYDROGEN GROUP
Info: 773 MIGRATION GROUPS
Info: 4 ATOMS IN LARGEST MIGRATION GROUP
Info: TOTAL MASS = 13773.9 amu
Info: TOTAL CHARGE = 2.98023e-08 e
Info: MASS DENSITY = 0.82848 g/cm^3
Info: ATOM DENSITY = 0.0825485 atoms/A^3
Info: *****************************
Info:
Info: Entering startup at 0.985605 s, 799.488 MB of memory in use
Info: Startup phase 0 took 0.000328064 s, 799.488 MB of memory in use
Info: The QM region will remove 19 bonds, 31 angles, 37 dihedrals, 3 impropers and 1 crossterms.
Info: ADDED 2624 IMPLICIT EXCLUSIONS
Info: Startup phase 1 took 0.00993586 s, 799.621 MB of memory in use
Info: NONBONDED TABLE R-SQUARED SPACING: 0.0625
Info: NONBONDED TABLE SIZE: 769 POINTS
Info: INCONSISTENCY IN FAST TABLE ENERGY VS FORCE: 0.000325096 AT 11.9556
Info: INCONSISTENCY IN SCOR TABLE ENERGY VS FORCE: 0.000324844 AT 11.9556
Info: ABSOLUTE IMPRECISION IN VDWA TABLE ENERGY: 4.59334e-32 AT 11.9974
Info: RELATIVE IMPRECISION IN VDWA TABLE ENERGY: 7.4108e-17 AT 11.9974
Info: INCONSISTENCY IN VDWA TABLE ENERGY VS FORCE: 0.0040507 AT 0.251946
Info: ABSOLUTE IMPRECISION IN VDWB TABLE ENERGY: 1.53481e-26 AT 11.9974
Info: RELATIVE IMPRECISION IN VDWB TABLE ENERGY: 7.96691e-18 AT 11.9974
Info: INCONSISTENCY IN VDWB TABLE ENERGY VS FORCE: 0.00150189 AT 0.251946
Info: Startup phase 2 took 0.00565004 s, 803.855 MB of memory in use
Info: Startup phase 3 took 0.000258923 s, 803.855 MB of memory in use
Info: Startup phase 4 took 0.000326157 s, 803.855 MB of memory in use
Info: Startup phase 5 took 0.000218868 s, 803.855 MB of memory in use
Info: PATCH GRID IS 3 (PERIODIC) BY 4 (PERIODIC) BY 1 (PERIODIC)
Info: PATCH GRID IS 2-AWAY BY 2-AWAY BY 1-AWAY
Info: REMOVING COM VELOCITY -0.188499 0.149382 0.0208025
Info: LARGEST PATCH (5) HAS 198 ATOMS
Info: TORUS A SIZE 36 USING 0
Info: TORUS B SIZE 1 USING 0
Info: TORUS C SIZE 1 USING 0
Info: TORUS MINIMAL MESH SIZE IS 1 BY 1 BY 1
Info: Placed 100% of base nodes on same physical node as patch
Info: Startup phase 6 took 0.0012691 s, 804.398 MB of memory in use
Info: PME using 16 and 18 processors for FFT and reciprocal sum.
Info: PME GRID LOCATIONS: 3 5 7 9 11 13 15 17 19 21 ...
Info: PME TRANS LOCATIONS: 1 2 4 6 8 10 12 14 16 18 ...
Info: PME USING 16 GRID NODES AND 18 TRANS NODES
Info: Startup phase 7 took 0.0920498 s, 804.539 MB of memory in use
Info: Startup phase 8 took 0.000422001 s, 804.539 MB of memory in use
LDB: Central LB being created...
Info: Startup phase 9 took 0.000488043 s, 804.539 MB of memory in use
Info: CREATING 612 COMPUTE OBJECTS
Info: Startup phase 10 took 0.000934124 s, 804.539 MB of memory in use
Info: useSync: 0 useProxySync: 0
Info: Startup phase 11 took 0.00029397 s, 804.539 MB of memory in use
Info: Startup phase 12 took 4.79221e-05 s, 804.539 MB of memory in use
Info: Finished startup at 1.09783 s, 804.539 MB of memory in use

TCL: Minimizing for 100 steps
Info: List of ranks running QM simulations: 2.
.....................................

---------- Forwarded message ---------
From: Francesco Pietra <chiendarret_at_gmail.com>
Date: Sat, Nov 17, 2018 at 8:30 PM
Subject: Running QM-MM tutorial on a cluster
To: NAMD <namd-l_at_ks.uiuc.edu>

Hi all
I am back to NAMD QM-MM for a project, after surveying the tutorial one year
ago. At that time Example1 ran to completion with MOPAC, and up to step 7 of
LINE MINIMIZER BRACKET with ORCA (I did not note the time). All of that was on
a 4-core desktop, with ORCA running on /dev/shm/.

Now I am trying the tutorial with ORCA on a cluster, using the MPI build of
NAMD 2.12, in order to establish how the simulations for the project are best
carried out. On 4 nodes, 144 cores (144 tasks, 1 CPU per task), after half an
hour the simulation was still carrying out QM calculations:

TCL: Minimizing for 100 steps
Info: List of ranks running QM simulations: 2.

It is now running from scratch on a single node (36 cores, 110 GB memory,
24-hour limit). After 12 hours, the NAMD log output is as above.
Is there a way to verify the progress ORCA has made? If the allowed 24 hours
are not enough to reach step 1 of the QM calculation, or a minimum is not
reached, how do I restart the simulation?
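
For the restart question, a minimal sketch under the settings shown in the log
above (RESTART FILENAME PolyAla_out, RESTART FREQUENCY 100); the keywords are
standard NAMD, the file names come from the log, and the step number is
illustrative:

  # Hypothetical lines for a follow-up copy of namd_ORCA-01.conf:
  binCoordinates   PolyAla_out.restart.coor  ;# coordinates at the last checkpoint
  binVelocities    PolyAla_out.restart.vel   ;# if used, drop any 'temperature' line
  extendedSystem   PolyAla_out.restart.xsc   ;# restores the periodic cell
  firsttimestep    100                       ;# the step the checkpoint reached

As for watching ORCA itself: NAMD places the ORCA input and output for each QM
region in a numbered subdirectory of the QM base directory shown in the log;
on typical installations the running ORCA output is a file ending in .TmpOut
that can be tailed during the job, though this is worth verifying locally.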

Is the scalability of Example1 known? That is, how many tasks are best used?

Grateful for advice.
francesco pietra
