Is there a problem running ORCA with NAMD MPI?

From: Francesco Pietra (chiendarret_at_gmail.com)
Date: Tue Nov 20 2018 - 03:35:17 CST

Hi,
While running the Example1 QM/MM tutorial, I wonder whether there is a problem
on my cluster with ORCA being run from the MPI build of NAMD. On one node
(36 tasks, 1 CPU per task), the run failed to proceed beyond

TCL: Minimizing for 100 steps
> Info: List of ranks running QM simulations: 2
>

so I am now trying four nodes (144 tasks, 1 CPU per task), with little hope,
given the small size of Example1. After 3 hours, the QM calculation is still
running. The log file is below. I hope to get some advice on what I am unable
to detect.
Francesco Pietra

Charm++> Running on MPI version: 3.0
Charm++> level of thread support used: MPI_THREAD_SINGLE (desired:
MPI_THREAD_SINGLE)
Charm++> Running in non-SMP mode: numPes 144
Charm++> Using recursive bisection (scheme 3) for topology aware partitions
Converse/Charm++ Commit ID:
v6.7.1-0-gbdf6a1b-namd-charm-6.7.1-build-2016-Nov-07-136676
Warning> Randomization of stack pointer is turned on in kernel, thread
migration may not work! Run 'echo 0 > /proc/sys/kernel/randomize_va_space'
as root to disable it, or try run with '+isomalloc_sync'.
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 4 unique compute nodes (36-way SMP).
Charm++> cpu topology info is gathered in 0.042 seconds.
Info: NAMD 2.12 for Linux-x86_64-MPI
Info:
Info: Please visit http://www.ks.uiuc.edu/Research/namd/
Info: for updates, documentation, and support information.
Info:
Info: Please cite Phillips et al., J. Comp. Chem. 26:1781-1802 (2005)
Info: in all publications reporting results obtained with NAMD.
Info:
Info: Based on Charm++/Converse 60701 for mpi-linux-x86_64
Info: Built mar 7 mar 2017, 17.38.45, CET by propro01 on node165
Info: 1 NAMD 2.12 Linux-x86_64-MPI 144 node419 fpietra0
Info: Running on 144 processors, 144 nodes, 4 physical nodes.
Info: CPU topology information available.
Info: Charm++/Converse parallel runtime startup completed at 0.229772 s
Info: 695.176 MB of memory in use based on /proc/self/stat
Info: Configuration file is namd_ORCA-01.conf
Info: Working in the current directory
/gpfs/scratch/userexternal/fpietra0/QM-MM/NAMD_Example1_ORCA_24h_4nodes
TCL: Suspending until startup complete.
Info: SIMULATION PARAMETERS:
Info: TIMESTEP 0.5
Info: NUMBER OF STEPS 0
Info: STEPS PER CYCLE 1
Info: PERIODIC CELL BASIS 1 29 0 0
Info: PERIODIC CELL BASIS 2 0 34 0
Info: PERIODIC CELL BASIS 3 0 0 28
Info: PERIODIC CELL CENTER -0.021 0.008 0.108
Info: WRAPPING WATERS AROUND PERIODIC BOUNDARIES ON OUTPUT.
Info: WRAPPING ALL CLUSTERS AROUND PERIODIC BOUNDARIES ON OUTPUT.
Info: LOAD BALANCER Centralized
Info: LOAD BALANCING STRATEGY New Load Balancers -- DEFAULT
Info: LDB PERIOD 200 steps
Info: FIRST LDB TIMESTEP 5
Info: LAST LDB TIMESTEP -1
Info: LDB BACKGROUND SCALING 1
Info: HOM BACKGROUND SCALING 1
Info: PME BACKGROUND SCALING 1
Info: REMOVING LOAD FROM NODE 0
Info: REMOVING PATCHES FROM PROCESSOR 0
Info: MIN ATOMS PER PATCH 40
Info: INITIAL TEMPERATURE 300
Info: CENTER OF MASS MOVING INITIALLY? NO
Info: DIELECTRIC 1
Info: EXCLUDE SCALED ONE-FOUR
Info: 1-4 ELECTROSTATICS SCALED BY 1
Info: MODIFIED 1-4 VDW PARAMETERS WILL BE USED
Info: DCD FILENAME PolyAla_out.dcd
Info: DCD FREQUENCY 1
Info: DCD FIRST STEP 1
Info: DCD FILE WILL CONTAIN UNIT CELL DATA
Info: XST FILENAME PolyAla_out.xst
Info: XST FREQUENCY 1
Info: NO VELOCITY DCD OUTPUT
Info: NO FORCE DCD OUTPUT
Info: OUTPUT FILENAME PolyAla_out
Info: RESTART FILENAME PolyAla_out.restart
Info: RESTART FREQUENCY 100
Info: BINARY RESTART FILES WILL BE USED
Info: SWITCHING ACTIVE
Info: SWITCHING ON 10
Info: SWITCHING OFF 12
Info: PAIRLIST DISTANCE 14
Info: PAIRLIST SHRINK RATE 0.01
Info: PAIRLIST GROW RATE 0.01
Info: PAIRLIST TRIGGER 0.3
Info: PAIRLISTS PER CYCLE 2
Info: PAIRLISTS ENABLED
Info: MARGIN 0.495
Info: HYDROGEN GROUP CUTOFF 2.5
Info: PATCH DIMENSION 16.995
Info: CROSSTERM ENERGY INCLUDED IN DIHEDRAL
Info: TIMING OUTPUT STEPS 1
Info: PRESSURE OUTPUT STEPS 1
Info: QM FORCES ACTIVE
Info: QM PDB PARAMETER FILE: PolyAla-qm.pdb
Info: QM SOFTWARE: orca
Info: QM ATOM CHARGES FROM QM SOFTWARE: MULLIKEN
Info: QM EXECUTABLE PATH:
/cineca/prod/opt/applications/orca/4.0.1/binary/bin/orca
Info: QM COLUMN: beta
Info: QM BOND COLUMN: occ
Info: QM WILL DETECT BONDS BETWEEN QM AND MM ATOMS.
Info: QM-MM BOND SCHEME: Charge Shift.
Info: QM BASE DIRECTORY:
/gpfs/scratch/userexternal/fpietra0/QM-MM/NAMD_Example1_ORCA_24h_100GB_1node
Info: QM CONFIG LINE: ! B3LYP 6-31G Grid4 PAL4 EnGrad TightSCF
Info: QM CONFIG LINE: %%output PrintLevel Mini Print[ P_Mulliken ] 1
Print[P_AtCharges_M] 1 end
Info: QM POINT CHARGES WILL BE SELECTED EVERY 1 STEPS.
Info: QM Point Charge Switching: ON.
Info: QM Point Charge SCHEME: none.
Info: QM executions per node: 1
Info: LANGEVIN DYNAMICS ACTIVE
Info: LANGEVIN TEMPERATURE 300
Info: LANGEVIN USING BBK INTEGRATOR
Info: LANGEVIN DAMPING COEFFICIENT IS 50 INVERSE PS
Info: LANGEVIN DYNAMICS APPLIED TO HYDROGENS
Info: LANGEVIN PISTON PRESSURE CONTROL ACTIVE
Info: TARGET PRESSURE IS 1.01325 BAR
Info: OSCILLATION PERIOD IS 200 FS
Info: DECAY TIME IS 100 FS
Info: PISTON TEMPERATURE IS 300 K
Info: PRESSURE CONTROL IS GROUP-BASED
Info: INITIAL STRAIN RATE IS 0 0 0
Info: CELL FLUCTUATION IS ISOTROPIC
Info: PARTICLE MESH EWALD (PME) ACTIVE
Info: PME TOLERANCE 1e-06
Info: PME EWALD COEFFICIENT 0.257952
Info: PME INTERPOLATION ORDER 4
Info: PME GRID DIMENSIONS 32 36 28
Info: PME MAXIMUM GRID SPACING 1
Info: Attempting to read FFTW data from system
Info: Attempting to read FFTW data from
FFTW_NAMD_2.12_Linux-x86_64-MPI_FFTW3.txt
Info: Optimizing 6 FFT steps. 1... 2... 3... 4... 5... 6... Done.
Info: Writing FFTW data to FFTW_NAMD_2.12_Linux-x86_64-MPI_FFTW3.txt
Info: FULL ELECTROSTATIC EVALUATION FREQUENCY 1
Info: USING VERLET I (r-RESPA) MTS SCHEME.
Info: C1 SPLITTING OF LONG RANGE ELECTROSTATICS
Info: PLACING ATOMS IN PATCHES BY HYDROGEN GROUPS
Info: RANDOM NUMBER SEED 7910881
Info: USE HYDROGEN BONDS? NO
Info: COORDINATE PDB PolyAla.pdb
Info: STRUCTURE FILE PolyAla.psf
Info: PARAMETER file: CHARMM format!
Info: PARAMETERS CHARMpars/toppar_all36_carb_glycopeptide.str
Info: PARAMETERS CHARMpars/toppar_water_ions_namd.str
Info: PARAMETERS CHARMpars/toppar_all36_na_nad_ppi_gdp_gtp.str
Info: PARAMETERS CHARMpars/par_all36_carb.prm
Info: PARAMETERS CHARMpars/par_all36_cgenff.prm
Info: PARAMETERS CHARMpars/par_all36_lipid.prm
Info: PARAMETERS CHARMpars/par_all36_na.prm
Info: PARAMETERS CHARMpars/par_all36_prot.prm
Info: USING ARITHMETIC MEAN TO COMBINE L-J SIGMA PARAMETERS
Info: SKIPPING rtf SECTION IN STREAM FILE
Info: SKIPPING rtf SECTION IN STREAM FILE
Info: SKIPPING rtf SECTION IN STREAM FILE
Info: SUMMARY OF PARAMETERS:
Info: 937 BONDS
Info: 2734 ANGLES
Info: 6671 DIHEDRAL
Info: 203 IMPROPER
Info: 6 CROSSTERM
Info: 357 VDW
Info: 6 VDW_PAIRS
Info: 0 NBTHOLE_PAIRS
Info: TIME FOR READING PSF FILE: 0.0370231
Info: Reading pdb file PolyAla.pdb
Info: TIME FOR READING PDB FILE: 0.034543
Info:
Info: Using the following PDB file for QM parameters: PolyAla-qm.pdb
Info: Number of QM atoms (excluding Dummy atoms): 20
Info: We found 2 QM-MM bonds.
Info: Applying user defined multiplicity 1 to QM group ID 1
Info: 1) Group ID: 1 ; Group size: 20 atoms ; Total charge: 0
Info: MM-QM pair: 24:30 -> Value (distance or ratio): 1.09 (QM Group 0 ID 1)
Info: MM-QM pair: 50:44 -> Value (distance or ratio): 1.09 (QM Group 0 ID 1)
Info: ****************************
Info: STRUCTURE SUMMARY:
Info: 2279 ATOMS
Info: 1546 BONDS
Info: 879 ANGLES
Info: 199 DIHEDRALS
Info: 15 IMPROPERS
Info: 6 CROSSTERMS
Info: 0 EXCLUSIONS
Info: 6837 DEGREES OF FREEDOM
Info: 773 HYDROGEN GROUPS
Info: 4 ATOMS IN LARGEST HYDROGEN GROUP
Info: 773 MIGRATION GROUPS
Info: 4 ATOMS IN LARGEST MIGRATION GROUP
Info: TOTAL MASS = 13773.9 amu
Info: TOTAL CHARGE = 2.98023e-08 e
Info: MASS DENSITY = 0.82848 g/cm^3
Info: ATOM DENSITY = 0.0825485 atoms/A^3
Info: *****************************
Info:
Info: Entering startup at 0.70037 s, 799.754 MB of memory in use
Info: Startup phase 0 took 0.00322795 s, 799.754 MB of memory in use
Info: The QM region will remove 19 bonds, 31 angles, 37 dihedrals, 3
impropers and 1 crossterms.
Info: ADDED 2624 IMPLICIT EXCLUSIONS
Info: Startup phase 1 took 0.709255 s, 799.887 MB of memory in use
Info: NONBONDED TABLE R-SQUARED SPACING: 0.0625
Info: NONBONDED TABLE SIZE: 769 POINTS
Info: INCONSISTENCY IN FAST TABLE ENERGY VS FORCE: 0.000325096 AT 11.9556
Info: INCONSISTENCY IN SCOR TABLE ENERGY VS FORCE: 0.000324844 AT 11.9556
Info: ABSOLUTE IMPRECISION IN VDWA TABLE ENERGY: 4.59334e-32 AT 11.9974
Info: RELATIVE IMPRECISION IN VDWA TABLE ENERGY: 7.4108e-17 AT 11.9974
Info: INCONSISTENCY IN VDWA TABLE ENERGY VS FORCE: 0.0040507 AT 0.251946
Info: ABSOLUTE IMPRECISION IN VDWB TABLE ENERGY: 1.53481e-26 AT 11.9974
Info: RELATIVE IMPRECISION IN VDWB TABLE ENERGY: 7.96691e-18 AT 11.9974
Info: INCONSISTENCY IN VDWB TABLE ENERGY VS FORCE: 0.00150189 AT 0.251946
Info: Startup phase 2 took 0.0194581 s, 804.121 MB of memory in use
Info: Startup phase 3 took 0.000361919 s, 804.121 MB of memory in use
Info: Startup phase 4 took 0.00718594 s, 804.121 MB of memory in use
Info: Startup phase 5 took 0.000344038 s, 804.121 MB of memory in use
Info: PATCH GRID IS 3 (PERIODIC) BY 4 (PERIODIC) BY 3 (PERIODIC)
Info: PATCH GRID IS 2-AWAY BY 2-AWAY BY 2-AWAY
Info: REMOVING COM VELOCITY -0.188499 0.149382 0.0208025
Info: LARGEST PATCH (17) HAS 78 ATOMS
Info: TORUS A SIZE 144 USING 0 36 72 108
Info: TORUS B SIZE 1 USING 0
Info: TORUS C SIZE 1 USING 0
Info: TORUS MINIMAL MESH SIZE IS 109 BY 1 BY 1
Info: Placed 100% of base nodes on same physical node as patch
Info: Startup phase 6 took 0.0212991 s, 805.082 MB of memory in use
Info: PME using 16 and 18 processors for FFT and reciprocal sum.
Info: PME GRID LOCATIONS: 7 15 23 31 43 51 59 67 79 87 ...
Info: PME TRANS LOCATIONS: 11 19 27 35 39 47 55 63 71 83 ...
Info: PME USING 16 GRID NODES AND 18 TRANS NODES
Info: Startup phase 7 took 0.113867 s, 805.75 MB of memory in use
Info: Startup phase 8 took 0.00489211 s, 805.75 MB of memory in use
LDB: Central LB being created...
Info: Startup phase 9 took 0.0102289 s, 805.75 MB of memory in use
Info: CREATING 2736 COMPUTE OBJECTS
Info: Startup phase 10 took 0.0117202 s, 805.75 MB of memory in use
Info: useSync: 1 useProxySync: 0
Info: Building spanning tree ... send: 1 recv: 0 with branch factor 4
Info: Startup phase 11 took 0.00923896 s, 805.75 MB of memory in use
Info: Startup phase 12 took 0.000352859 s, 805.75 MB of memory in use
Info: Finished startup at 1.6118 s, 805.75 MB of memory in use

TCL: Minimizing for 100 steps
Info: List of ranks running QM simulations: 2.
...............................
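
For checking a log like the one above, a minimal Python sketch could pull out the working directory, the QM base directory, the ORCA executable path and the PAL value from the ORCA input line (PALn asks ORCA for n parallel processes), so that they can be compared side by side. This is not part of NAMD or ORCA: the function name parse_qm_info and the dictionary keys are made up for illustration, and the sketch assumes that a value wrapped onto the next line, as in this archived mail, belongs to the preceding "Info:" line.

```python
import re

# Log prefixes of interest, mapped to hypothetical key names (illustration only).
PREFIXES = {
    "Info: Working in the current directory": "workdir",
    "Info: QM BASE DIRECTORY:": "qm_basedir",
    "Info: QM EXECUTABLE PATH:": "qm_exec",
}


def parse_qm_info(log_text):
    """Return a dict with the QM-related paths and the ORCA PAL value, if present."""
    lines = [ln.strip() for ln in log_text.splitlines()]
    found = {}
    for i, line in enumerate(lines):
        for prefix, key in PREFIXES.items():
            if line.startswith(prefix):
                value = line[len(prefix):].strip()
                # Assumption: an empty value means the path was wrapped
                # onto the following line by the mail client.
                if not value and i + 1 < len(lines):
                    value = lines[i + 1]
                found[key] = value
        # PALn on the ORCA simple-input line sets the number of ORCA processes.
        match = re.search(r"QM CONFIG LINE:.*\bPAL(\d+)\b", line)
        if match:
            found["orca_pal"] = int(match.group(1))
    return found


if __name__ == "__main__":
    import sys

    # Usage: python check_qm_log.py namd.log  (the script name is arbitrary)
    with open(sys.argv[1]) as handle:
        info = parse_qm_info(handle.read())
    for key, value in info.items():
        print(f"{key}: {value}")
    if "workdir" in info and "qm_basedir" in info:
        same = info["workdir"] == info["qm_basedir"]
        print("QM base directory equals working directory:", same)
```

Applied to the log above, such a script would show the working directory and the QM base directory next to each other, together with the number of processes requested by PAL. Whether any difference between them matters for the stalled run is a separate question that the sketch does not answer.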
