From: Amodeo, Pietro (pamodeo_at_icb.cnr.it)
Date: Fri Nov 26 2010 - 11:21:22 CST
Hello,
I'm trying to run a simulation with NAMD 2.7 CUDA x86_64 on a system
of 120978 atoms (a large membrane protein complex, with lipids,
water and ions), on a 24-core workstation with two CUDA devices (one
Tesla C2050, Mem: 2687 MB, and one GeForce GTX 480, Mem: 1535 MB).
Independently of the number of cores (from 1 to 4 for 1 GPU, from 2
to 4 for 2 GPUs) and/or GPUs used, and of the settings of many
energy-related parameters (cutoffs, exclusions, ...), the calculation
aborts with the following error message:
------------- Processor 0 Exiting: Called CmiAbort ------------
Reason: FATAL ERROR: 10304 bytes of CUDA constant memory needed for
exclusions, but only 8192 bytes available. Increase MAX_EXCLUSIONS.
------------- Processor 1 Exiting: Called CmiAbort ------------
Reason: FATAL ERROR: CUDA error memcpy to exclusions: invalid argument
Fatal error on PE 1> FATAL ERROR: CUDA error memcpy to exclusions:
invalid argument
While the number of Processors obviously varies with the number of
cores used, the numbers of required and available bytes (the latter,
I guess, hardwired in the code) are invariant under every
CPU/GPU/input parameter setting I have tried.
The same molecular system runs flawlessly, on up to 24 cores, with
the CPU-only version of NAMD 2.7 on the same machine.
At the end of this message I have copied a representative output,
obtained with 2 CPUs and 2 GPUs.
Of course, I can provide any further information or run any tests
that may be useful for resolving the problem.
Thank you in advance for any help or suggestion.
Regards,
Pietro Amodeo
Dr. Pietro Amodeo
Istituto di Chimica Biomolecolare (ICB) del CNR
Comprensorio "A. Olivetti", Edificio 70
Via Campi Flegrei 34
I-80078 Pozzuoli (Napoli) - Italy
Email    pamodeo_at_icmib.na.cnr.it
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Charm++: scheduler running in netpoll mode.
Charm++> Running on 1 unique compute nodes (24-way SMP).
Charm++> Cpu topology info:
PE to node map: 0 0
Node to PE map:
Chip #0: 0 1
Charm++> cpu topology info is gathered in 0.007 seconds.
Info: NAMD 2.7 for Linux-x86_64-CUDA
Info:
Info: Please visit http://www.ks.uiuc.edu/Research/namd/
Info: and send feedback or bug reports to namd_at_ks.uiuc.edu
Info:
Info: Please cite Phillips et al., J. Comp. Chem. 26:1781-1802 (2005)
Info: in all publications reporting results obtained with NAMD.
Info:
Info: Based on Charm++/Converse 60202 for net-linux-x86_64-iccstatic
Info: Built Wed Oct 13 11:39:40 CDT 2010 by jim on belfast.ks.uiuc.edu
Info: 1 NAMD 2.7 Linux-x86_64-CUDA 2 ulisse.icmib.na.cnr.it piero
Info: Running on 2 processors.
Info: CPU topology information available.
Info: Charm++/Converse parallel runtime startup completed at 0.0095129 s
Did not find +devices i,j,k,... argument, using all
Pe 0 physical rank 0 binding to CUDA device 0 on ulisse.icmib.na.cnr.it: 'GeForce GTX 480'  Mem: 1535MB  Rev: 2.0
Pe 1 physical rank 1 binding to CUDA device 1 on ulisse.icmib.na.cnr.it: 'Tesla C2050'  Mem: 2687MB  Rev: 2.0
Info: 1.632 MB of memory in use based on CmiMemoryUsage
Info: Changed directory to /home/Mol/System_COMPLEX/testCUDA
Info: Configuration file is System_COMPLEX_POPC_membr_nresPOPC_md5ps_npgt_unr.namd
TCL: Suspending until startup complete.
Warning: The following variables were set in the
Warning: configuration file but were not needed
Warning:    fixedAtomsForces
Warning:    fixedAtomsFile
Warning:    fixedAtomsCol
Warning:    consref
Warning:    conskfile
Warning:    conskcol
Warning:    constraintScaling
Warning:    selectConstraints
Warning:    selectConstrX
Warning:    selectConstrY
Warning:    selectConstrZ
Info: EXTENDED SYSTEM FILE   /home/Mol/System_COMPLEX/System_COMPLEX_POPC_membr_nresPOPC_md100ps_npt_unr.xsc
Warning: ALWAYS USE NON-ZERO MARGIN WITH CONSTANT PRESSURE!
Warning: CHANGING MARGIN FROM 0 to 0.81
Info: SIMULATION PARAMETERS:
Info: TIMESTEP               1
Info: NUMBER OF STEPS        0
Info: STEPS PER CYCLE        20
Info: PERIODIC CELL BASIS 1  88.4383 0 0
Info: PERIODIC CELL BASIS 2  0 102.279 0
Info: PERIODIC CELL BASIS 3  0 0 130.301
Info: PERIODIC CELL CENTER   -2.268 -0.160501 7.716
Info: WRAPPING ALL CLUSTERS AROUND PERIODIC BOUNDARIES ON OUTPUT.
Info: WRAPPING TO IMAGE NEAREST TO PERIODIC CELL CENTER.
Info: LOAD BALANCER          Centralized
Info: LOAD BALANCING STRATEGY  New Load Balancers -- ASB
Info: LDB PERIOD             4000 steps
Info: FIRST LDB TIMESTEP     100
Info: LAST LDB TIMESTEP      -1
Info: LDB BACKGROUND SCALING 1
Info: HOM BACKGROUND SCALING 1
Info: PME BACKGROUND SCALING 1
Info: MAX SELF PARTITIONS    1
Info: MAX PAIR PARTITIONS    1
Info: SELF PARTITION ATOMS   154
Info: SELF2 PARTITION ATOMS  154
Info: PAIR PARTITION ATOMS   318
Info: PAIR2 PARTITION ATOMS  637
Info: MIN ATOMS PER PATCH    100
Info: VELOCITY FILE          /home/Mol/System_COMPLEX/System_COMPLEX_POPC_membr_nresPOPC_md100ps_npt_unr.vel
Info: CENTER OF MASS MOVING INITIALLY? NO
Info: DIELECTRIC             1
Info: EXCLUDE                SCALED ONE-FOUR
Info: 1-4 ELECTROSTATICS SCALED BY 1
Info: MODIFIED 1-4 VDW PARAMETERS WILL BE USED
Info: DCD FILENAME           /home/Mol/System_COMPLEX/testCUDA/System_COMPLEX_POPC_membr_nresPOPC_md5ns_npgt_unr.dcd
Info: DCD FREQUENCY          1000
Info: DCD FIRST STEP         1000
Info: DCD FILE WILL CONTAIN UNIT CELL DATA
Info: XST FILENAME           /home/Mol/System_COMPLEX/testCUDA/System_COMPLEX_POPC_membr_nresPOPC_md5ns_npgt_unr.xst
Info: XST FREQUENCY          1000
Info: NO VELOCITY DCD OUTPUT
Info: OUTPUT FILENAME        System_COMPLEX_POPC_membr_nresPOPC_md5ns_npgt_unr
Info: RESTART FILENAME       System_COMPLEX_POPC_membr_nresPOPC_md5ns_npgt_unr.restart
Info: RESTART FREQUENCY      1000
Info: SWITCHING ACTIVE
Info: SWITCHING ON           8
Info: SWITCHING OFF          9
Info: PAIRLIST DISTANCE      11
Info: PAIRLIST SHRINK RATE   0.01
Info: PAIRLIST GROW RATE     0.01
Info: PAIRLIST TRIGGER       0.3
Info: PAIRLISTS PER CYCLE    2
Info: PAIRLISTS ENABLED
Info: MARGIN                 0.81
Info: HYDROGEN GROUP CUTOFF  2.5
Info: PATCH DIMENSION        14.31
Info: ENERGY OUTPUT STEPS    1000
Info: CROSSTERM ENERGY INCLUDED IN DIHEDRAL
Info: TIMING OUTPUT STEPS    1000
Info: LANGEVIN DYNAMICS ACTIVE
Info: LANGEVIN TEMPERATURE   310
Info: LANGEVIN DAMPING COEFFICIENT IS 1 INVERSE PS
Info: LANGEVIN DYNAMICS NOT APPLIED TO HYDROGENS
Info: LANGEVIN PISTON PRESSURE CONTROL ACTIVE
Info:        TARGET PRESSURE IS 1.01325 BAR
Info:     OSCILLATION PERIOD IS 1000 FS
Info:             DECAY TIME IS 500 FS
Info:     PISTON TEMPERATURE IS 310 K
Info:       PRESSURE CONTROL IS GROUP-BASED
Info:    INITIAL STRAIN RATE IS -1.11083e-06 -2.85184e-06 -3.99729e-06
Info:       CELL FLUCTUATION IS ANISOTROPIC
Info: SURFACE TENSION CONTROL ACTIVE
Info:       TARGET SURFACE TENSION IS 10 DYN/CM
Info: PARTICLE MESH EWALD (PME) ACTIVE
Info: PME TOLERANCE               1e-06
Info: PME EWALD COEFFICIENT       0.348832
Info: PME INTERPOLATION ORDER     4
Info: PME GRID DIMENSIONS         96 108 135
Info: PME MAXIMUM GRID SPACING    1.5
Info: Attempting to read FFTW data from FFTW_NAMD_2.7_Linux-x86_64-CUDA.txt
Info: Optimizing 6 FFT steps. 1... 2... 3... 4... 5... 6...   Done.
Info: Writing FFTW data to FFTW_NAMD_2.7_Linux-x86_64-CUDA.txt
Info: FULL ELECTROSTATIC EVALUATION FREQUENCY      4
Info: USING VERLET I (r-RESPA) MTS SCHEME.
Info: C1 SPLITTING OF LONG RANGE ELECTROSTATICS
Info: PLACING ATOMS IN PATCHES BY HYDROGEN GROUPS
Info: RIGID BONDS TO HYDROGEN : ALL
Info:         ERROR TOLERANCE : 1e-08
Info:          MAX ITERATIONS : 100
Info: RIGID WATER USING SETTLE ALGORITHM
Info: NONBONDED FORCES EVALUATED EVERY 2 STEPS
Info: RANDOM NUMBER SEED     12345
Info: USE HYDROGEN BONDS?    NO
Info: COORDINATE PDB         /home/Mol/System_COMPLEX/System_COMPLEX_POPC_membr_nresPOPC_md100ps_npt_unr.coor
Info: STRUCTURE FILE         /home/Mol/System_COMPLEX/System_COMPLEX_POPC_membr_nresPOPC.psf
Info: PARAMETER file: CHARMM format!
Info: PARAMETERS             /home/Mol/System_COMPLEX/par_all27_prot_lipid_na.inp
Info: USING ARITHMETIC MEAN TO COMBINE L-J SIGMA PARAMETERS
Warning: DUPLICATE ANGLE ENTRY FOR CPH1-NR1-CPH2
PREVIOUS VALUES  k=130  theta0=107.5  k_ub=0  r_ub=0
   USING VALUES  k=130  theta0=107    k_ub=0  r_ub=0
Info: SUMMARY OF PARAMETERS:
Info: 299 BONDS
Info: 729 ANGLES
Info: 1145 DIHEDRAL
Info: 84 IMPROPER
Info: 0 CROSSTERM
Info: 161 VDW
Info: 0 VDW_PAIRS
Info: TIME FOR READING PSF FILE: 1.99579
Info: TIME FOR READING PDB FILE: 0.193374
Info:
Info: ****************************
Info: STRUCTURE SUMMARY:
Info: 120977 ATOMS
Info: 94395 BONDS
Info: 104455 ANGLES
Info: 110798 DIHEDRALS
Info: 2962 IMPROPERS
Info: 0 CROSSTERMS
Info: 0 EXCLUSIONS
Info: 102969 RIGID BONDS
Info: 259962 DEGREES OF FREEDOM
Info: 44449 HYDROGEN GROUPS
Info: 4 ATOMS IN LARGEST HYDROGEN GROUP
Info: 44449 MIGRATION GROUPS
Info: 4 ATOMS IN LARGEST MIGRATION GROUP
Info: TOTAL MASS = 739544 amu
Info: TOTAL CHARGE = 3.46228e-05 e
Info: MASS DENSITY = 1.04195 g/cm^3
Info: ATOM DENSITY = 0.102643 atoms/A^3
Info: *****************************
Info:
Info: Entering startup at 27.2138 s, 35.1788 MB of memory in use
Info: Startup phase 0 took 8.29697e-05 s, 35.1795 MB of memory in use
Info: Startup phase 1 took 0.558542 s, 60.6464 MB of memory in use
Info: Startup phase 2 took 0.000669003 s, 61.5756 MB of memory in use
Info: PATCH GRID IS 6 (PERIODIC) BY 7 (PERIODIC) BY 9 (PERIODIC)
Info: PATCH GRID IS 1-AWAY BY 1-AWAY BY 1-AWAY
Info: REMOVING COM VELOCITY -0.025689 -0.0247133 0.000427051
Info: LARGEST PATCH (214) HAS 367 ATOMS
Info: Startup phase 3 took 0.223135 s, 80.8164 MB of memory in use
Info: PME using 2 and 2 processors for FFT and reciprocal sum.
Info: PME GRID LOCATIONS: 0 1
Info: PME TRANS LOCATIONS: 0 1
Info: Optimizing 4 FFT steps. 1... 2... 3... 4...   Done.
This archive was generated by hypermail 2.1.6 : Wed Feb 29 2012 - 15:54:48 CST