error while running namd on CRAY XC40 machine

From: Santosh Kumar Chaudhary (skc_at_physics.iisc.ernet.in)
Date: Sun Mar 22 2015 - 02:40:31 CDT

Dear All,

I have compiled NAMD 2.10 on a Cray XC40 machine using the following steps:

./build charm++ gni-crayxc smp -j16 --with-production

./config --charm-base ./charm-6.6.1 --charm-arch CRAY-XC-intel
./config CRAY-XC-intel --charm-base ./charm-6.6.1 --charm-arch ./
gni-crayxc-smp --with-cuda --with-tcl --with-fftw3

I also built Charm++ with CUDA, but after configuration, running make
terminated with error 1, so I removed CUDA from the build and compiled
again. I then tried to run a job on an NVIDIA Tesla K40 GPU accelerator
card using the following script:

#!/bin/sh
#PBS -N jobname
#PBS -l select=1:ncpus=1:accelerator=True:accelerator_model="Tesla_K40s"
#PBS -l walltime=24:00:00
#PBS -e error.log
#PBS -l place=scatter
#PBS -S /bin/sh -V
#PBS -j oe
. /opt/modules/default/init/sh
cd $PBS_O_WORKDIR
cd /home/phd/11/physkc/software/NAMD_2.10_Source/CRAY-XC-intel
aprun -n 1 -N 1 ./namd2 /mnt/lustre/phy2/physkc/namd_tttk/jobname.conf > jobname.out

I get an error message. The output file is as follows:

Charm++> Running on Gemini (GNI) with 1 processes
Charm++> static SMSG
Charm++> memory pool init block size: 8MB, total memory pool limit 0MB (0 means no limit)
Charm++> memory pool registered memory limit: 200000MB, send limit: 100000MB
Charm++> only comm thread send/recv messages
Charm++> Cray TLB page size: 8192K
Charm++> Running in SMP mode: numNodes 1, 1 worker threads per process
Charm++> The comm. thread both sends and receives messages
Charm++> Using recursive bisection (scheme 3) for topology aware partitions
Converse/Charm++ Commit ID: v6.6.1-rc1-1-gba7c3c3-namd-charm-6.6.1-build-2014-Dec-08-28969
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 1 unique compute nodes (24-way SMP).
Info: Built with CUDA version 5050
Did not find +devices i,j,k,... argument, using all
Pe 0 physical rank 0 binding to CUDA device 0 on physical node 0: 'Tesla K40s' Mem: 11519MB Rev: 3.5
Info: NAMD 2.10 for CRAY-XC-smp-CUDA
Info:
Info: Please visit http://www.ks.uiuc.edu/Research/namd/
Info: for updates, documentation, and support information.
Info:
Info: Please cite Phillips et al., J. Comp. Chem. 26:1781-1802 (2005)
Info: in all publications reporting results obtained with NAMD.
Info:
Info: Based on Charm++/Converse 60601 for gni-crayxc-smp
Info: Built Sat Mar 14 07:04:27 CDT 2015 by physkc on login2
Info: Running on 1 processors, 1 nodes, 1 physical nodes.
Info: CPU topology information available.
Info: Charm++/Converse parallel runtime startup completed at 0.104503 s
Info: 10.7148 MB of memory in use based on /proc/self/stat
Info: Configuration file is /mnt/lustre/phy2/physkc/namd_tttk/tk_ADP_TDP_gpu.conf
Info: Changed directory to /mnt/lustre/phy2/physkc/namd_tttk
TCL: Suspending until startup complete.
Info: EXTENDED SYSTEM FILE tk_ADP_TDP_water_eq3.xsc
Info: SIMULATION PARAMETERS:
Info: TIMESTEP 2
Info: NUMBER OF STEPS 0
Info: STEPS PER CYCLE 10
Info: PERIODIC CELL BASIS 1 91.6463 0 0
Info: PERIODIC CELL BASIS 2 0 90.0426 0
Info: PERIODIC CELL BASIS 3 0 0 83.8401
Info: PERIODIC CELL CENTER 0.147363 -0.141829 0.0225959
Info: WRAPPING WATERS AROUND PERIODIC BOUNDARIES ON OUTPUT.
Info: WRAPPING ALL CLUSTERS AROUND PERIODIC BOUNDARIES ON OUTPUT.
Info: LOAD BALANCER Centralized
Info: LOAD BALANCING STRATEGY New Load Balancers -- DEFAULT
Info: LDB PERIOD 2000 steps
Info: FIRST LDB TIMESTEP 50
Info: LAST LDB TIMESTEP -1
Info: LDB BACKGROUND SCALING 1
Info: HOM BACKGROUND SCALING 1
Info: PME BACKGROUND SCALING 1
Info: MIN ATOMS PER PATCH 40
Info: VELOCITY FILE tk_ADP_TDP_water_eq3.rst.vel
Info: CENTER OF MASS MOVING INITIALLY? NO
Info: DIELECTRIC 1
Info: EXCLUDE SCALED ONE-FOUR
Info: 1-4 ELECTROSTATICS SCALED BY 0.833333
Info: MODIFIED 1-4 VDW PARAMETERS WILL BE USED
Info: DCD FILENAME tk_ADP_TDP_water_gpu.dcd
Info: DCD FREQUENCY 500
Info: DCD FIRST STEP 500
Info: DCD FILE WILL CONTAIN UNIT CELL DATA
Info: XST FILENAME tk_ADP_TDP_water_gpu.xst
Info: XST FREQUENCY 500
Info: VELOCITY DCD FILENAME tk_ADP_TDP_water_gpu.vdcd
Info: VELOCITY DCD FREQUENCY 1000
Info: VELOCITY DCD FIRST STEP 1000
Info: NO FORCE DCD OUTPUT
Info: OUTPUT FILENAME tk_ADP_TDP_water_gpu
Info: RESTART FILENAME tk_ADP_TDP_water_gpu.rst
Info: RESTART FREQUENCY 500
Info: BINARY RESTART FILES WILL BE USED
Info: SWITCHING ACTIVE
Info: SWITCHING ON 10
Info: SWITCHING OFF 12
Info: PAIRLIST DISTANCE 14
Info: PAIRLIST SHRINK RATE 0.01
Info: PAIRLIST GROW RATE 0.01
Info: PAIRLIST TRIGGER 0.3
Info: PAIRLISTS PER CYCLE 2
Info: PAIRLIST OUTPUT STEPS 1000
Info: PAIRLISTS ENABLED
Info: MARGIN 1
Info: HYDROGEN GROUP CUTOFF 2.5
Info: PATCH DIMENSION 17.5
Info: ENERGY OUTPUT STEPS 100
Info: CROSSTERM ENERGY INCLUDED IN DIHEDRAL
Info: TIMING OUTPUT STEPS 1000
Info: PRESSURE OUTPUT STEPS 100
Info: LANGEVIN DYNAMICS ACTIVE
Info: LANGEVIN TEMPERATURE 338
Info: LANGEVIN USING BBK INTEGRATOR
Info: LANGEVIN DAMPING COEFFICIENT IS 5 INVERSE PS
Info: LANGEVIN DYNAMICS APPLIED TO HYDROGENS
Info: LANGEVIN PISTON PRESSURE CONTROL ACTIVE
Info: TARGET PRESSURE IS 1.01325 BAR
Info: OSCILLATION PERIOD IS 100 FS
Info: DECAY TIME IS 50 FS
Info: PISTON TEMPERATURE IS 338 K
Info: PRESSURE CONTROL IS GROUP-BASED
Info: INITIAL STRAIN RATE IS -4.17824e-05 -4.17824e-05 -4.17824e-05
Info: CELL FLUCTUATION IS ISOTROPIC
Info: PARTICLE MESH EWALD (PME) ACTIVE
Info: PME TOLERANCE 1e-06
Info: PME EWALD COEFFICIENT 0.257952
Info: PME INTERPOLATION ORDER 4
Info: PME GRID DIMENSIONS 125 125 125
Info: PME MAXIMUM GRID SPACING 1.5
Info: Attempting to read FFTW data from system
Info: Attempting to read FFTW data from FFTW_NAMD_2.10_CRAY-XC-smp-CUDA_FFTW3.txt
Info: Optimizing 6 FFT steps. 1..._pmiu_daemon(SIGCHLD): [NID 00076] [c0-0c1s3n0] [Sun Mar 22 02:43:34 2015] PE RANK 0 exit signal Illegal instruction
Application 320566 exit codes: 132
Application 320566 resources: utime ~0s, stime ~0s, Rss ~14508, inblocks ~12413, outblocks ~28863
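
A note on the exit code above: 132 appears consistent with the "Illegal instruction" message, assuming aprun follows the common shell convention of reporting 128 plus the signal number (SIGILL is signal 4 on Linux). The sketch below decodes such a code; the function name signal_from_exit_code is hypothetical and is not part of NAMD, Charm++, or aprun.

```shell
#!/bin/sh
# Sketch: map an exit status above 128 to a signal name,
# assuming the usual "128 + signal number" convention.
# signal_from_exit_code is a hypothetical helper name.
signal_from_exit_code() {
    code=$1
    if [ "$code" -gt 128 ]; then
        # kill -l <number> prints the name of that signal
        kill -l "$((code - 128))"
    else
        echo "not a signal exit: $code"
    fi
}

# Exit code reported for Application 320566 in the log above
signal_from_exit_code 132
```

This only confirms which signal ended the run; it does not by itself show which instruction was rejected or why.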

Please help me solve this issue.
(I have also compiled the program without CUDA, and that build runs fine.)

Regards,
Santosh Chaudhary

