Re: 2.7b1 + CUDA + STMV

From: Peter Freddolino (petefred_at_ks.uiuc.edu)
Date: Tue Apr 21 2009 - 09:41:56 CDT

Hi Charles,
as noted in the release notes and discussed before on this list, the
CUDA version of NAMD does not yet properly handle 1-4 exclusions. It
would not surprise me at all if this makes the virial wonky enough to
cause a crash in this system. Running at constant volume or making the
Langevin piston period longer may artificially stabilize it, but please
remember this doesn't help for anything but benchmarks (or systems that
don't need 1-4 exclusions) until the CUDA version is feature-complete.
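
Roughly, that would look something like the sketch below in the
simulation config. The keywords are standard NAMD pressure-control
options, but the numbers are only illustrative, not the values shipped
with the STMV benchmark files:

    # option 1: constant volume -- turn the Langevin piston barostat off
    langevinPiston        off

    # option 2: keep constant pressure, but give the piston a much longer
    # period so the cell responds more slowly to the (unreliable) pressure
    # langevinPiston        on
    # langevinPistonTarget  1.01325
    # langevinPistonPeriod  2000.0
    # langevinPistonDecay   1000.0
    # langevinPistonTemp    298

Turning the piston off removes the barostat's dependence on the virial
altogether; lengthening the period only damps how quickly the cell
reacts to it.
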
Best,
Peter

Charles Taylor wrote:
>
> I've built several variants of NAMD 2.7b1. If I run the STMV
> benchmark case (downloaded from the NAMD web site) using a generic
> multicore-linux64/Linux-x86_64 (charm/namd) executable on N processors,
> it seems to work as expected. However, if I try the same STMV
> benchmark case with *any* CUDA-enabled executable (single-processor,
> multicore, MPI), the simulation errors out with the following:
>
>
> Pe 0 has 2197 local and 0 remote patches and 59319 local and 0 remote computes.
> allocating 598 MB of memory on GPU
> CUDA EVENT TIMING: 0 6.988960 0.004640 0.004608 1034.456055 4.339712 1045.793945
> CUDA TIMING: 2264.392138 ms/step on node 0
>
> ETITLE: TS BOND ANGLE DIHED IMPRP ELECT VDW BOUNDARY MISC KINETIC TOTAL TEMP POTENTIAL TOTAL3 TEMPAVG PRESSURE GPRESSURE VOLUME PRESSAVG GPRESSAVG
> ENERGY: 0 354072.1600 280839.0161 81957.9556 4995.4407 -4503168.0834 384266.4616 0.0000 0.0000 947315.0098 -2449722.0396 297.9549 -3397037.0494 -2377914.1292 297.9549 2686.8307 -19381.8928 10194598.5131 2686.8307 -19381.8928
>
> FATAL ERROR: Periodic cell has become too small for original patch grid!
> Possible solutions are to restart from a recent checkpoint,
> increase margin, or disable useFlexibleCell for liquid simulation.
> ------------- Processor 0 Exiting: Called CmiAbort ------------
> Reason: FATAL ERROR: Periodic cell has become too small for original patch grid!
> Possible solutions are to restart from a recent checkpoint,
> increase margin, or disable useFlexibleCell for liquid simulation.
>
> I have tried increasing "margin", and "useFlexibleCell" is already set
> to "no".
>
> This may just be a reflection of the maturity of the CUDA-enabled code,
> but I found references to CUDA-accelerated STMV runs
> (http://www.ks.uiuc.edu/Research/gpu/files/nvision2008compbio_stone.pdf),
> so I thought I'd ask if there is something special that needs to be done
> to get the STMV benchmark to work with CUDA support.
>
> Note that we are running Tesla 1070s w/ CUDA 2.1...
>
> Device 0: "Tesla T10 Processor"
> Major revision number: 1
> Minor revision number: 3
> Total amount of global memory: 4294705152 bytes
> Number of multiprocessors: 30
> Number of cores: 240
> Total amount of constant memory: 65536 bytes
> Total amount of shared memory per block: 16384 bytes
> Total number of registers available per block: 16384
> Warp size: 32
> Maximum number of threads per block: 512
> Maximum sizes of each dimension of a block: 512 x 512 x 64
> Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
> Maximum memory pitch: 262144 bytes
> Texture alignment: 256 bytes
> Clock rate: 1.30 GHz
> Concurrent copy and execution: Yes
>
>
> Charlie Taylor
> UF HPC Center
>
>
