From: Thomas C. Bishop (bishop_at_tulane.edu)
Date: Thu Jul 23 2009 - 14:31:10 CDT
Ok, so I downloaded NAMD/FFTW/Tcl/Charm++ and did the compile thing for our GPU
system (2x Tesla C1060, which do show up in the NAMD output as binding).
The upshot: compiling as per the instructions is a no-brainer (if I can do it...).
The downside: the CUDA build runs slower than the build without it (hmm... maybe
I shouldn't be the one doing this).
Below are benchmark lines from the two systems.
The hardware is 16 cores per node with 2 GPUs. I'm running on 1 node to avoid
network issues.
THE NAMD LOGS HAVE THESE MESSAGES:
Bad result from CmiGetPesOnPhysicalNode!
pe 10 physnode rank 0 of 1 is 0
AND
Charm warning> Randomization of stack pointer is turned on in Kernel,
run 'echo 0 > /proc/sys/kernel/randomize_va_space' as root to disable it.
Thread migration may not work!
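Before changing anything as root, the current setting that the Charm warning refers to can be read back. A minimal sketch, assuming a Linux host; the helper name aslr_mode is made up for illustration, and the value meanings are the ones given in the kernel sysctl documentation:

```python
from pathlib import Path

# Kernel file named in the Charm warning above.
ASLR_FILE = "/proc/sys/kernel/randomize_va_space"

# Meanings of the values, per the Linux kernel sysctl documentation.
ASLR_MODES = {
    0: "disabled",
    1: "stack, mmap base and VDSO randomized",
    2: "as 1, plus heap randomization",
}

def aslr_mode(path=ASLR_FILE):
    """Return (value, description) for the randomization setting at `path`."""
    value = int(Path(path).read_text().strip())
    return value, ASLR_MODES.get(value, "unknown")

# Only query the live system if the file exists (i.e. on Linux).
if Path(ASLR_FILE).exists():
    value, meaning = aslr_mode()
    print(f"randomize_va_space = {value} ({meaning})")
```

This only reads the setting. Whether the warning has anything to do with the CUDA slowdown is not established by the logs above.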
Suggestions?
Tom
SYSTEM 1: 134335 ATOMS protein-DNA complex
***************************
cuda.16.out:Info: Benchmark time: 16 CPUs 0.23318 s/step 1.34942 days/ns 76.3906 MB memory
cuda.16.out:Info: Benchmark time: 16 CPUs 0.233269 s/step 1.34994 days/ns 76.4316 MB memory
nocuda.16.out:Info: Benchmark time: 16 CPUs 0.213155 s/step 1.23353 days/ns 82.3376 MB memory
nocuda.16.out:Info: Benchmark time: 16 CPUs 0.213293 s/step 1.23433 days/ns 82.5941 MB memory
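For reading these lines: days/ns is s/step multiplied by the number of steps per nanosecond, divided by the seconds in a day. The timestep is not stated in this message; back-calculating from the first line above gives about 2 fs, which is an inference from the numbers and not something the logs say. A small sketch of that arithmetic (function names are illustrative):

```python
SECONDS_PER_DAY = 86_400
FS_PER_NS = 1_000_000

def days_per_ns(s_per_step, timestep_fs):
    """Wall-clock days needed to simulate 1 ns at the given step cost."""
    steps_per_ns = FS_PER_NS / timestep_fs
    return s_per_step * steps_per_ns / SECONDS_PER_DAY

def implied_timestep_fs(s_per_step, reported_days_per_ns):
    """Back out the timestep (fs) that makes s/step and days/ns agree."""
    steps_per_ns = reported_days_per_ns * SECONDS_PER_DAY / s_per_step
    return FS_PER_NS / steps_per_ns

# First cuda.16.out line above: 0.23318 s/step, 1.34942 days/ns.
print(f"implied timestep: {implied_timestep_fs(0.23318, 1.34942):.3f} fs")
# Assuming a 2 fs timestep (inferred, not stated in the message):
print(f"days/ns at 2 fs:  {days_per_ns(0.23318, 2.0):.5f}")
```

Because days/ns depends on the timestep, s/step is the safer column for comparing runs that may not share one.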
SYSTEM 2: the APO system ~100,000 atoms
*******************
apo.cuda.16.out:Info: Benchmark time: 16 CPUs 0.178929 s/step 2.07094 days/ns 55.6983 MB memory
apo.cuda.16.out:Info: Benchmark time: 16 CPUs 0.178775 s/step 2.06915 days/ns 55.7847 MB memory
apo.nocuda.16.out:Info: Benchmark time: 16 CPUs 0.171169 s/step 1.98112 days/ns 61.6952 MB memory
apo.nocuda.16.out:Info: Benchmark time: 16 CPUs 0.17114 s/step 1.98078 days/ns 61.3251 MB memory
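To put a number on "slower", the benchmark lines can be parsed and the mean s/step of each CUDA log compared with its non-CUDA counterpart. A rough sketch: the regular expression is written against the eight lines quoted above (rejoined onto single lines), and the helper names are made up for illustration.

```python
import re
from statistics import mean

# The eight benchmark lines from this message, one per line.
LOG_LINES = """\
cuda.16.out:Info: Benchmark time: 16 CPUs 0.23318 s/step 1.34942 days/ns 76.3906 MB memory
cuda.16.out:Info: Benchmark time: 16 CPUs 0.233269 s/step 1.34994 days/ns 76.4316 MB memory
nocuda.16.out:Info: Benchmark time: 16 CPUs 0.213155 s/step 1.23353 days/ns 82.3376 MB memory
nocuda.16.out:Info: Benchmark time: 16 CPUs 0.213293 s/step 1.23433 days/ns 82.5941 MB memory
apo.cuda.16.out:Info: Benchmark time: 16 CPUs 0.178929 s/step 2.07094 days/ns 55.6983 MB memory
apo.cuda.16.out:Info: Benchmark time: 16 CPUs 0.178775 s/step 2.06915 days/ns 55.7847 MB memory
apo.nocuda.16.out:Info: Benchmark time: 16 CPUs 0.171169 s/step 1.98112 days/ns 61.6952 MB memory
apo.nocuda.16.out:Info: Benchmark time: 16 CPUs 0.17114 s/step 1.98078 days/ns 61.3251 MB memory
""".splitlines()

PATTERN = re.compile(
    r"^(?P<log>\S+?):Info: Benchmark time: (?P<cpus>\d+) CPUs "
    r"(?P<s_per_step>[\d.]+) s/step (?P<days_per_ns>[\d.]+) days/ns "
    r"(?P<mem_mb>[\d.]+) MB memory$"
)

def mean_step_times(lines):
    """Return {log file name: mean seconds per step} for matching lines."""
    samples = {}
    for line in lines:
        m = PATTERN.match(line.strip())
        if m:
            samples.setdefault(m["log"], []).append(float(m["s_per_step"]))
    return {log: mean(vals) for log, vals in samples.items()}

def slowdown_percent(cuda_s, nocuda_s):
    """How much slower (in %) the CUDA run is per step; negative means faster."""
    return (cuda_s / nocuda_s - 1.0) * 100.0

times = mean_step_times(LOG_LINES)
for label, cuda_log, cpu_log in [
    ("System 1", "cuda.16.out", "nocuda.16.out"),
    ("System 2", "apo.cuda.16.out", "apo.nocuda.16.out"),
]:
    pct = slowdown_percent(times[cuda_log], times[cpu_log])
    print(f"{label}: CUDA build is {pct:.1f}% slower per step")
```

On these numbers the CUDA build comes out roughly 9% slower per step for system 1 and roughly 4.5% slower for system 2.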
-- ********************** Thomas C. Bishop * Office: 504-862-3370 * Fax: 504-862-8392 * **********************