Performance problem with NAMD-CUDA benchmarks

From: xiaoguang liu (
Date: Sun Apr 11 2010 - 06:53:16 CDT

Hi Prof. Biff Forbush (and anybody else interested),

     After reading your introduction to NAMD-CUDA benchmarks on the NAMD mailing
list (, I
ran some benchmarks of my own on my GPU server this week.

     My server's hardware is:
          Motherboard: FT72 B7015, which has 8 PCI-E x16 slots and onboard VGA
for the console
          CPU: dual Intel Xeon 5520, each with 4 cores (8 cores in total)
and Hyper-Threading support (so the operating system sees 16 logical cores)
          Memory: 8 GB (DDR3, 1333 MHz)
          Disk: 1.5 TB
          GPU: eight Nvidia Tesla C1060 (each has 240 streaming
processor cores and 4 GB of memory, and uses the GT200 core, the same as your GTX 295)
          NAMD: version 2.7b2
          CUDA: version 2.3
          Nvidia driver: 195.36.15
          OS: Red Hat AS 5.3, kernel 2.6.18

    To compare with your results, I used the same benchmarks: DHFR (23,558
atoms), er-gre (36,753 atoms), ApoA1 (92,224 atoms), F1ATPase (327,506 atoms), and
STMV (1,066,628 atoms). They were all downloaded from NAMD's website ( and
    For all problems I used 8 processes and all 8 GPUs, for example:
    ./charmrun ++local +p8 ./namd2 +idlepoll ../er-gre/er-gre.namd ++verbose
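
    Each run of this command prints NAMD's log to standard output, and the s/step and days/ns figures are taken from its benchmark lines. Below is a minimal Python sketch for collecting them from a saved log. It assumes those lines contain the text "Benchmark time:" followed by values labelled "s/step" and "days/ns"; the function name is hypothetical.

```python
import re

# Assumed line shape: "... Benchmark time: ... <number> s/step <number> days/ns ..."
NUMBER = r"(\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)"
BENCH_RE = re.compile(
    r"Benchmark time:.*?" + NUMBER + r"\s+s/step\s+" + NUMBER + r"\s+days/ns"
)


def parse_benchmark_lines(log_text):
    """Return a list of (s_per_step, days_per_ns) tuples found in the log text."""
    return [
        (float(match.group(1)), float(match.group(2)))
        for match in BENCH_RE.finditer(log_text)
    ]
```

    If the output was redirected to a file, the text can be read with open(...).read() and passed to parse_benchmark_lines; averaging the returned tuples gives one figure per benchmark.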

    When running ApoA1 and F1ATPase, NAMD reported error messages such as:
    Fatal error on PE 2> FATAL ERROR: CUDA-accelerated NAMD does not support
NBFIX terms in parameter file
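
    The message points at an NBFIX section in a parameter file used by these two benchmarks. As a quick way to confirm which file is responsible, here is a minimal Python sketch. It assumes that the keyword NBFIX appears as the first token on a line, that "!" starts a comment, and that lines beginning with "*" are title lines, as in CHARMM-format files; the function names are hypothetical.

```python
def has_nbfix_section(lines):
    """Return True if any non-comment line starts with the NBFIX keyword."""
    for raw in lines:
        # Drop a trailing "!" comment, then surrounding whitespace.
        line = raw.split("!", 1)[0].strip()
        if not line or line.startswith("*"):
            continue  # blank line or title line
        if line.split()[0].upper() == "NBFIX":
            return True
    return False


def file_has_nbfix(path):
    """Scan one parameter file on disk (path is supplied by the caller)."""
    with open(path, errors="replace") as handle:
        return has_nbfix_section(handle)
```

    Calling file_has_nbfix on each parameter file named in the benchmark's configuration shows which one triggers the error.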

     For the other three problems, the results are:

  Problem    DHFR        er-gre      STMV
  Atoms      23,558      36,753      1,066,628
  s/step     0.0581146   0.105704    0.588051
  days/ns    0.336311    1.22343     6.80614
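
    The s/step and days/ns rows are related by a fixed conversion, which helps when comparing against numbers posted by others. Below is a minimal Python sketch. The timestep values are an assumption inferred from the figures above (2 fs reproduces the DHFR row, 1 fs the er-gre and STMV rows); they are not stated in this post.

```python
SECONDS_PER_DAY = 86400.0
FS_PER_NS = 1.0e6


def days_per_ns(sec_per_step, timestep_fs):
    """Convert wall-clock seconds per MD step into days per simulated nanosecond."""
    steps_per_ns = FS_PER_NS / timestep_fs
    return sec_per_step * steps_per_ns / SECONDS_PER_DAY


# (s/step from the table, assumed timestep in fs)
results = {
    "DHFR": (0.0581146, 2.0),
    "er-gre": (0.105704, 1.0),
    "STMV": (0.588051, 1.0),
}

for name, (sec_per_step, timestep_fs) in results.items():
    print(f"{name:8s} {days_per_ns(sec_per_step, timestep_fs):.5f} days/ns")
```

    Because the assumed timesteps differ, the days/ns values of DHFR and the other two systems are not directly comparable; s/step is the safer figure to compare across runs.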

    These results are much poorer than yours.
    As my hardware is almost the same as yours, why is the performance so much
poorer?
    Am I missing something important?

Many thanks!

