Re: Performance problem about NAMD-CUDA benchmarks

From: Biff Forbush (biff.forbush_at_yale.edu)
Date: Mon Apr 12 2010 - 14:42:37 CDT

Hi Liuxg,

The numbers are about 30% different for STMV. Things that could be
making the difference:

(A) Tesla vs. GTX295: I can't comment.
(B) You didn't mention your CPU speed.
(C) You may be better off turning hyperthreading off in the BIOS.
    Offhand, I don't remember whether this mattered with +p8, but my
    data certainly show that going higher than +p8 with hyperthreading
    mostly gives lower performance with 8 GPU cores (my fig. 8).

Did you see my benchmark details?

(D) +setcpuaffinity helps some.
(E) outputenergies is set higher than the number of steps (500), so the
    benchmark is faster (frequent outputenergies slows down GPU
    benchmarks a lot).
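
As a rough illustration of point (E), the sketch below reads numsteps and
outputEnergies from the text of a NAMD configuration file and reports how
many energy outputs would occur after the first step. The helper names are
made up for this example, and it only handles plain "keyword value" lines
with literal integers (no Tcl variables, no "run" commands), so treat it as
a quick sanity check rather than a real config parser.

```python
def read_int_option(config_text, keyword):
    """Return the last literal integer set for `keyword`, or None if absent."""
    value = None
    for line in config_text.splitlines():
        # Drop trailing '#' comments, then split into whitespace-separated tokens.
        tokens = line.split("#", 1)[0].split()
        if len(tokens) >= 2 and tokens[0].lower() == keyword.lower():
            try:
                value = int(tokens[1])
            except ValueError:
                # Value is not a literal integer (e.g. a Tcl variable); skip it.
                continue
    return value


def energy_outputs_after_start(config_text):
    """Count energy outputs after step 0, or None if either option is missing."""
    num_steps = read_int_option(config_text, "numsteps")
    every = read_int_option(config_text, "outputEnergies")
    if num_steps is None or every is None or every <= 0:
        return None
    return num_steps // every


# Hypothetical config fragments, written only for this example.
quiet = "numsteps 500\noutputEnergies 1000\n"
chatty = "numsteps 500\noutputEnergies 100\n"
print(energy_outputs_after_start(quiet))
print(energy_outputs_after_start(chatty))
```

A result of 0 means energies are not written again during the timed steps,
which is the situation described in (E).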

NBFIX errors: You can fix these by commenting out the NBFIX lines
in the xplor file; ApoA1 and ATPase will then work. See my comments on this.
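
A minimal sketch of that workaround, in case it helps: the function below
prefixes every NBFIX statement in a parameter file's text with a comment
marker. It assumes that "!" starts a comment in the X-PLOR parameter
format, that each NBFIX statement sits on a single line, and that any
first token beginning with "NBFI" (in any case) is an NBFIX statement. The
function name and the sample lines, including their numbers, are invented
for illustration and are not taken from the benchmark files.

```python
def comment_out_nbfix(text, comment_prefix="! "):
    """Return `text` with every line whose first token starts with NBFI commented out."""
    out = []
    for line in text.splitlines(keepends=True):
        tokens = line.split()
        if tokens and tokens[0].upper().startswith("NBFI"):
            # Assumption: a leading '!' turns the whole line into a comment.
            out.append(comment_prefix + line)
        else:
            out.append(line)
    return "".join(out)


# Made-up sample lines; the values are placeholders, not real parameters.
sample = (
    "NONBonded  HA  0.1  2.0  0.1  2.0\n"
    "NBFIx  SOD  OC  1.0  2.0  1.0  2.0\n"
)
print(comment_out_nbfix(sample), end="")
```

It is safer to write the result to a copy of the parameter file and point
the benchmark configuration at the copy, so the original stays untouched.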

Regards,
Biff

On 4/11/2010 8:00 AM, xiaoguang liu wrote:
> Dear Prof. Biff Forbush,
> I am sorry that I left a word out of the motherboard's name;
> it is Tyan FT72 B7015.
>
> Liuxg
>
> 2010/4/11 xiaoguang liu <liuxguang_at_gmail.com>
>
> Hi, Prof. Biff Forbush (and anybody else interested),
>
> After reading your introduction to NAMD-CUDA benchmarks on the
> NAMD mailing list
> (http://www.ks.uiuc.edu/Research/namd/mailing_list/namd-l/11440.html),
> I am also running some benchmarks on my GPU server this week.
>
> My server's hardware is:
>   Motherboard: FT72 B7015, which has 8 PCI-E x16 slots and
>     onboard VGA for the console
>   CPU: dual Intel Xeon 5520, each with 4 cores (8 cores in
>     total) and Hyper-Threading support (so the operating system
>     sees 16 cores)
>   Memory: 8 GB (DDR3, 1333 MHz)
>   Disk: 1.5 TB
>   GPU: eight Nvidia Tesla C1060 (each has 240 streaming
>     processor cores and 4 GB of memory, and uses the G200 core,
>     the same as your GTX295)
>     http://www.nvidia.com/object/product_tesla_c1060_us.html
> Software:
>   NAMD: version 2.7b2 Linux-x86_64-CUDA
>     <http://www.ks.uiuc.edu/Development/Download/download.cgi?UserID=&AccessCode=&ArchiveID=1009>
>   CUDA: version 2.3
>   Nvidia driver: 195.36.15
>   OS: Red Hat AS 5.3, kernel 2.6.18
>
>
> To compare with your results, I used the same benchmarks:
> DHFR (23,558 atoms), er-gre (36,753 atoms), ApoA1 (92,224 atoms),
> F1ATPase (327,506 atoms), and STMV (1,066,628 atoms). They were all
> downloaded from NAMD's website
> (http://www.ks.uiuc.edu/Research/namd/utilities/ and
> http://www.ks.uiuc.edu/Research/namd/tutorial/NCSA2001/performance.html).
> For all problems, I used 8 processes and all 8 GPUs, for example:
> ./charmrun ++local +p8 ./namd2 +idlepoll ../er-gre/er-gre.namd ++verbose
>
> When running ApoA1 and F1ATPase, it reported the following error
> message:
> Fatal error on PE 2> FATAL ERROR: CUDA-accelerated NAMD does
> not support NBFIX terms in parameter file
>
> For the other three problems, the results are:
>
> Problem    DHFR         er-gre      STMV
> Size       23,558       36,753      1,066,628
> s/step     0.0581146    0.105704    0.588051
> days/ns    0.336311     1.22343     6.80614
>
> These results are much poorer than yours.
> As my hardware is almost the same as yours, why is the performance
> poorer?
> Am I missing something important?
>
> Many thanks!
>
> Liuxg
>
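
For anyone comparing numbers across runs, the days/ns figures in the quoted
table follow from the s/step timings once a timestep is fixed. The sketch
below shows the arithmetic. The timesteps are not stated anywhere in this
thread: 2 fs for DHFR and 1 fs for er-gre and STMV are assumptions, chosen
only because they reproduce the reported days/ns values. The function name
is likewise made up for this example.

```python
SECONDS_PER_DAY = 86_400
FS_PER_NS = 1_000_000


def days_per_ns(seconds_per_step, timestep_fs):
    """Convert wall-clock seconds per step into days per simulated nanosecond."""
    steps_per_ns = FS_PER_NS / timestep_fs
    return seconds_per_step * steps_per_ns / SECONDS_PER_DAY


# (name, s/step from the quoted table, ASSUMED timestep in fs)
runs = [
    ("DHFR", 0.0581146, 2.0),
    ("er-gre", 0.105704, 1.0),
    ("STMV", 0.588051, 1.0),
]
for name, s_per_step, dt_fs in runs:
    print(f"{name}: {days_per_ns(s_per_step, dt_fs):.4f} days/ns")
```

Because the timestep enters the conversion directly, s/step is the fairer
quantity to compare between two machines unless both runs are known to use
the same timestep.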
