My benchmark results with TeslaC2050

From: Norman Geist (norman.geist_at_uni-greifswald.de)
Date: Fri May 20 2011 - 01:02:36 CDT

Hi NAMD users,
I just want to share my benchmark results with you, given the many
discussions about the performance of NAMD CUDA and the oversubscription of
GPUs. In my test case, oversubscribing a Tesla C2050 plugged into a full
PCIe 2.0 x16 slot with many processor cores seems to be possible, because of
the good memory connection and bandwidth. With a Tesla C2050, the CUDA code
of NAMD seems to triple the performance of the available processor cores.
My hardware:
Fujitsu Celsius M470-2 Workstation
4 GB DDR3 SDRAM
1x Xeon E5645 6-Core 2.4 GHz 12MB HT
2x Nvidia Tesla C2050 3GB GDDR5
(The Hyper-Threading cores didn't make any difference for me, so I don't
show them here.)
 
My system (AMBER; mostly the same performance as CHARMM):
Collagen triple helix with TIP3PBOX
Atoms: 278648  Bonds: 278419  Residues: 91737  Waters: 90780  Box: 505 80 80
 
My results:
CPU only:
Cores       1     2     3     4     5     6
Time/step   4.5   2.3   1.6   1.2   0.9   0.8
CPU+GPU:
Cores       1     2     3     4     5     6     |  1     2     3     4     5     6
GPUs        1     1     1     1     1     1     |  2     2     2     2     2     2
Time/step   1.0   0.7   0.5   0.4   0.3   0.26  |  -     0.6   0.5   0.4   0.3   0.26
 
NAMD scales very well here, deep respect to the coders ;)
As I interpret these results, because the 6:1 CPU:GPU ratio gives the same
time per step as the 6:2 ratio, there is still room for more cores per
Tesla C2050, since no bottleneck is visible yet. Also, a Tesla C2050 at
least triples the available CPU power as long as it does not run into a
bottleneck, which hasn't happened here yet.
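
As a quick check of this reading, here is a minimal Python sketch that computes the GPU speedup and the CPU-only parallel efficiency for each core count. The timings are copied from the results above; the helper names (gpu_speedup, parallel_efficiency, summarize) are only illustrative and are not part of NAMD.

```python
# Timings copied from the benchmark results (time per step; unit as reported).
CORES = [1, 2, 3, 4, 5, 6]
CPU_ONLY = [4.5, 2.3, 1.6, 1.2, 0.9, 0.8]
ONE_GPU = [1.0, 0.7, 0.5, 0.4, 0.3, 0.26]
TWO_GPUS = [None, 0.6, 0.5, 0.4, 0.3, 0.26]  # the 1-core, 2-GPU case was not run


def gpu_speedup(cpu_time, gpu_time):
    """Ratio of CPU-only time to CPU+GPU time at the same core count."""
    return cpu_time / gpu_time


def parallel_efficiency(t_one_core, t_n_cores, n_cores):
    """Fraction of ideal linear scaling reached on n_cores."""
    return t_one_core / (n_cores * t_n_cores)


def summarize():
    """Return one (cores, speedup_1gpu, speedup_2gpu, cpu_efficiency) tuple per core count."""
    rows = []
    for n, t_cpu, t_g1, t_g2 in zip(CORES, CPU_ONLY, ONE_GPU, TWO_GPUS):
        s1 = gpu_speedup(t_cpu, t_g1)
        s2 = gpu_speedup(t_cpu, t_g2) if t_g2 is not None else None
        eff = parallel_efficiency(CPU_ONLY[0], t_cpu, n)
        rows.append((n, s1, s2, eff))
    return rows


for n, s1, s2, eff in summarize():
    s2_text = "-" if s2 is None else f"{s2:.2f}"
    print(f"cores={n}  speedup(1 GPU)={s1:.2f}  speedup(2 GPUs)={s2_text}  "
          f"CPU efficiency={eff:.2f}")
```

With these timings, the ratio of CPU-only to single-GPU time lies between roughly 3.0 and 4.5, and the 1-GPU and 2-GPU times are identical from 3 cores upward.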
If someone has a different view on this, please feel free to tell me.
Best regards
Norman Geist.
