From: Dow_Hurst (dhurst_at_mindspring.com)
Date: Wed May 13 2009 - 11:19:48 CDT
Axel is correct about the differences in Nvidia hardware. We just had a talk here at UNC-Greensboro by Dr. Lars Nyland, a senior architect in the compute group at Nvidia. He works on designing the computational circuitry on the Nvidia graphics cards. Nvidia gives this group a very small "acreage" of the chip area for "computational" circuits. It was a fascinating talk that brought out some points I wasn't aware of.
The newer hardware has what you want for computational routines to excel. Older hardware, such as the 8000 series and earlier, doesn't have all the logic circuits that speed up computations.
The design group gets 5% of the graphics chip for computation-related hardware, so they have to make good use of the "nanoacres"!
There is dedicated circuitry that implements an inverse square root, since 80% of typical CPU time is spent on that type of operation. (The designers think in terms of how many ticks of a CPU or GPU clock an operation requires to complete.)
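To illustrate why the inverse square root shows up so often in this kind of code, here is a minimal sketch of a pairwise Coulomb energy sum. It is not taken from NAMD or from the talk; the function name `coulomb_energy`, the plain-list data layout, and the arbitrary units are assumptions made for the example. The point is only that every pair of atoms costs one 1/sqrt(r^2) evaluation in the inner loop.

```python
import math


def coulomb_energy(positions, charges):
    """Sum q_i * q_j / r_ij over all unique pairs (arbitrary units).

    Hypothetical helper for illustration; real MD codes add cutoffs,
    periodic boundaries, and PME on top of this basic loop.
    """
    energy = 0.0
    n = len(positions)
    for i in range(n):
        xi, yi, zi = positions[i]
        for j in range(i + 1, n):
            xj, yj, zj = positions[j]
            # Squared distance is cheap: three multiplies and two adds.
            r2 = (xi - xj) ** 2 + (yi - yj) ** 2 + (zi - zj) ** 2
            # The inverse square root is evaluated once per pair.
            inv_r = 1.0 / math.sqrt(r2)
            energy += charges[i] * charges[j] * inv_r
    return energy


if __name__ == "__main__":
    # Two charges separated by a distance of 5 (a 3-4-5 triangle).
    print(coulomb_energy([(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)], [1.0, 2.0]))
```

With N atoms the inner statement runs N*(N-1)/2 times, which is why shaving clock ticks off that single operation pays off.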
The Tesla-based C1060 and S1070 computational cards have ECC RAM instead of the non-error-correcting RAM found in the gaming cards. Long simulations might suffer from bit errors, which don't matter in gaming routines but are critical in computational chemistry.
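As a rough illustration of why a single bit error matters in a numerical simulation, the sketch below flips one bit in the IEEE 754 double-precision representation of a value. The helper name `flip_bit` is made up for this example, and it only simulates a memory error in software; it says nothing about how often such errors occur on real hardware.

```python
import struct


def flip_bit(value, bit):
    """Return `value` with one bit of its 64-bit IEEE 754 encoding flipped.

    Bit 0 is the lowest mantissa bit, bits 52-62 are the exponent,
    and bit 63 is the sign.
    """
    (raw,) = struct.unpack("<Q", struct.pack("<d", value))
    (flipped,) = struct.unpack("<d", struct.pack("<Q", raw ^ (1 << bit)))
    return flipped


if __name__ == "__main__":
    x = 1.0
    for bit in (0, 52, 62, 63):
        print(bit, flip_bit(x, bit))
```

A flip in a low mantissa bit changes the value by a tiny amount, while a flip in an exponent or sign bit changes it drastically. In a pixel color that is a one-frame glitch; in a coordinate or velocity it propagates through every later time step.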
>From: Axel Kohlmeyer <akohlmey_at_cmm.chem.upenn.edu>
>Sent: May 12, 2009 7:24 AM
>To: Roman Petrenko <rpetrenko_at_gmail.com>
>Cc: NAMD list <namd-l_at_ks.uiuc.edu>
>Subject: Re: namd-l: namd-cuda-intel vs. namd-intel
>On Tue, 2009-05-12 at 01:32 -0400, Roman Petrenko wrote:
>> Dear developers,
>> we compared simulations of intel compiled namd2.7b1 programs with cuda
>> disabled and enabled options. NVT simulation of 30-residue peptide in
>> water box with PME and SMD was used. The observed speedup of namd with
>> GPU is just 4 times. Is it due to incompleteness of cuda-namd project or
>> we did something wrong?
>the one thing that you did wrong for certain is not to provide any
>information about what hardware you are running on, what
>compilers/flags/libs you are using and most importantly access to your
>input, so that somebody can validate it. in general, it would be
>preferred to use one of the example inputs provided on the namd
>homepage, which people may already have some reference data for.
>there are CUDA capable GPUs out there, e.g. GeForce 8400 GS, that
>have very little speedup to offer compared to a GeForce GTX 285
>or a Tesla C1060.
>Axel Kohlmeyer akohlmey_at_cmm.chem.upenn.edu http://www.cmm.upenn.edu
> Center for Molecular Modeling -- University of Pennsylvania
>Department of Chemistry, 231 S.34th Street, Philadelphia, PA 19104-6323
>tel: 1-215-898-1582, fax: 1-215-573-6233, office-tel: 1-215-898-5425
>If you make something idiot-proof, the universe creates a better idiot.