Re: Technical specifications for the V100 and GTX gpu cards

From: Souvik Sinha (souvik.sinha893_at_gmail.com)
Date: Fri Mar 20 2020 - 01:18:32 CDT

Sorry for the late reply.

Your response is really helpful. I have checked the mailing list and am
trying to settle on a suitable combination, 'politically' as well as
'economically'.

Thanks.

On Tue, 17 Mar 2020, 17:03 Axel Kohlmeyer, <akohlmey_at_gmail.com> wrote:

>
>
> On Tue, Mar 17, 2020 at 6:11 AM Souvik Sinha <souvik.sinha893_at_gmail.com>
> wrote:
>
>> Hi,
>> I am unsure how to choose a GPU card configuration that is compatible
>> with the NAMD application.
>>
>> Some server models (e.g. DELL) specifically require at least two CPUs
>> for V100 GP-GPUs to be functional. I would like to know whether this
>> limitation also applies if a GTX1080 card is used instead of a V100. In
>> other words, is it a general requirement or specific to the build of a card?
>>
>
> it is more of a "political" choice. vendors, especially upper tier
> vendors, will not support consumer grade graphics hardware for computing
> purposes or outside of hardware designated for desktop use. if they did
> so, they would be subject to sanctions from the graphics hardware vendor
> (e.g. loss of a "partner" status, which translates into all kinds of
> competitive disadvantages). also, vendors like nvidia have been
> systematically and deliberately limiting access to "enterprise features" in
> software and driver support for consumer grade hardware (and in some cases
> also the other way around, but that is outside the scope of this
> discussion).
>
> there are two typical major differences between consumer grade hardware
> (GeForce) and computing grade hardware (Tesla). 1) Consumer grade GPU chips
> are often just a mid-level version of the GPU chips used in Tesla (or
> high-end Quadro) devices and thus have far fewer double precision
> capable floating-point units. 2) consumer grade GPU cards often have less
> RAM and do not use/support ECC style RAM. Also, enterprise
> level hardware is more narrowly selected and more thoroughly tested.
>
> neither of these will keep you from using consumer grade GPUs for NAMD,
> since it employs GPU compute kernels that rely heavily on
> single-precision floating point math and because in most cases non-ECC RAM
> will work just fine (people often run Tesla GPUs with the ECC function
> disabled to boost performance and gain 20% more RAM). however, without
> ECC there is no way of telling when a memory error has occurred, and you
> run a higher risk of getting subtly different results because of a
> weakness in some RAM cell resulting in occasional bit flips. The same
> applies to main memory as well, but since the memory is pushed closer to
> its limits in high-end GPUs aimed at graphics for games, the probability
> is higher. for games it doesn't matter much if a random single pixel has
> a slightly wrong color, whereas for the physics of a simulation it matters
> much more if numbers sometimes change randomly.
>
> given the large difference in prices, the risk-reward assessment is rather
> difficult. consumer grade GPUs have very attractive pricing compared to
> enterprise grade GPUs and provide a very high amount of computational
> power for simulations with NAMD. In recent years (until the situation
> became more competitive due to AMD's latest enterprise CPU
> generations), the situation for CPUs has not been so different:
> especially Gold and Platinum type Xeon CPUs have become extremely expensive
> compared to their consumer grade counterparts.
>
>> One more thing: if a workstation is planned with one Intel Xeon Gold 5218
>> and one GTX1080 or RTX2080, will it be fine to run NAMD?
>>
>
> you can run NAMD on pretty much any combination of nvidia GPUs and x86
> hardware. however, there are many factors impacting performance and the
> risk of the hardware not being capable of handling continuous high
> computational load. for many people a high-end gaming machine can be just
> as potent and suitable as a significantly higher priced workstation with
> enterprise grade hardware. in many cases the choice of which hardware to
> pick is not so much governed by the price-performance numbers, but rather
> by the lack of staffing to handle the increased workload from using
> consumer grade hardware on a large scale. for a single machine, the
> difference is often negligible, but once you have to operate 10s or 100s
> of consumer grade machines for sustained computing, the cost of the
> hardware is of lesser importance (it still hurts to have to spend so much
> more for rather little gain in absolute performance), because the more
> hardware you have, the more the additional effort to manage consumer grade
> hardware for HPC use will manifest itself.
> This is not limited to GPUs, but applies to all kinds of computing
> hardware.
>
> HTH,
> Axel.
>
> p.s.: you should dig through the archives of this mailing list and you'll
> find many discussions about how to run NAMD efficiently on all kinds of
> GPU/CPU hardware choices. however, the choice of how high a risk you are
> willing to take is yours. as mentioned above, the rewards can be quite high,
> but you will then usually have to resolve technical issues yourself if you
> operate hardware combinations not sanctioned/supported by a vendor.
>
>
>
>>
>> Thank you.
>>
>> --
>> Souvik Sinha
>> Research Fellow
>> Division of Bioinformatics
>> Bose Institute, Kolkata
>>
>> Contact: 033 25693275
>>
>
>
> --
> Dr. Axel Kohlmeyer akohlmey_at_gmail.com http://goo.gl/1wk0
> College of Science & Technology, Temple University, Philadelphia PA, USA
> International Centre for Theoretical Physics, Trieste. Italy.
>
