Re: Running NAMD on Forge (CUDA)

From: Aron Broom (broomsday_at_gmail.com)
Date: Thu Jul 12 2012 - 14:28:33 CDT

Have you tried the multicore build? I wonder if the prebuilt SMP one is
just not working for you.
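
With the multicore-CUDA build there is no charmrun step: you launch
namd2 directly on the node. A minimal sketch, assuming the binary is on
your PATH and a config file named sim.conf (both placeholders):

  namd2 +p12 +idlepoll +devices 0,1,2,3,4,5 sim.conf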

On Thu, Jul 12, 2012 at 3:21 PM, Gianluca Interlandi <gianluca_at_u.washington.edu> wrote:

>> are other people also using those GPUs?
>
> I don't think so since I reserved the entire node.
>
>> What are the benchmark timings that you are given after ~1000 steps?
>
> The benchmark time with 6 processes is 101 sec for 1000 steps. This is
> only slightly faster than Trestles, where I get 109 sec for 1000 steps
> running on 16 CPUs. So yes, 6 GPUs on Forge are much faster than 6
> cores on Trestles, but in terms of SUs it makes no difference, since on
> Forge I still have to reserve the entire node (16 cores).
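> (In SUs the two come out roughly even: 16 cores x 101 sec on Forge
> versus 16 cores x 109 sec on Trestles per 1000 benchmark steps.)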
>
> Gianluca
>
>
>> ... is some setup time.
>>
>> I often run a system of ~100,000 atoms, and I generally see an order
>> of magnitude improvement in speed compared to the same number of cores
>> without the GPUs. I would test the non-CUDA precompiled code on your
>> Forge system and see how that compares; it might be the fault of
>> something other than CUDA.
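>>
>> For instance (a sketch; binary path and config name are placeholders),
>> the same benchmark with the non-CUDA multicore build would just be:
>>
>> namd2 +p16 sim.conf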
>>
>> ~Aron
>>
>> On Thu, Jul 12, 2012 at 2:41 PM, Gianluca Interlandi
>> <gianluca_at_u.washington.edu> wrote:
>> Hi Aron,
>>
>> Thanks for the explanations. I don't know whether I'm doing everything
>> right. I don't see any speed advantage running on the CUDA cluster
>> (Forge) versus running on a non-CUDA cluster.
>>
>> I did the following benchmarks on Forge (the system has 127,000 atoms
>> and ran for 1000 steps):
>>
>> np 1: 506 sec
>> np 2: 281 sec
>> np 4: 163 sec
>> np 6: 136 sec
>> np 12: 218 sec
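>>
>> (Relative to np 1 that is about a 3.7x speedup at np 6; np 12 is
>> slower than np 6, presumably because 12 processes share the 6 GPUs.)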
>>
>> On the other hand, running the same system on 16 cores of Trestles
>> (AMD Magny Cours) takes 129 sec. It seems that I'm not really making
>> good use of SUs by running on the CUDA cluster. Or maybe I'm doing
>> something wrong? I'm using the ibverbs-smp-CUDA pre-compiled version
>> of NAMD 2.9.
>>
>> Thanks,
>>
>> Gianluca
>>
>> On Tue, 10 Jul 2012, Aron Broom wrote:
>>
>> if it is truly just one node, you can use the multicore-CUDA version
>> and avoid the MPI charmrun stuff. Still, it boils down to much the
>> same thing I think. If you do what you've done below, you are running
>> one job with 12 CPU cores and all the GPUs. If you don't specify
>> +devices, NAMD will automatically find the available GPUs, so I think
>> the main benefit of specifying them is when you are running more than
>> one job and don't want the jobs sharing GPUs.
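>>
>> With the multicore build, for example, something like this (a sketch;
>> the config file names are placeholders) would keep two jobs on their
>> own GPUs:
>>
>> namd2 +p6 +idlepoll +devices 0,1,2 job1.conf
>> namd2 +p6 +idlepoll +devices 3,4,5 job2.conf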
>>
>> I'm not sure you'll see great scaling across 6 GPUs for a single job,
>> but that would be great if you did.
>>
>> ~Aron
>>
>> On Tue, Jul 10, 2012 at 1:14 PM, Gianluca Interlandi
>> <gianluca_at_u.washington.edu> wrote:
>> Hi,
>>
>> I have a question concerning running NAMD on a CUDA cluster.
>>
>> NCSA Forge has, for example, 6 CUDA devices and 16 CPU cores per
>> node. If I want to use all 6 CUDA devices in a node, how many
>> processes is it recommended to spawn? Do I need to specify "+devices"?
>>
>> So, if for example I want to spawn 12 processes, do I need to
>> specify:
>>
>> charmrun +p12 -machinefile $PBS_NODEFILE namd2 +idlepoll +devices 0,1,2,3,4,5
>>
>> Thanks,
>>
>> Gianluca
>>
>> -----------------------------------------------------
>> Gianluca Interlandi, PhD gianluca_at_u.washington.edu
>> +1 (206) 685 4435
>>
>> http://artemide.bioeng.washington.edu/
>>
>> Research Scientist at the Department of Bioengineering
>> at the University of Washington, Seattle WA U.S.A.
>> -----------------------------------------------------
>>
>>
>>
>>
>> --
>> Aron Broom M.Sc
>> PhD Student
>> Department of Chemistry
>> University of Waterloo
>>
>>
>>
>>
>> -----------------------------------------------------
>> Gianluca Interlandi, PhD gianluca_at_u.washington.edu
>> +1 (206) 685 4435
>> http://artemide.bioeng.washington.edu/
>>
>> Research Scientist at the Department of Bioengineering
>> at the University of Washington, Seattle WA U.S.A.
>> -----------------------------------------------------
>>
>>
>>
>>
>> --
>> Aron Broom M.Sc
>> PhD Student
>> Department of Chemistry
>> University of Waterloo
>>
>>
>>
>>
> -----------------------------------------------------
> Gianluca Interlandi, PhD gianluca_at_u.washington.edu
> +1 (206) 685 4435
> http://artemide.bioeng.washington.edu/
>
> Research Scientist at the Department of Bioengineering
> at the University of Washington, Seattle WA U.S.A.
> -----------------------------------------------------
>

-- 
Aron Broom M.Sc
PhD Student
Department of Chemistry
University of Waterloo
