From: Gianluca Interlandi (gianluca_at_u.washington.edu)
Date: Thu Jul 12 2012 - 14:21:38 CDT
> are other people also using those GPUs?
I don't think so, since I reserved the entire node.
> What are the benchmark timings that you are given after ~1000 steps?
The benchmark time with 6 processes is 101 sec for 1000 steps. This is
only slightly faster than Trestles, where I get 109 sec for 1000 steps
running on 16 CPUs. So, yes, 6 GPUs on Forge are much faster than 6 cores
on Trestles, but in terms of SUs it makes no difference, since on
Forge I still have to reserve the entire node (16 cores).
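
To put rough numbers on that: assuming SUs are charged per reserved core
(so a Forge job bills all 16 cores of the node no matter how many I
actually use), 1000 steps cost about 16 x 101 = 1616 core-seconds on
Forge versus 16 x 109 = 1744 core-seconds on Trestles, i.e. the GPU run
is only about 7% cheaper per step.
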
Gianluca
> is some setup time.
>
> I often run a system of ~100,000 atoms, and I generally see an order of magnitude
> improvement in speed compared to the same number of cores without the GPUs. I would
> test the non-CUDA precompiled code on your Forge system and see how that compares; it
> might be the fault of something other than CUDA.
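>
> For that comparison, something along these lines should work with the plain
> (non-CUDA) ibverbs build; this is only a rough sketch, I have not tried it
> on Forge, and the config file name is just a placeholder:
>
>   charmrun +p16 -machinefile $PBS_NODEFILE namd2 +idlepoll your_system.namd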
>
> ~Aron
>
> On Thu, Jul 12, 2012 at 2:41 PM, Gianluca Interlandi <gianluca_at_u.washington.edu> wrote:
> Hi Aron,
>
> Thanks for the explanations. I don't know whether I'm doing everything
> right. I don't see any speed advantage running on the CUDA cluster
> (Forge) versus running on a non-CUDA cluster.
>
> I did the following benchmarks on Forge (the system has 127,000 atoms and
> ran for 1000 steps):
>
> np 1: 506 sec
> np 2: 281 sec
> np 4: 163 sec
> np 6: 136 sec
> np 12: 218 sec
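>
> (Relative to a single process that is roughly a 1.8x speedup at np 2,
> 3.1x at np 4 and 3.7x at np 6, and it gets worse again at np 12,
> presumably because 12 processes then share the 6 GPUs.)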
>
> On the other hand, running the same system on 16 cores of Trestles (AMD
> Magny-Cours) takes 129 sec. It seems that I'm not really making good use
> of SUs by running on the CUDA cluster. Or maybe I'm doing something
> wrong? I'm using the ibverbs-smp-CUDA pre-compiled version of NAMD 2.9.
>
> Thanks,
>
> Gianluca
>
> On Tue, 10 Jul 2012, Aron Broom wrote:
>
> if it is truly just one node, you can use the multicore-CUDA version and
> avoid the MPI charmrun stuff. Still, it boils down to much the same thing
> I think. If you do what you've done below, you are running one job with
> 12 CPU cores and all GPUs. If you don't specify the +devices, NAMD will
> automatically find the available GPUs, so I think the main benefit of
> specifying them is when you are running more than one job and don't want
> the jobs sharing GPUs.
>
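> As a concrete sketch of that second case (multicore-CUDA build; the config
> file names are just placeholders, and I have not tested this on Forge), two
> jobs on one node could be kept off each other's GPUs like this:
>
>   namd2 +p6 +devices 0,1,2 +idlepoll jobA.namd
>   namd2 +p6 +devices 3,4,5 +idlepoll jobB.namd
>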
> I'm not sure you'll see great scaling across 6 GPUs for a single job,
> but that would be great if you did.
>
> ~Aron
>
> On Tue, Jul 10, 2012 at 1:14 PM, Gianluca Interlandi <gianluca_at_u.washington.edu> wrote:
> Hi,
>
> I have a question concerning running NAMD on a CUDA cluster.
>
> NCSA Forge has for example 6 CUDA devices and 16 CPU cores per node. If I
> want to use all 6 CUDA devices in a node, how many processes is it
> recommended to spawn? Do I need to specify "+devices"?
>
> So, if for example I want to spawn 12 processes, do I need to specify:
>
>   charmrun +p12 -machinefile $PBS_NODEFILE +devices 0,1,2,3,4,5 namd2 +idlepoll
>
> Thanks,
>
> Gianluca
>
> -----------------------------------------------------
> Gianluca Interlandi, PhD gianluca_at_u.washington.edu
> +1 (206) 685 4435
>
> http://artemide.bioeng.washington.edu/
>
> Research Scientist at the Department of Bioengineering
> at the University of Washington, Seattle WA U.S.A.
> -----------------------------------------------------
>
> --
> Aron Broom M.Sc
> PhD Student
> Department of Chemistry
> University of Waterloo
>
> -----------------------------------------------------
> Gianluca Interlandi, PhD gianluca_at_u.washington.edu
> +1 (206) 685 4435
> http://artemide.bioeng.washington.edu/
>
> Research Scientist at the Department of Bioengineering
> at the University of Washington, Seattle WA U.S.A.
> -----------------------------------------------------
>
> --
> Aron Broom M.Sc
> PhD Student
> Department of Chemistry
> University of Waterloo
>
-----------------------------------------------------
Gianluca Interlandi, PhD gianluca_at_u.washington.edu
+1 (206) 685 4435
http://artemide.bioeng.washington.edu/
Research Scientist at the Department of Bioengineering
at the University of Washington, Seattle WA U.S.A.
-----------------------------------------------------