Re: Abe versus Lincoln

From: Axel Kohlmeyer (akohlmey_at_gmail.com)
Date: Thu Apr 07 2011 - 16:18:09 CDT

On Thu, 2011-04-07 at 13:40 -0700, Gianluca Interlandi wrote:
> Hi Axel,
>
> Thanks, that makes it clearer. These are two typical system sizes I
> normally work with on Abe:
>
> Size [atoms]   Time [hours/ns]   # CPUs
>
> 127,000        9.7               32
>  60,000        4.7               32
>
> Do you think I would get a similar or better performance on Lincoln? How
> fast would these systems run on 2 or 4 Lincoln nodes?

i don't know. hours/ns is a bad measure to begin with, since it
depends on the length of your time step; seconds per step is
easier to compare across machines.
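the two measures convert into each other once the time step is fixed; a minimal sketch of the conversion, assuming a hypothetical 2 fs time step (an example value, not taken from any of the runs below):

```python
def hours_per_ns(s_per_step, timestep_fs):
    """Convert a seconds/step timing to hours/ns for a given time step."""
    steps_per_ns = 1.0e6 / timestep_fs  # 1 ns = 1e6 fs
    return s_per_step * steps_per_ns / 3600.0

# e.g. 0.073 s/step at an assumed 2 fs time step -> roughly 10 hours/ns
print(hours_per_ns(0.073, 2.0))
```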

i've been doing a benchmark test with an 84000-atom
bulk water system on our local GPU machine (which has
4x C2050 and two quad core intel processors) and get
the following timings:

1 node ( 4 GPUs) 0.073 s/step
2 nodes ( 8 GPUs) 0.039 s/step
3 nodes (12 GPUs) 0.028 s/step
4 nodes (16 GPUs) 0.022 s/step
6 nodes (24 GPUs) 0.016 s/step
8 nodes (32 GPUs) 0.013 s/step

in comparison, on nodes with 2x 6-core intel westmere (similar to lonestar):

1 node ( 12 cores) 0.149 s/step
2 nodes ( 24 cores) 0.073 s/step
3 nodes ( 36 cores) 0.049 s/step
4 nodes ( 48 cores) 0.037 s/step
6 nodes ( 72 cores) 0.025 s/step
8 nodes ( 96 cores) 0.019 s/step
10 nodes (120 cores) 0.016 s/step
12 nodes (144 cores) 0.014 s/step
16 nodes (192 cores) 0.0114 s/step
20 nodes (240 cores) 0.0098 s/step
24 nodes (288 cores) 0.0076 s/step
32 nodes (384 cores) 0.00635 s/step
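the scaling in both tables can be summarized as parallel efficiency, i.e. speedup relative to one node divided by the node count; a quick sketch using a few of the timings above:

```python
# seconds/step at selected node counts, copied from the tables above
gpu_times = {1: 0.073, 2: 0.039, 4: 0.022, 8: 0.013}
cpu_times = {1: 0.149, 2: 0.073, 4: 0.037, 8: 0.019, 32: 0.00635}

def parallel_efficiency(timings):
    """Efficiency relative to the 1-node timing: t(1) / (n * t(n))."""
    t1 = timings[1]
    return {n: round(t1 / (n * t), 2) for n, t in sorted(timings.items())}

print(parallel_efficiency(gpu_times))
print(parallel_efficiency(cpu_times))
```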

that won't give you the exact abe vs. lincoln comparison,
but rather the performance of the next generation of hardware,
with the caveat that the NAMD CUDA binaries were taken from the
NAMD home page and seem to have been compiled with cuda 2.x,
and thus without fermi-specific GPU optimizations.

in short, at the largest node counts i can run about 2x faster
on the CPU cluster, but with 1-4 nodes the GPU machine is about
2-3x faster per node.
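that trade-off (cheaper per node-hour on GPUs, faster wall clock on many CPU nodes) can be made concrete as node-hours per simulated ns; a sketch from the timings above, again assuming a hypothetical 2 fs time step:

```python
STEPS_PER_NS = 1.0e6 / 2.0  # assuming a 2 fs time step (example value)

def node_hours_per_ns(n_nodes, s_per_step):
    """Cost of one simulated ns in node-hours."""
    return n_nodes * s_per_step * STEPS_PER_NS / 3600.0

# from the tables above: a single GPU node vs the fastest CPU run
print(node_hours_per_ns(1, 0.073))     # 1 GPU node
print(node_hours_per_ns(32, 0.00635))  # 32 CPU nodes
```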

axel.

>
> Gianluca
>
> On Thu, 7 Apr 2011, Axel Kohlmeyer wrote:
>
> > On Thu, Apr 7, 2011 at 2:57 PM, Gianluca Interlandi
> > <gianluca_at_u.washington.edu> wrote:
> >> Hi Axel,
> >>
> >> Yes, I am aware of this retirement. This is part of the reason why I am
> >> posting this. I need to evaluate whether it is worth moving my remaining SUs
> >> to Lincoln or another CPU only cluster like ranger. Also, for the future, is
> >> it better to apply for computational time on a heterogenous (GPU+CPU)
> >> cluster like the new one which will be available at NCSA starting in July
> >> (Forge) or on a CPU-only cluster like ranger or kraken?
> >
> > that depends. GPU support will get better over time. but it is not predictable
> > how much and how soon. in part, it also depends on the hardware (GPU and host).
> > the biggest weakness of lincoln is the host nodes; TACC's longhorn has
> > essentially the same GPUs but they give better performance.
> >
> > in general, the current status is the following. if you want to get the most
> > out of your SUs (when using NAMD), then you want to run on GPUs, but
> > only a very small number of nodes and share the GPU between two CPU tasks.
> > so you get to run 3-4x faster _per SU_, but it will be overall slower.
> >
> > if you want to run faster, you are better off on kraken, lonestar, and the like.
> > which one exactly would be best depends on the typical size of the job
> > and how far altogether you want to scale.
> >
> > does that make sense?
> >
> > axel.
> >
> >> Thanks,
> >>
> >> Gianluca
> >>
> >> On Thu, 7 Apr 2011, Axel Kohlmeyer wrote:
> >>
> >>> On Thu, Apr 7, 2011 at 2:40 PM, johan strumpfer <johanstr_at_ks.uiuc.edu>
> >>> wrote:
> >>>>
> >>>> Hi Gianluca,
> >>>>
> >>>> I've been running on both. Some benchmarks (system size ~120000 atoms):
> >>>> Using lincoln 32 CPU's (i.e. 4 nodes) I get ~0.034 s/step and using
> >>>> 160 CPU's on Abe I get ~0.017 s/step.
> >>>>
> >>>> Note that using CUDA NAMD, the scaling is good up to a point and then
> >>>> drops dramatically. E.g. running on 5 nodes on Lincoln also gets me
> >>>> approximately 0.034 s/step. So you'll be able to run faster on Abe, but
> >>>> it will use up many more SU's.
> >>>
> >>> FYI:
> >>>
> >>>
> >>> http://www.ncsa.illinois.edu/UserInfo/Resources/Hardware/Intel64Cluster/Retirement.html
> >>>
> >>> axel.
> >>>
> >>>
> >>>>
> >>>> Hope this helps,
> >>>> Johan
> >>>>
> >>>> On Thu, Apr 7, 2011 at 2:22 PM, Gianluca Interlandi
> >>>> <gianluca_at_u.washington.edu> wrote:
> >>>>>
> >>>>> Hi!
> >>>>>
> >>>>> Are there any benchmarks available which compare the performance of NAMD
> >>>>> on
> >>>>> Lincoln (CUDA heterogenous cluster) with its performance on a
> >>>>> traditional
> >>>>> CPU only cluster like Abe?
> >>>>>
> >>>>> Thanks!
> >>>>>
> >>>>> Gianluca
> >>>>>
> >>>>> -----------------------------------------------------
> >>>>> Gianluca Interlandi, PhD gianluca_at_u.washington.edu
> >>>>> +1 (206) 685 4435
> >>>>> http://artemide.bioeng.washington.edu/
> >>>>>
> >>>>> Postdoc at the Department of Bioengineering
> >>>>> at the University of Washington, Seattle WA U.S.A.
> >>>>> -----------------------------------------------------
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>
> >>>
> >>>
> >>> --
> >>> Dr. Axel Kohlmeyer
> >>> akohlmey_at_gmail.com http://goo.gl/1wk0
> >>>
> >>> Institute for Computational Molecular Science
> >>> Temple University, Philadelphia PA, USA.
> >>>
> >>
> >> -----------------------------------------------------
> >> Gianluca Interlandi, PhD gianluca_at_u.washington.edu
> >> +1 (206) 685 4435
> >> http://artemide.bioeng.washington.edu/
> >>
> >> Postdoc at the Department of Bioengineering
> >> at the University of Washington, Seattle WA U.S.A.
> >> -----------------------------------------------------
> >
> >
> >
> > --
> > Dr. Axel Kohlmeyer
> > akohlmey_at_gmail.com http://goo.gl/1wk0
> >
> > Institute for Computational Molecular Science
> > Temple University, Philadelphia PA, USA.
> >
>
> -----------------------------------------------------
> Gianluca Interlandi, PhD gianluca_at_u.washington.edu
> +1 (206) 685 4435
> http://artemide.bioeng.washington.edu/
>
> Postdoc at the Department of Bioengineering
> at the University of Washington, Seattle WA U.S.A.
> -----------------------------------------------------

-- 
Dr. Axel Kohlmeyer
akohlmey_at_gmail.com http://goo.gl/1wk0
Institute for Computational Molecular Science
Temple University, Philadelphia PA, USA.

This archive was generated by hypermail 2.1.6 : Mon Dec 31 2012 - 23:20:06 CST