[cluster-l] Single- vs. Dual- vs. Quad-core CPUs

Jay A. Kreibich jak at uiuc.edu
Sun Jan 21 16:06:07 CST 2007


On Fri, Jan 19, 2007 at 05:28:30PM -0600, Nils Oberg scratched on the wall:
> Hello,
> 
> Our group is going to purchase a small cluster.  I'm trying to decide 
> if each node in the cluster should have dual- or quad-core 
> CPUs.  Does anyone have any advice for how to benchmark?  Or other 
> resources that might help me get started?

  I've looked at a great number of case studies and the usual answer is
  to buy the most powerful systems you can, even if the
  cost/performance ratio on the purchase price is starting to slip.  The
  reason for this is that purchase cost isn't that big of a deal in
  the bigger picture, and a great number of Total-Cost-of-Ownership costs
  are "per-node", regardless of the size or complexity of each individual
  node.

  Examples of "per-node" costs include: disk, memory, OS licenses
  (if applicable), most software licenses (if applicable),
  warranty/repair costs, network ports (in the interconnect fabric),
  power ports, OS admin time/image time, etc.  Even power and cooling
  costs (which can be huge) are more heavily influenced by the total
  number of nodes than by the total cores.  Adding cores does add power
  and heat, but not as much as adding whole new nodes. 
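
  As a rough sketch of that arithmetic: the function below compares
  cluster layouts with the same total core count.  Every dollar figure
  and the name cluster_cost are made-up assumptions for illustration,
  not vendor quotes or measurements.

```python
# Hypothetical cost model; all prices are illustrative assumptions.
def cluster_cost(total_cores, cores_per_node, node_base_cost,
                 cost_per_core, per_node_overhead):
    """Return (node_count, total_cost) for a cluster of total_cores cores.

    node_base_cost    -- chassis, disk, NIC, etc. (paid once per node)
    cost_per_core     -- incremental purchase cost of each core
    per_node_overhead -- lifetime per-node costs: switch port, licenses,
                         warranty, admin time, and so on
    """
    nodes = -(-total_cores // cores_per_node)  # ceiling division
    purchase = nodes * (node_base_cost + cores_per_node * cost_per_core)
    overhead = nodes * per_node_overhead
    return nodes, purchase + overhead


if __name__ == "__main__":
    for cores_per_node in (2, 4, 8):
        nodes, cost = cluster_cost(64, cores_per_node,
                                   node_base_cost=1500,
                                   cost_per_core=400,
                                   per_node_overhead=2000)
        print(f"{cores_per_node} cores/node: {nodes:3d} nodes, total ${cost:,}")
```

  With these invented numbers the per-node terms dominate, so the
  layouts with fewer, larger nodes come out cheaper.  Real prices may
  tilt the result the other way.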
  
  Chances of a failure also go up quickly as you add nodes.  On
  the other hand, the percentage "hit" of losing a node decreases as
  you add nodes, although the costs and trouble of re-starting or
  re-factoring jobs often makes this second point meaningless.
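
  One way to put numbers on this trade-off, as a minimal sketch: it
  assumes nodes fail independently with the same probability over the
  length of a job, which real hardware only approximates.  The function
  names and the 1% figure are hypothetical.

```python
def p_any_failure(nodes, p_node):
    """Chance that at least one of `nodes` nodes fails during a job,
    assuming independent failures with per-node probability p_node."""
    return 1.0 - (1.0 - p_node) ** nodes


def capacity_lost(nodes):
    """Fraction of the cluster lost when a single node goes down."""
    return 1.0 / nodes


if __name__ == "__main__":
    p = 0.01  # assumed per-node failure chance over one job
    for n in (8, 16, 32, 64):
        print(f"{n:3d} nodes: P(any failure) = {p_any_failure(n, p):.3f}, "
              f"one node = {capacity_lost(n):.1%} of capacity")
```

  If a job spans every node and cannot restart on fewer, the first
  number is effectively the chance the whole job dies.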

  Some of these might not apply-- OS license cost for Linux systems
  using free distributions for example.  Others might not be fully
  obvious.  Memory usage, for example, is a big one.  Memory is still
  the most expensive part of any decent computer, even today.  If you
  have problems where every thread needs access to the full 6GB data
  set (but the threads can share one copy) then more cores can be a huge
  win by improving your core-to-memory cost ratio.  If, on the other
  hand, each thread needs a private 4GB working data set, you're going
  to break the bank with huge memory densities in order to keep every
  core busy on large(r) multi-core systems, so you might as well stay
  small(er).
  Disk can be similar.
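
  The two memory cases can be sketched as below.  The 6GB and 4GB sizes
  are the ones from the examples above; the 2GB allowance for the OS
  and the function name are assumptions added for illustration.

```python
def node_memory_gb(cores, data_gb, shared, os_overhead_gb=2):
    """Estimate the RAM one node needs to keep every core busy.

    shared=True  -- all threads read one copy of the data set
    shared=False -- each thread (one per core) needs a private copy
    os_overhead_gb is an assumed allowance for the OS and buffers.
    """
    data = data_gb if shared else data_gb * cores
    return data + os_overhead_gb


if __name__ == "__main__":
    for cores in (2, 4, 8):
        print(f"{cores} cores: shared 6GB set -> {node_memory_gb(cores, 6, True)}GB, "
              f"private 4GB sets -> {node_memory_gb(cores, 4, False)}GB")
```

  In the shared case the per-node memory stays flat as cores are added;
  in the private case it grows with every core.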
  
  Failure rate isn't *as* big of a deal if the cluster normally runs
  a large number of small jobs (and not one or two jobs that use
  all/half the nodes), but most people grossly underestimate how
  much pain (e.g. time costs) tracking down and fixing bad nodes
  requires or how much your users will scream if the cluster hosts
  long-running jobs (which they usually do).  Good software allows
  checkpointing, but few systems allow re-starts on a different number
  of nodes.

  All this said, if the concept of "buy the most powerful system you can"
  is taken to its utmost extreme, each cluster node would be some
  monstrous three million dollar machine.  There are balances to be
  found.  Most of those balances are based off the idea of buying what
  you need and not buying what you don't.
  
  While eight-core systems might look great on paper, most of the cost
  in traditional SMP multi-CPU Sun servers (or similar systems from
  whatever your favorite vendor is) is in I/O systems, and especially
  memory.  Most clusters are used to run heavy number crunching, and
  that traditionally requires very large memory bandwidths.  My guess
  is that most quad core systems (never mind 8x) can easily stall out
  the memory controller of most modern servers you might be considering
  for clusters.  Eight cores isn't going to do you a lot of good if
  they spend over half their time waiting for memory requests.  You'd
  do better with a larger number of lower-core systems.  Similar things
  can be said about interconnect fabrics-- you save on ports with a
  smaller number of larger core nodes, but if you're blocking on
  message traffic, you might do better to have more nodes with fewer
  cores.  Memory would be my main concern, however.
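
  A back-of-the-envelope version of the memory bandwidth concern, as a
  sketch only: it assumes cores split the node's memory bandwidth
  evenly and that a starved core runs proportionally slower.  The
  bandwidth figures and function names are hypothetical; real numbers
  would have to come from measuring your own code on candidate hardware.

```python
def efficiency(node_bandwidth_gbs, demand_per_core_gbs, cores):
    """Fraction of peak throughput when `cores` cores share the node's
    memory bandwidth evenly (1.0 means no core is starved)."""
    return min(1.0, node_bandwidth_gbs / (demand_per_core_gbs * cores))


def usable_cores(node_bandwidth_gbs, demand_per_core_gbs, cores):
    """How many cores the memory system can feed at full speed."""
    fed = int(node_bandwidth_gbs // demand_per_core_gbs)
    return min(cores, fed)


if __name__ == "__main__":
    node_bw = 8.0   # assumed sustained node bandwidth, GB/s
    per_core = 2.0  # assumed demand of one busy core, GB/s
    for cores in (2, 4, 8):
        print(f"{cores} cores: efficiency {efficiency(node_bw, per_core, cores):.0%}, "
              f"{usable_cores(node_bw, per_core, cores)} cores fully fed")
```

  Under these assumed figures an eight-core node runs at half speed,
  which is the "waiting for memory requests" case described above.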


  The answers to many of these questions depend on what you're going
  to do with the cluster.  Many smaller clusters are purpose-built,
  with one program or suite of applications in mind.  This makes it
  easier to test systems out with the actual production software and
  make very specific and targeted choices.  Most large clusters, on the
  other hand, have something specific in mind, but their funding
  requires they be fairly general purpose.  In that case, it is often
  easier to lean to "more powerful" (in terms of CPU cycles) because it
  looks good on paper and wins benchmarks.  There are also some cases
  when the extra power *can* be utilized, and many cases in which you just
  turn it off (and only run two threads per node regardless of cores).

  On the other hand, even purpose-built clusters morph in usage and
  software over time.  On the gripping hand, a cluster generally has a
  practical life-span of only about three years, making it fairly easy
  to revise and make adjustments in the hardware. 
  
  So the final answer is, as always, "it depends."  My first reaction is
  "buy the biggest" but when you're talking about dual-chip, quad-core
  systems, it seems very unlikely that such systems will have enough
  front-side memory bandwidth, never mind the main memory system, for
  anything interesting.  Chip manufacturers have been cranking out cores
  a lot faster than bridge chips have addressed all the other problems
  (or even the front-side chip pin-outs, for that matter).  There's a
  reason eight CPU Sun systems cost what they do.


   -j

-- 
Jay A. Kreibich < J A Y  @  K R E I B I.C H >

"'People who live in bamboo houses should not throw pandas.' Jesus said that."
   - "The Ninja", www.AskANinja.com, "Special Delivery 10: Pop!Tech 2006"

