[cluster-l] Single- vs. Dual- vs. Quad-core CPUs

Jim Phillips jim at ks.uiuc.edu
Thu Feb 22 14:11:14 CST 2007


You really need to run some benchmarks.  Failing that, look at the SPEC FP 
Rate results, which include three different CFD codes:

  http://www.spec.org/cpu2006/results/rfp2006.html

Sockets x cores   CPU               leslie3d   total
1x4               2.7 GHz Xeon      15.0       33.6
2x4               2.7 GHz Xeon      21.9       54.1
2x2               3.0 GHz Xeon      25.8       43.0
2x2               2.6 GHz Opteron   28.3       38.1
2x2               2.8 GHz Opteron   36.3       48.3   (PathScale compilers)

So, the dual-socket, dual-core Opteron *may* be your best bet, if your 
workload is similar to leslie3d.  Run some benchmarks.
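
For a rough price/performance comparison, the SPEC figures above can be 
combined with the node prices quoted in the message below.  This is only a 
minimal Python sketch.  It assumes that each quoted node performs like the 
SPEC system with the same socket/core layout (clock speed, memory 
configuration and compilers may differ), uses the 8 GB price for the 2x4 
Xeon, and applies the single Opteron price to both Opteron results.  The 
names `systems` and `per_kusd` are illustrative only.

```python
# (label, cores per node, leslie3d rate, total rate, node price in USD)
# Rates are copied from the SPEC table above; prices come from the quoted
# list of candidate nodes.
# ASSUMPTION: each quoted node matches the benchmarked system with the same
# socket/core layout. The 2x4 Xeon uses the 8 GB price ($4,600) and both
# Opteron results use the one Opteron price ($3,200).
systems = [
    ("1x4 2.7 GHz Xeon",    4, 15.0, 33.6, 2300),
    ("2x4 2.7 GHz Xeon",    8, 21.9, 54.1, 4600),
    ("2x2 3.0 GHz Xeon",    4, 25.8, 43.0, 3800),
    ("2x2 2.6 GHz Opteron", 4, 28.3, 38.1, 3200),
    ("2x2 2.8 GHz Opteron", 4, 36.3, 48.3, 3200),
]


def per_kusd(rate, price):
    """Benchmark rate delivered per 1000 USD of node price."""
    return rate / (price / 1000.0)


rows = []
for label, cores, leslie, total, price in systems:
    rows.append({
        "label": label,
        "leslie_per_core": leslie / cores,
        "leslie_per_kusd": per_kusd(leslie, price),
        "total_per_kusd": per_kusd(total, price),
    })

# Rank by leslie3d throughput per dollar, best first.
rows.sort(key=lambda r: r["leslie_per_kusd"], reverse=True)

print(f"{'system':<20} {'l3d/core':>9} {'l3d/k$':>8} {'total/k$':>9}")
for r in rows:
    print(f"{r['label']:<20} {r['leslie_per_core']:>9.2f} "
          f"{r['leslie_per_kusd']:>8.2f} {r['total_per_kusd']:>9.2f}")
```

Under these assumptions the 2.8 GHz Opteron leads on leslie3d per dollar, 
with the single-socket quad-core Xeon close behind on the overall rate.  
Going from 1x4 to 2x4 Xeon doubles the cores but raises leslie3d throughput 
by only about 1.46x (15.0 to 21.9), which is consistent with a code limited 
by memory bandwidth.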

-Jim



On Thu, 22 Feb 2007, Nils Oberg wrote:

> Hi Jim,
>
> Thanks for your response.  I should probably describe the problem.  Our 
> application is a computational fluid dynamics (CFD) code.  My understanding of 
> CFD codes is that they are primarily memory bound.  Since the domain to be 
> modeled is broken up into chunks, during the course of a time-step in the 
> simulation a large number of messages (not necessarily large amounts of data) 
> are passed between processors.
>
> We're trying to decide between the following:
>
> uni-processor quad-core Xeon 4 GB RAM ($2,300 / node)
> dual-processor quad-core Xeon 16 GB RAM ($5,800 / node)
> dual-processor quad-core Xeon 8 GB RAM ($4,600 / node)
> dual-processor dual-core Xeon 8 GB RAM ($3,800 / node)
> dual-processor dual-core Opteron 8 GB RAM ($3,200 / node)
>
>
> At 16:12 1/22/2007, Jim Phillips wrote:
>> Are you limited by memory bandwidth or clock speed?  All of those cores 
>> share the same memory bandwidth, but the clock speed is almost the same.
>
> I really don't know which is a limiting factor.  I'm guessing it is memory 
> latency (is that clock speed?) more than anything.
>
>> Is memory an issue?  How much memory do you need per node and per core? If 
>> you can use shared-memory within a node then you can add cores without 
>> adding extra memory.  Otherwise you may need to use larger memory chips to 
>> fit enough memory into the node.  Quad-core is more flexible if you only 
>> need more memory on occasion since you can drop down to one core per node.
>
> The programs currently don't use shared memory.  I think it would be fairly 
> difficult to recode the software for shared memory, as it is using external 
> libraries (PETSc, ParMETIS) that rely on MPI.
>
> The newer Intel CPUs (Xeon 5000 series) require fully buffered RAM.  I've 
> read that FB RAM has both higher latency and lower bandwidth.  Does this mean 
> that, respectively, requests from the CPU take longer, and the amount of data 
> transferred is smaller?
>
> A second question: The memory clock speed should be 50% of the CPU front-side 
> bus speed, correct?  In other words, I shouldn't get 533 MHz memory with a 
> 1333 MHz FSB?
>
> Thanks for your help!
>
> Nils
>
>
>> On Fri, 19 Jan 2007, Nils Oberg wrote:
>> 
>>> Hello,
>>> 
>>> Our group is going to purchase a small cluster.  I'm trying to decide
>>> if each node in the cluster should have dual- or quad-core
>>> CPUs.  Does anyone have any advice for how to benchmark?  Or other
>>> resources that might help me get started?
>>> 
>>> As an FYI, I noticed that NCSA is building a new cluster
>>> (http://www.ncsa.uiuc.edu/UserInfo/Resources/Hardware/Intel64Cluster/)
>>> that has dual-socket quad-core compute nodes for a total of 8 cores.
>>> 
>>> Nils
>>> 
>>> 
>>> 
>>> --
>>> Nils Oberg, Research Programmer
>>> Civil & Environmental Engineering, University of Illinois at U-C
>>> phone: 217-333-8365, web: http://vtchl.uiuc.edu
>>> 
>>> _______________________________________________
>>> cluster-l mailing list
>>> cluster-l at ks.uiuc.edu
>>> http://www.ks.uiuc.edu/mailman/listinfo/cluster-l
>