[cluster-l] Single- vs. Dual- vs. Quad-core CPUs

Nils Oberg noberg at uiuc.edu
Wed Mar 28 13:57:02 CDT 2007


Thanks for your help Jim.

I performed some benchmarks on demo equipment from AMD and Intel and 
there are some interesting differences between the two platforms for 
our code.  All times are in seconds.

Here are some results:

2x2 Opteron 2218 2.6 GHz with 4 GB RAM:
1 core:  697
2 cores: 323
4 cores: 211

4x2 Opteron 875 2.2 GHz with 8 GB RAM:
1 core: 531
2 cores: 333
4 cores: 181
6 cores: 143
8 cores: 139

1x4 Xeon 5355 2.6 GHz with 4 GB RAM:
1 core:  510
2 cores: 343
4 cores: 251

2x4 Xeon 5355 2.6 GHz with 8 GB RAM:
1 core:  516
2 cores: 314
4 cores: 228
6 cores: 195
8 cores: 167


I don't understand why the Xeon performs better than the Opteron on 
one core, but worse than the Opteron on 4 cores.  I tried a different 
CFD code and the same pattern emerged.  Why might this be happening?
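
One way to quantify the pattern is to turn the timings above into speedup and parallel efficiency relative to each platform's own one-core run. The following is only a minimal Python sketch: the dictionary layout and the `scaling` helper are made up for illustration, and the numbers are copied from the results above.

```python
# Wall-clock times in seconds, copied from the results above.
# Outer keys are platform labels, inner keys are core counts.
timings = {
    "2x2 Opteron 2218": {1: 697, 2: 323, 4: 211},
    "4x2 Opteron 875": {1: 531, 2: 333, 4: 181, 6: 143, 8: 139},
    "1x4 Xeon 5355": {1: 510, 2: 343, 4: 251},
    "2x4 Xeon 5355": {1: 516, 2: 314, 4: 228, 6: 195, 8: 167},
}


def scaling(times):
    """Return {cores: (speedup, efficiency)} relative to the 1-core run."""
    base = times[1]
    return {n: (base / t, base / t / n) for n, t in sorted(times.items())}


for platform, times in timings.items():
    print(platform)
    for n, (speedup, efficiency) in scaling(times).items():
        print(f"  {n} cores: speedup {speedup:.2f}, efficiency {efficiency:.0%}")
```

On these numbers the 4-core speedup is roughly 3.3 on the Opteron 2218 but only about 2.0 to 2.3 on the Xeons, so the Xeon's one-core lead is lost to weaker scaling rather than to slower cores.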


I am also getting better-than-linear speedup for one of our 
programs.  Is this possible?  Here are the results, with speedup 
measured relative to the one-core run:

2x4 Xeon 5355 2.6 GHz with 8 GB RAM:
cores: 1  7590
cores: 2  4523   speedup: 1.68
cores: 4  2060   speedup: 3.68
cores: 8   916   speedup: 8.29

2x2 Opteron 2218 2.6 GHz with 4 GB RAM:
cores: 1  8497
cores: 2  4360   speedup: 1.95
cores: 4  1883   speedup: 4.51


Does this make sense?
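
To show where the better-than-linear behavior comes from, here is a small Python sketch that recomputes the speedups from the times above, flags the core counts where speedup exceeds the core count, and also reports the gain from each individual step up in core count. The `superlinear` and `step_gains` helpers are hypothetical names used only for this illustration.

```python
# Run times copied from the results above, keyed by core count.
runs = {
    "2x4 Xeon 5355": {1: 7590, 2: 4523, 4: 2060, 8: 916},
    "2x2 Opteron 2218": {1: 8497, 2: 4360, 4: 1883},
}


def superlinear(times):
    """Return (cores, speedup) pairs where speedup exceeds the core count."""
    base = times[1]
    return [(n, round(base / t, 2))
            for n, t in sorted(times.items()) if base / t > n]


def step_gains(times):
    """Return (from_cores, to_cores, gain) for each consecutive pair of runs."""
    counts = sorted(times)
    return [(a, b, round(times[a] / times[b], 2))
            for a, b in zip(counts, counts[1:])]


for platform, times in runs.items():
    print(platform)
    print("  superlinear at:", superlinear(times))
    for a, b, gain in step_gains(times):
        print(f"  {a} -> {b} cores: {gain}x faster")
```

Looking at the steps separately, going from 1 to 2 cores gives less than 2x on both machines, while each later doubling gives more than 2x, which is what pushes the overall speedup past linear at 8 cores on the Xeon and at 4 cores on the Opteron.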

Thanks for any help.

Nils



At 15:11 2/22/2007, Jim Phillips wrote:

>You really need to run some benchmarks.  Failing that, look at the 
>SPEC FP Rate results at 
>http://www.spec.org/cpu2006/results/rfp2006.html  There are three 
>different CFD codes in the benchmark suite.
>
>1x4 2.7 GHz Xeon  leslie3d = 15.0   total = 33.6
>2x4 2.7 GHz Xeon  leslie3d = 21.9   total = 54.1
>2x2 3.0 GHz Xeon  leslie3d = 25.8   total = 43.0
>2x2 2.6 GHz Optn  leslie3d = 28.3   total = 38.1
>2x2 2.8 GHz Optn  leslie3d = 36.3   total = 48.3  (PathScale compilers)
>
>So, the dual-socket, dual-core Opteron *may* be your best bet, if 
>your workload is similar to leslie3d.  Run some benchmarks.
>
>-Jim
>
>
>
>On Thu, 22 Feb 2007, Nils Oberg wrote:
>
>>Hi Jim,
>>
>>Thanks for your response.  I should probably describe the 
>>problem.  Our application is a computational fluid dynamics (CFD) 
>>code.  My understanding of CFD codes is that they are primarily 
>>memory bound.  Since the domain to be modeled is broken up into 
>>chunks, during the course of a time-step in the simulation a large 
>>number of messages (not necessarily large amounts of data) are passed 
>>between processors.
>>
>>We're trying to decide between the following:
>>
>>uni-processor quad-core Xeon 4 GB RAM ($2,300 / node)
>>dual-processor quad-core Xeon 16 GB RAM ($5,800 / node)
>>dual-processor quad-core Xeon 8 GB RAM ($4,600 / node)
>>dual-processor dual-core Xeon 8 GB RAM ($3,800 / node)
>>dual-processor dual-core Opteron 8 GB RAM ($3,200 / node)
>
>--
>Nils Oberg, Research Programmer
>Civil & Environmental Engineering, University of Illinois at U-C
>phone: 217-333-8365, web: http://vtchl.uiuc.edu


