[cluster-l] Single- vs. Dual- vs. Quad-core CPUs
Nils Oberg
noberg at uiuc.edu
Wed Mar 28 13:57:02 CDT 2007
Thanks for your help Jim.
I performed some benchmarks on demo equipment from AMD and Intel and
there are some interesting differences between the two platforms for
our code. All times are in seconds.
Here are some results:
2x2 Opteron 2218 2.6 GHz with 4 GB RAM:
1 core: 697
2 cores: 323
4 cores: 211
4x2 Opteron 875 2.2 GHz with 8 GB RAM:
1 core: 531
2 cores: 333
4 cores: 181
6 cores: 143
8 cores: 139
1x4 Xeon 5355 2.6 GHz with 4 GB RAM:
1 core: 510
2 cores: 343
4 cores: 251
2x4 Xeon 5355 2.6 GHz with 8 GB RAM:
1 core: 516
2 cores: 314
4 cores: 228
6 cores: 195
8 cores: 167
I don't understand why the Xeon performs better than the Opteron on
one core, but worse than the Opteron on 4 cores. I tried a different
CFD code and the same pattern emerged. Why might this be happening?
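To make the scaling comparison concrete, here is a minimal Python sketch that turns the timings above into speedup (single-core time divided by n-core time) and parallel efficiency (speedup divided by core count). The dictionary layout and the function names are illustrative assumptions; only the timings come from the measurements.

```python
# Wall-clock times in seconds, copied from the results above.
# Keys are core counts; the dictionary layout is an illustrative choice.
timings = {
    "2x2 Opteron 2218": {1: 697, 2: 323, 4: 211},
    "4x2 Opteron 875": {1: 531, 2: 333, 4: 181, 6: 143, 8: 139},
    "1x4 Xeon 5355": {1: 510, 2: 343, 4: 251},
    "2x4 Xeon 5355": {1: 516, 2: 314, 4: 228, 6: 195, 8: 167},
}


def speedup(t1, tn):
    """Speedup relative to the single-core run: T(1) / T(n)."""
    return t1 / tn


def efficiency(t1, tn, cores):
    """Parallel efficiency: speedup divided by the number of cores used."""
    return speedup(t1, tn) / cores


def scaling_table(times):
    """Return (cores, speedup, efficiency) rows for one machine."""
    t1 = times[1]
    return [(n, speedup(t1, tn), efficiency(t1, tn, n))
            for n, tn in sorted(times.items())]


if __name__ == "__main__":
    for machine, times in timings.items():
        print(machine)
        for n, s, e in scaling_table(times):
            print(f"  {n} cores: speedup {s:.2f}, efficiency {e:.2f}")
```

On these numbers the 2x2 Opteron reaches a speedup of about 3.3 on 4 cores, against about 2.0 for the 1x4 Xeon and about 2.3 for the 2x4 Xeon.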
I was also getting better-than-linear speedup for one of our
programs. Is this possible? Here are some results:
2x4 Xeon 5355 2.6 GHz with 8 GB RAM:
1 core: 7590
2 cores: 4523 (speedup: 1.68)
4 cores: 2060 (speedup: 3.68)
8 cores: 916 (speedup: 8.29)
2x2 Opteron 2218 2.6 GHz with 4 GB RAM:
1 core: 8497
2 cores: 4360 (speedup: 1.95)
4 cores: 1883 (speedup: 4.51)
Does this make sense?
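As a sanity check on the arithmetic, the short Python sketch below recomputes each speedup from the raw times and flags the runs where the speedup exceeds the core count, which is what "better than linear" means here. The names `runs` and `check_speedups` are hypothetical; the times and reported speedups are the ones listed above.

```python
# (cores, seconds, reported speedup) copied from the results above.
# None marks the single-core baseline, which has no reported speedup.
runs = {
    "2x4 Xeon 5355": [(1, 7590, None), (2, 4523, 1.68),
                      (4, 2060, 3.68), (8, 916, 8.29)],
    "2x2 Opteron 2218": [(1, 8497, None), (2, 4360, 1.95),
                         (4, 1883, 4.51)],
}


def check_speedups(rows):
    """Recompute speedup = T(1) / T(n) for every multi-core run.

    Returns (cores, recomputed speedup, matches reported, superlinear) tuples,
    where superlinear means the speedup is larger than the core count.
    """
    t1 = next(t for n, t, _ in rows if n == 1)
    results = []
    for n, t, reported in rows:
        if n == 1:
            continue
        s = t1 / t
        # Reported values are rounded to two decimals, so allow that much slack.
        matches = abs(s - reported) <= 0.005
        results.append((n, s, matches, s > n))
    return results


if __name__ == "__main__":
    for machine, rows in runs.items():
        print(machine)
        for n, s, matches, superlinear in check_speedups(rows):
            note = "superlinear" if superlinear else "sublinear"
            print(f"  {n} cores: speedup {s:.2f} ({note}, "
                  f"matches reported: {matches})")
```

The recomputed values agree with the reported ones, and only two runs come out above linear: 8 cores on the Xeon and 4 cores on the Opteron.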
Thanks for any help.
Nils
At 15:11 2/22/2007, Jim Phillips wrote:
>You really need to run some benchmarks. Failing that, look at the
>SPEC FP Rate results at
>http://www.spec.org/cpu2006/results/rfp2006.html
>There are three different CFD codes in the benchmark suite.
>
>1x4 2.7 GHz Xeon leslie3d = 15.0 total = 33.6
>2x4 2.7 GHz Xeon leslie3d = 21.9 total = 54.1
>2x2 3.0 GHz Xeon leslie3d = 25.8 total = 43.0
>2x2 2.6 GHz Optn leslie3d = 28.3 total = 38.1
>2x2 2.8 GHz Optn leslie3d = 36.3 total = 48.3 (PathScale compilers)
>
>So, the dual-socket, dual-core Opteron *may* be your best bet, if
>your workload is similar to leslie3d. Run some benchmarks.
>
>-Jim
>
>
>
>On Thu, 22 Feb 2007, Nils Oberg wrote:
>
>>Hi Jim,
>>
>>Thanks for your response. I should probably describe the
>>problem. Our application is a computational fluid dynamics (CFD)
>>code. My understanding of CFD codes is that they are primarily
>>memory bound. Since the domain to be modeled is broken up into
>>chunks, during the course of a time-step in the simulation a large
>>number of messages (not necessarily large amounts of data) are passed
>>between processors.
>>
>>We're trying to decide between the following:
>>
>>uni-processor quad-core Xeon 4 GB RAM ($2,300 / node)
>>dual-processor quad-core Xeon 16 GB RAM ($5,800 / node)
>>dual-processor quad-core Xeon 8 GB RAM ($4,600 / node)
>>dual-processor dual-core Xeon 8 GB RAM ($3,800 / node)
>>dual-processor dual-core Opteron 8 GB RAM ($3,200 / node)
>
>--
>Nils Oberg, Research Programmer
>Civil & Environmental Engineering, University of Illinois at U-C
>phone: 217-333-8365, web: http://vtchl.uiuc.edu