[cluster-l] Single- vs. Dual- vs. Quad-core CPUs
Nils Oberg
noberg at uiuc.edu
Thu Mar 29 15:12:53 CDT 2007
At 14:30 3/28/2007, Jim Phillips wrote:
>Everything you're seeing makes sense for a memory-limited code.
Will a memory-limited code benefit significantly from a
higher-speed interconnect? Do vendors lend demo equipment to test
these sorts of things?
>Each Opteron core has its own cache, and each chip has its own
>memory interface. Each pair of Xeon cores shares a single cache,
>and all chips share a single memory interface. Thus the Opteron
>system scales more linearly to higher numbers of cores. Of course,
>when you use fewer cores it also slows down linearly, as IBM pointed
>out a few years ago. It's a trade-off, and all that really matters
>is the maximum performance you can get when using all of the cores on a node.
Is there any way to determine how the processes of a job are placed
on nodes? I'm thinking of the case where a user runs a 10-process
mpich job on a 5-node cluster with 8 cores in each node. Will mpich
put 2 processes on each node, or will it bunch them all on the first
two nodes in its machines file?
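To make the two possibilities in the question concrete, here is a minimal sketch of both placement policies for 10 processes on 5 nodes with 8 cores each. It only illustrates the alternatives; it does not claim to reproduce what mpich actually does, and the function names and the node names `node1` to `node5` are hypothetical.

```python
def block_placement(num_procs, nodes, cores_per_node):
    """Fill each node to capacity before moving on to the next one."""
    if num_procs > len(nodes) * cores_per_node:
        raise ValueError("more processes than available cores")
    counts = {node: 0 for node in nodes}
    remaining = num_procs
    for node in nodes:
        take = min(cores_per_node, remaining)
        counts[node] = take
        remaining -= take
    return counts


def round_robin_placement(num_procs, nodes, cores_per_node):
    """Hand out one process per node in turn, skipping full nodes."""
    if num_procs > len(nodes) * cores_per_node:
        raise ValueError("more processes than available cores")
    counts = {node: 0 for node in nodes}
    placed = 0
    i = 0
    while placed < num_procs:
        node = nodes[i % len(nodes)]
        if counts[node] < cores_per_node:
            counts[node] += 1
            placed += 1
        i += 1
    return counts


if __name__ == "__main__":
    # Hypothetical node names; 5 nodes, 8 cores each, 10 processes.
    nodes = ["node1", "node2", "node3", "node4", "node5"]
    print("block:      ", block_placement(10, nodes, 8))
    print("round robin:", round_robin_placement(10, nodes, 8))
```

Under the block policy the job lands on the first two nodes (8 and 2 processes); under round robin every node gets 2. For a memory-limited code the difference matters, because it determines how many processes share each node's memory interface.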
Thanks,
Nils
>On Wed, 28 Mar 2007, Nils Oberg wrote:
>
>>Thanks for your help Jim.
>>
>>I performed some benchmarks on demo equipment from AMD and Intel
>>and there are some interesting differences between the two
>>platforms for our code. All times are in seconds.
>>
>>Here are some results:
>>
>>2x2 Opteron 2218 2.6 GHz with 4 GB RAM:
>>1 core: 697
>>2 cores: 323
>>4 cores: 211
>>
>>4x2 Opteron 875 2.2 GHz with 8 GB RAM:
>>1 core: 531
>>2 cores: 333
>>4 cores: 181
>>6 cores: 143
>>8 cores: 139
>>
>>1x4 Xeon 5355 2.6 GHz with 4 GB RAM:
>>1 core: 510
>>2 cores: 343
>>4 cores: 251
>>
>>2x4 Xeon 5355 2.6 GHz with 8 GB RAM:
>>1 core: 516
>>2 cores: 314
>>4 cores: 228
>>6 cores: 195
>>8 cores: 167
>>
>>
>>I don't understand why the Xeon performs better than the Opteron on
>>one core, but worse than the Opteron on 4 cores. I tried a
>>different CFD code and the same pattern emerged. Why might this be happening?
>>
>>
>>I was getting better than linear speedup results for one of our
>>programs. Is this possible? Here are some results:
>>
>>2x4 Xeon 5355 2.6 GHz with 8 GB RAM:
>>cores: 1  time: 7590
>>cores: 2  time: 4523  speedup: 1.68
>>cores: 4  time: 2060  speedup: 3.68
>>cores: 8  time:  916  speedup: 8.29
>>
>>2x2 Opteron 2218 2.6 GHz with 4 GB RAM:
>>cores: 1  time: 8497
>>cores: 2  time: 4360  speedup: 1.95
>>cores: 4  time: 1883  speedup: 4.51
>>
>>
>>Does this make sense?
>>
>>Thanks for any help.
>>
>>Nils
>
>--
>Nils Oberg, Research Programmer
>Civil & Environmental Engineering, University of Illinois at U-C
>phone: 217-333-8365, web: http://vtchl.uiuc.edu
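
The speedup figures in the quoted benchmark results can be reproduced from the raw times, and dividing each speedup by the core count gives a parallel efficiency that makes the better-than-linear cases easy to spot. This is a minimal sketch: the function name `speedup_table` and the dictionary layout are assumptions for illustration, and the times are the ones reported in the message (in seconds).

```python
def speedup_table(times):
    """Return (cores, speedup, efficiency) rows from a {cores: seconds} dict.

    speedup    = single-core time / n-core time
    efficiency = speedup / n  (above 1.0 means better than linear)
    """
    base = times[1]
    return [(n, base / t, base / (t * n)) for n, t in sorted(times.items())]


# Times in seconds, as reported in the message.
xeon_5355 = {1: 7590, 2: 4523, 4: 2060, 8: 916}
opteron_2218 = {1: 8497, 2: 4360, 4: 1883}

if __name__ == "__main__":
    for name, times in [("Xeon 5355", xeon_5355), ("Opteron 2218", opteron_2218)]:
        print(name)
        for cores, speedup, efficiency in speedup_table(times):
            print(f"  cores: {cores}  speedup: {speedup:.2f}  efficiency: {efficiency:.2f}")
```

An efficiency above 1.0 appears only at 8 cores on the Xeon and at 4 cores on the Opteron, which are the two better-than-linear results the question is about.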