Re: Three GPU cards on shared-mem motherboard

From: Francesco Pietra
Date: Tue May 29 2012 - 16:59:32 CDT

To the extent that a reader may be interested here in consumer mainboards:

After much looking around, it turned out that consumer mainboards are
limited to two real x16 2.0 slots. The 990FXA-GD80, with four declared
x16 2.0 slots, is the only exception I was able to find, albeit a
suspicious one.
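
For orientation, the theoretical ceiling of a PCIe 2.0 slot follows from
its electrical lane count: each 2.0 lane signals at 5 GT/s with 8b/10b
encoding, i.e. about 500 MB/s per direction. The sketch below (Python; the
function name is only illustrative) tabulates this for the widths a
physical x16 slot may actually be wired with. These are peak figures, not
measured throughput.

```python
# Theoretical one-way bandwidth of a PCIe 2.0 link by electrical lane count.
# PCIe 2.0: 5 GT/s per lane with 8b/10b encoding -> 4 Gbit/s = 500 MB/s per lane.
PCIE2_MB_PER_LANE = 500

def pcie2_bandwidth_gb(lanes: int) -> float:
    """Peak one-way bandwidth in GB/s (1 GB = 1000 MB) for a PCIe 2.0 link."""
    if lanes not in (1, 2, 4, 8, 16):
        raise ValueError(f"unexpected lane count: {lanes}")
    return lanes * PCIE2_MB_PER_LANE / 1000

# Widths that a slot labelled "x16" may really have.
for lanes in (16, 8, 4, 1):
    print(f"x{lanes:<2} -> {pcie2_bandwidth_gb(lanes):.1f} GB/s per direction")
```

A card in a slot that is mechanically x16 but wired x4 would thus have a
quarter of the peak link bandwidth of a true x16 slot.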

Thus, I am inclined to stick with the GA-890FXA-UD5 I have, which
performs quite well in MD/CUDA with two GTX-580 (the new Gigabyte board
that replaces this one still has two x16 2.0 slots). What I would like
to do is replace the two GTX-580 with faster cards. I can find a good
arrangement for that. It is an uneasy decision, however, unless someone
comes out here with classical molecular dynamics benchmarks for recent
GPUs.

From tests for gaming, most often carried out on OpenCL rather than
CUDA, the Radeon HD 7970 wins over the GTX-580 by a factor of two, and
by even more over the GTX-680, in LuxMark's OpenCL-driven ray-tracing
test. In other game tests the difference is modest.

Even the very expensive GTX-690 is outperformed by the Radeon HD 7970 in
LuxMark's OpenCL-driven ray-tracing test.

What would be needed at this point is a classical MD benchmark for the
Radeon HD 7970. Since CUDA runs only on NVIDIA hardware, that would
have to be with an OpenCL MD code.

In any event, whether the memory bandwidth of my GA-890FXA-UD5 is
enough for two HD 7970, or whether an Intel socket LGA2011 board is
needed with a Core i7-3930K or i7-3960X (6 physical cores) and four
memory channels instead of two for AMD, is another issue that I am
also unable to settle.
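
As a rough way to frame the memory question, one can compare the
theoretical peak of the host memory with the combined peak of the PCIe
links feeding the GPUs. The sketch below assumes a 64-bit DDR3 channel
(8 bytes per transfer) and, purely as an assumption of mine, DDR3-1333 on
the dual-channel AMD board and DDR3-1600 on a quad-channel LGA2011 board;
the actual modules may differ. It says nothing about what an MD code
really moves over the bus, which depends on the input.

```python
# Back-of-envelope comparison: host memory peak vs. combined PCIe 2.0 x16 peak.
# ASSUMPTIONS (not from the thread): DDR3-1333 on the dual-channel board,
# DDR3-1600 on the quad-channel board. Peak numbers only, not measurements.

BYTES_PER_TRANSFER = 8   # one 64-bit DDR3 channel moves 8 bytes per transfer
PCIE2_X16_GB = 8.0       # PCIe 2.0 x16, one direction: 16 lanes * 500 MB/s

def ddr3_peak_gb(mega_transfers: int, channels: int) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1000 MB)."""
    return mega_transfers * BYTES_PER_TRANSFER * channels / 1000

def pcie_share(n_gpus: int, mem_peak_gb: float) -> float:
    """Fraction of the memory peak that n_gpus could claim at full x16 one-way rate."""
    return n_gpus * PCIE2_X16_GB / mem_peak_gb

boards = {
    "dual-channel DDR3-1333 (assumed)": ddr3_peak_gb(1333, 2),
    "quad-channel DDR3-1600 (assumed)": ddr3_peak_gb(1600, 4),
}
for name, peak in boards.items():
    for n in (2, 3):
        print(f"{name}: {peak:.1f} GB/s, {n} GPUs -> {pcie_share(n, peak):.0%} of peak")
```

Under these assumptions three cards at full x16 rate could in principle
ask for more than the dual-channel peak, while two stay below it. That is
only a plausibility check and no substitute for a real benchmark.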

I would be very grateful for comments on these points. Doubling the
speed of the simulation (as happened when I replaced the GTX-470
with the GTX-580) is worth the money.

francesco pietra

On Mon, May 28, 2012 at 6:20 PM, Axel Kohlmeyer wrote:
> On Mon, May 28, 2012 at 12:03 PM, Francesco Pietra wrote:
>>> When referring to NAMD, I wanted to imply (badly, I admit) performance
>>> boost by the third GPU.
> as i was mentioning before. that is near impossible to predict.
>>> The PCI specification is described by the manufacturer as follows
>>> -- PCI Express slots version: 2.0.
>>> -- PCI slots: 1.
>>> -- PCI express x1 slots: 1.
>>> -- PCI express x16 slots: 4.
> that doesn't mean anything. labeling slots as x16
> only means that you can stick an x16 wide card
> into it. each of these slots can be wired with 16,
> 8, 4, 2 or 1 lane. also, some boards claim they
> have all 16-lane slots, but then two slots are
> connected to a little bridge chip. resulting in
> two cards each having to share the bandwidth.
>>> Whether these are real x16 2.x, or not, is beyond my understanding. I
> without that information, you can't judge.
> contact the vendor or find somebody that
> has time to research it.
>>> can only compare with the corresponding description for the mainboard
>>> I am currently using: GA-890FXA-UD5:
>>> 2 x PCI Express x16, running at x16 (PCIEX16_1, PCIEX16_2).
>>> 1 x PCI Express x16 slot, running at x8 (PCIEX8).
>>> 1 x PCI Express x16 slot, running at x4 (PCIEX4).
>>> 2 x PCI Express x1 slots.
>>>  (All PCI Express slots conform to the PCI Express 2.0)
>>> 1 x PCI slot.
>>> With this latter mainboard, adding a second GTX-580 gave the expected
>>> acceleration. Data for PCIs of the two mainboards being comparable, I
>>> would expect that a third GTX-580 on the 990.. motherboard should do
>>> its job well. Is this naive extrapolation a sound one?
> no. you usually overload the memory bandwidth of the CPU
> with the third GPU and thus you won't get the full speedup.
> how much speedup you'll get depends on the individual
> characteristics of your input.
> axel.
>>> Thanks indeed for further advice
>>> francesco pietra
> --
> Dr. Axel Kohlmeyer
> College of Science and Technology
> Temple University, Philadelphia PA, USA.
