From: Norman Geist (norman.geist_at_uni-greifswald.de)
Date: Wed Nov 23 2011 - 00:09:41 CST
OK, you need to check what hardware you have: is it InfiniBand or
10Gbit/s Ethernet? If it is InfiniBand, you can do what I described. If it
is 10Gbit/s Ethernet, you can use the Myrinet Express protocol over it
(MXoE) to minimize latency, but 10Gbit with normal TCP may already be
enough for you; benchmarks will show. You could log in to the nodes and run
lspci to see what hardware you have, then look for the network adapters.
InfiniBand would mostly show up as something like Mellanox or InfiniHost
III. 10Gbit/s Ethernet would also be recognizable. You could also look at
the connectors if you have physical access to the nodes. 10Gbit/s Ethernet
has the familiar RJ-45 connector, while InfiniBand would have SFF-8470 or
QSFP.
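To make the lspci check concrete, here is a minimal Python sketch that sorts lspci-style lines into InfiniBand and Ethernet adapters. The keyword matching is an assumption for illustration and not an exhaustive catalogue, and the SAMPLE lines are written in the style of lspci output; they are not taken from the nodes discussed in this thread.

```python
import re

# Matches "10-Gigabit", "10GbE", "10GigE" etc. as a standalone token.
# This pattern is an illustrative assumption, not a complete rule.
TEN_GIG = re.compile(r"\b10[\s-]?g(?:b|ig)")


def classify_adapters(lspci_text):
    """Return (kind, line) pairs for lines that look like network adapters."""
    found = []
    for line in lspci_text.splitlines():
        lower = line.lower()
        if "infiniband" in lower or "infinihost" in lower:
            found.append(("infiniband", line.strip()))
        elif "ethernet controller" in lower:
            kind = "10g-ethernet" if TEN_GIG.search(lower) else "ethernet"
            found.append((kind, line.strip()))
    return found


# Illustrative input only; real output differs from machine to machine.
SAMPLE = """\
03:00.0 InfiniBand: Mellanox Technologies MT25204 [InfiniHost III Lx HCA]
05:00.0 Ethernet controller: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection
01:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection
00:1f.2 SATA controller: Intel Corporation C600 SATA AHCI Controller
"""

if __name__ == "__main__":
    for kind, line in classify_adapters(SAMPLE):
        print(kind, "|", line)
```

On a real node the text would come from running lspci, for example via subprocess.run(["lspci"], capture_output=True, text=True).stdout. An adapter that can run in either InfiniBand or Ethernet mode is classified by whatever class lspci reports for it, so the result is a hint, not proof of how the link is configured.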
From: David Hemmendinger [mailto:hemmendd_at_union.edu]
Sent: Wednesday, 23 November 2011 04:08
Subject: Re: AW: AW: namd-l: namd-ibverbs fails to start
Hello, Norman and Axel,
Thanks for your messages. Any confusion is probably a
result of my lack of knowledge. From what I'd read, I thought that
RoCE could be described as using an Infiniband transport over an Ethernet
link layer. But perhaps that simply means that it uses RDMA, and so
my terminology is not quite correct.
Our OpenMPI installation can use both TCP and openib BTLs.
It sounds from both your messages that I should rebuild NAMD to
use MPI. However, if using IPoIB is also possible, I'd pursue that
also. We don't have an ib0 interface installed, as far as I can tell
-- it isn't reported by ifconfig, nor is there an ib0 in /sys/class/net.
But is this something that we can build, given our OFED installation?
Many thanks to you both!
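The ib0 check described above (ifconfig, /sys/class/net) can be scripted. The following Python sketch lists interfaces named like ib0, ib1 and so on, and reads their sysfs attributes where present. It assumes the conventional ib<N> naming and the usual sysfs layout; the function names are hypothetical helpers, and it only reports what exists, it does not create an interface.

```python
import os


def ipoib_interfaces(sys_net="/sys/class/net"):
    """Return sorted interface names of the form ib<N> found under sys_net."""
    try:
        names = os.listdir(sys_net)
    except OSError:
        return []  # directory missing or unreadable
    return sorted(n for n in names if n.startswith("ib") and n[2:].isdigit())


def read_attr(iface, attr, sys_net="/sys/class/net"):
    """Read one sysfs attribute such as 'mtu' or 'mode'; None if absent."""
    try:
        with open(os.path.join(sys_net, iface, attr)) as fh:
            return fh.read().strip()
    except OSError:
        return None


if __name__ == "__main__":
    found = ipoib_interfaces()
    if not found:
        print("no ib<N> interface found")
    for iface in found:
        # 'mode' is expected to read 'datagram' or 'connected' on IPoIB devices.
        print(iface, read_attr(iface, "mode"), read_attr(iface, "mtu"))
```

An empty result matches the situation described in this message: no ib0 reported by ifconfig and none in /sys/class/net.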
>Well, now I'm a little confused. RoCE (RDMA over Converged Ethernet) has
>nothing to do with InfiniBand, as it is Ethernet (RJ-45), except that both
>can use Remote Direct Memory Access, which is a kernel-space bypass to
>increase performance. InfiniBand and Ethernet are both only specifications
>for combinations of cables, connectors and transfer codes (put simply).
>What type of traffic you want to send over this physical connection is
>something completely different, as this is the logical part. There are some
>possibilities, such as RoCE or MXoE, to use high-performance logical
>protocols over generic hardware. But we want to talk about InfiniBand.
>InfiniBand actually has no logical transfer protocol of its own, only some
>native verbs to speak with other nodes. You can use InfiniBand with those
>verbs (ibverbs) or use another logical protocol like IP, as most
>applications do. OpenMPI supports both, so it's hard to determine which
>type of logical connection it uses. You need an OpenMPI installation built
>with ibverbs support; run ompi_info and grep for openib or something
>similar. If your MPI supports ibverbs, you mostly need to recompile your
>application so that the MPI calls are built in as ibverbs translations. Or
>you just use IPoIB, which is the easiest.
>The OFED package provides an IPoIB driver; most applications need this
>driver to resolve hostnames etc. Check whether you have an interface like
>ib0 when running ifconfig. If you have, you can just run a normal NAMD
>(net or udp) over these interfaces and you will use the InfiniBand as
>IPoIB without ibverbs.
>Another thing with ibverbs and RDMA is the permissions on the InfiniBand
>device. Try whether you can run the tests as root and as your user. If your
>user can't access the device, you can't run parallel jobs, and you don't
>want to run them as root. Please check this (/dev and udev rules) to make
>sure your permissions allow use of the InfiniBand device.
>In my opinion, as there's mostly no gain from using ibverbs for a small
>number of nodes, and even for CUDA runs, use the IPoIB driver with the
>advice on connected mode and MTU and you will be fine.
>Best regards
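The permission check suggested in the quoted message can be sketched in Python as follows. It assumes the verbs device nodes live under /dev/infiniband, which is typical but should be confirmed on your own nodes; the function name is a hypothetical helper. Run it as the user who will launch the parallel jobs, not as root, since root usually passes every access check.

```python
import os


def inaccessible_devices(dev_dir="/dev/infiniband"):
    """Return device nodes under dev_dir that the current user cannot both
    read and write, or None if dev_dir itself cannot be listed."""
    try:
        names = sorted(os.listdir(dev_dir))
    except OSError:
        return None  # directory missing or unreadable
    blocked = []
    for name in names:
        path = os.path.join(dev_dir, name)
        if not os.access(path, os.R_OK | os.W_OK):
            blocked.append(path)
    return blocked


if __name__ == "__main__":
    result = inaccessible_devices()
    if result is None:
        print("cannot list /dev/infiniband")
    elif result:
        print("no read/write access to:", ", ".join(result))
    else:
        print("all device nodes are accessible to this user")
```

If nodes are reported as inaccessible, the udev rules that set their ownership and mode are the place to look, as the message above says.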