Re: Re: namd-ibverbs fails to start

From: Norman Geist
Date: Tue Nov 22 2011 - 00:47:00 CST

Hi, (you could also say hello; we want to help)

Well, now I'm a little confused. RoCE (RDMA over Converged Ethernet) has
nothing to do with InfiniBand, as it runs over Ethernet (RJ45), except that
both can use Remote Direct Memory Access, which bypasses the kernel to
increase performance. Put simply, InfiniBand and Ethernet are both just
specifications for cables, connectors and signal encodings. What type of
traffic you want to send over this physical connection is something else
entirely, as that is the logical part. There are options such as RoCE or
MXoE for running high-performance logical protocols over generic Ethernet
hardware. But we want to talk about InfiniBand.
InfiniBand actually has no logical transfer protocol of its own, only some
native verbs for talking to other nodes. You can use InfiniBand with those
verbs (ibverbs) or run another logical protocol such as IP over it, since
most applications support IP.
OpenMPI supports both, so it's hard to tell which type of logical
connection it uses. You need an OpenMPI installation built with ibverbs
support; run ompi_info and grep for openib or something similar. If your MPI
supports ibverbs, you mostly need to recompile your application so that the
MPI calls are translated to ibverbs. OR you just use IPoIB, which is the
easiest. The OFED package contains an IPoIB driver; most applications need
this driver to resolve hostnames etc. Check whether ifconfig shows an
interface like ib0. If it does, you can just run a normal namd (net or udp)
over this interface and you will use the InfiniBand as IPoIB without ibverbs.
Another issue with ibverbs and RDMA is the permissions on the InfiniBand
device. Try whether you can run the tests both as root and as your user. If
your user can't access the device, you can't run parallel jobs, and you
don't want to run them as root. Please check this (/dev and the udev rules)
to make sure your permissions allow use of the InfiniBand device.
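
The checks described above could be scripted roughly as follows. This is
only a sketch: the function names are made up for illustration, and the
device node /dev/infiniband/uverbs0 is an assumption about a typical OFED
install, so look in /dev/infiniband for the real name on your system.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of OFED, OpenMPI or NAMD.

# Succeeds if the given text (e.g. the output of ompi_info) mentions openib.
has_openib() {
    printf '%s\n' "$1" | grep -qi 'openib'
}

# Succeeds if a network interface with this name exists (e.g. ib0 for IPoIB).
iface_exists() {
    [ -e "/sys/class/net/$1" ]
}

# Succeeds if the current user may read and write the given device node.
can_use_device() {
    [ -r "$1" ] && [ -w "$1" ]
}

# Run all three checks and print one line per result.
check_ib_setup() {
    # Assumed default device path; pass the real one as the first argument.
    dev="${1:-/dev/infiniband/uverbs0}"

    if command -v ompi_info >/dev/null 2>&1 && has_openib "$(ompi_info 2>/dev/null)"; then
        echo "OpenMPI: openib component found"
    else
        echo "OpenMPI: no openib component found"
    fi

    if iface_exists ib0; then
        echo "IPoIB: ib0 present"
    else
        echo "IPoIB: no ib0 interface"
    fi

    if can_use_device "$dev"; then
        echo "verbs device: $dev usable by $(id -un)"
    else
        echo "verbs device: $dev not accessible for $(id -un)"
    fi
}
```

Calling check_ib_setup once as root and once as your normal user should show
quickly whether the problem is one of permissions.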

In my opinion, since there's mostly no gain from ibverbs for a small number
of nodes, even for CUDA runs, use the IPoIB driver with the advice on
connected mode and MTU, and you will be fine.
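
To verify that the connected mode and MTU settings quoted below actually
took effect, something like the following could be used. It is a sketch
only: ipoib_ok is a hypothetical helper, and it assumes the interface is
called ib0 and exposes the usual mode and mtu files under /sys/class/net.

```shell
#!/bin/sh
# Sketch only: ipoib_ok is a hypothetical helper, not an OFED tool.
# Succeeds if the interface is in connected mode with an MTU of 65520.
# The optional second argument (sysfs base directory) exists only for testing.
ipoib_ok() {
    iface="${1:-ib0}"
    base="${2:-/sys/class/net}"
    # Missing files simply leave the variables empty.
    mode=$(cat "$base/$iface/mode" 2>/dev/null)
    mtu=$(cat "$base/$iface/mtu" 2>/dev/null)
    echo "$iface: mode=${mode:-unknown} mtu=${mtu:-unknown}"
    [ "$mode" = "connected" ] && [ "$mtu" = "65520" ]
}
```

Running ipoib_ok ib0 on every node before starting namd prints the current
settings and returns a non-zero status if either one is off.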

Best regards

Norman Geist.

-----Original Message-----
From: David Hemmendinger
Sent: Monday, 21 November 2011 17:04
Subject: Re: Re: namd-l: namd-ibverbs fails to start

        I think that I wasn't clear enough, not being very familiar
with ibverbs infiniband. We are using RoCE, which, as I understand it,
runs the infiniband protocol over tcp. The IBM tech who installed it
on our cluster ran tests, which it passed, and I'm able to use it with
OpenMPI 1.4.2. My comment about the failure of ibv_rc_pingpong referred
only to what happened before I knew that I needed to specify -g GID,
a recently-added feature, according to Mellanox documentation.
        So my question was whether this new feature would mean that
I'd need to modify an ibverbs call in charmrun -- since we're running
over 10 Gb Ethernet, we don't have an ib0.

>this doesn't seem to be a problem of namd or charmrun, but rather a problem
>of your infiniband configuration/installation. If the tests shipped with
>ofed fail, then there's something wrong. If you don't want to spend too much
>time with the problem, use the ipoib driver to use the infiniband with ip
>traffic. Then you can just use a udp (faster) or tcp (NET version, usually
>slower than udp) version of namd over the ip over infiniband stack, which for
>me was faster than the native verbs. Another advantage is that you can also
>use every possible mpi application like that. Keep in mind to change the
>connection mode to connected, _not_ datagram, and set the mtu to 65520.
>$> echo connected > /sys/class/net/ib0/mode
>$> ifconfig ib0 mtu 65520
>If that doesn't work either, something is wrong with your ofed or infiniband,
>then we can look further
