megatest test failure/MPI problem

From: Meij, Henk (hmeij@wesleyan.edu)
Date: Wed Oct 15 2008 - 10:14:48 CDT

cluster: Red Hat Linux AS4 x86_64 with 2.6.9-34 kernel
namd: 2.6 source, trying to compile linux-amd64-MPI with gcc
mpi: two flavors: Topspin MPICH (the InfiniBand libs came with the cluster) and OpenMPI (1.2, compiled with GigE and InfiniBand libs).

I'm trying to pass the megatest and detail my steps below. When I get to invoking pgm, I run into a problem that I do not encounter when invoking other programs. It seems basic, but I cannot find a way out. (I'm invoking mpirun directly, as I'm running LSF 6.2.)

-Henk

pwd
/share/apps/NAMD
tar zxvf /share/apps/src/fftw-linux-amd64.tar.gz
vi fftw/linux-amd64/arch/Linux-amd64.fftw # fix path
tar zxvf /share/apps/src/tcl-linux-amd64.tar.gz
vi tcl/linux-amd64/arch/Linux-amd64.tcl # fix path
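# For reference, the path fix in each file is just repointing the
# install-location variable; a minimal sketch, assuming the stock
# FFTDIR/TCLDIR variable names from the NAMD 2.6 arch files and the
# layout above:
#   in Linux-amd64.fftw:  FFTDIR=/share/apps/NAMD/fftw/linux-amd64
#   in Linux-amd64.tcl:   TCLDIR=/share/apps/NAMD/tcl/linux-amd64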
tar zxvf /share/apps/src/NAMD_2.6_Source.tar.gz
cd NAMD_2.6_Source/
# no edits in arch/Linux-amd64-MPI.arch
cd charm-5.9/
vi src/arch/mpi-linux-amd64/conv-mach.sh # point to Topspin's or Openmpi's mpirun
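# The conv-mach.sh change amounts to pinning the compiler wrappers to one
# MPI installation; a minimal sketch, assuming the stock CMK_CC/CMK_CXX
# variable names from charm's mpi-linux-amd64 port and the OpenMPI path above:
#   CMK_CC='/share/apps/openmpi-1.2/bin/mpicc '
#   CMK_CXX='/share/apps/openmpi-1.2/bin/mpiCC '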
/usr/local/topspin/mpi/mpich/bin/mpiCC -show 2>/dev/null | cut -d' ' -f1 # returns g++
/share/apps/openmpi-1.2/bin/mpiCC -show 2>/dev/null | cut -d' ' -f1 # returns g++
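# Note that cutting only the first field shows g++ for both wrappers; the
# rest of each -show line (include paths and -l flags) is where the two
# stacks differ, so comparing the full output confirms which MPI libs each
# wrapper would actually link:
/usr/local/topspin/mpi/mpich/bin/mpiCC -show
/share/apps/openmpi-1.2/bin/mpiCC -show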
# no changes in src/arch/common/
./build charm++ mpi-linux-amd64
# charm++ built successfully.
cd mpi-linux-amd64/tests/charm++/megatest/
make # no errors
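# For an MPI build the generated charmrun is normally a thin wrapper around
# mpirun, so megatest can also be launched the way the charm tests document
# it; a minimal sketch, assuming the standard charm invocation from this
# test directory:
../../../bin/charmrun ./pgm +p4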

# first attempt: with only Topspin in LD_LIBRARY_PATH, libs come up missing
[root@swallowtail NAMD]# echo $LD_LIBRARY_PATH
/opt/lsfhpc/6.2/linux2.6-glibc2.3-x86_64/lib:/usr/local/topspin/mpi/mpich/lib64
[root@swallowtail megatest]# ldd pgm
        libmpich.so => /usr/local/topspin/mpi/mpich/lib64/libmpich.so (0x0000002a95557000)
        libdl.so.2 => /lib64/libdl.so.2 (0x0000003684000000)
        libmpi_cxx.so.0 => not found
        libmpi.so.0 => /opt/lam/gnu/lib/libmpi.so.0 (0x0000002a97797000)
        libopen-rte.so.0 => not found
        libopen-pal.so.0 => not found
        librt.so.1 => /lib64/tls/librt.so.1 (0x0000003689000000)
        libnuma.so.1 => /usr/lib64/libnuma.so.1 (0x0000002a9790f000)
        libnsl.so.1 => /lib64/libnsl.so.1 (0x0000003686d00000)
        libutil.so.1 => /lib64/libutil.so.1 (0x0000003688600000)
        libm.so.6 => /lib64/tls/libm.so.6 (0x00000034d3600000)
        libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x00000034d3800000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000003687b00000)
        libpthread.so.0 => /lib64/tls/libpthread.so.0 (0x0000003684400000)
        libc.so.6 => /lib64/tls/libc.so.6 (0x0000003683b00000)
        libg2c.so.0 => /usr/lib64/libg2c.so.0 (0x00000039aa100000)
        libvapi.so => /usr/local/topspin/mpi/mpich/lib64/libvapi.so (0x0000002a97a17000)
        libmosal.so => /usr/local/topspin/mpi/mpich/lib64/libmosal.so (0x0000002a97b37000)
        /lib64/ld-linux-x86-64.so.2 (0x0000003683900000)
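
# The three "not found" entries are all OpenMPI runtime libraries, and
# libmpi.so.0 is even being resolved from /opt/lam, so pgm appears to be
# linked against more than one MPI stack at once; a quick way to see just
# the MPI linkage:
ldd pgm | grep -i -e mpi -e open-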

# second attempt with OpenMPI
[root@swallowtail megatest]# echo $LD_LIBRARY_PATH
/opt/lsfhpc/6.2/linux2.6-glibc2.3-x86_64/lib:/share/apps/openmpi-1.2/lib
[root@swallowtail megatest]# ldd ./pgm
        libmpich.so => /usr/local/topspin/mpi/mpich/lib64/libmpich.so (0x0000002a95576000)
        libdl.so.2 => /lib64/libdl.so.2 (0x0000003684000000)
        libmpi_cxx.so.0 => /share/apps/openmpi-1.2/lib/libmpi_cxx.so.0 (0x0000002a97797000)
        libmpi.so.0 => /share/apps/openmpi-1.2/lib/libmpi.so.0 (0x0000002a978ba000)
        libopen-rte.so.0 => /share/apps/openmpi-1.2/lib/libopen-rte.so.0 (0x0000002a97a4e000)
        libopen-pal.so.0 => /share/apps/openmpi-1.2/lib/libopen-pal.so.0 (0x0000002a97ba7000)
        librt.so.1 => /lib64/tls/librt.so.1 (0x0000003689000000)
        libnuma.so.1 => /usr/lib64/libnuma.so.1 (0x0000002a97d03000)
        libnsl.so.1 => /lib64/libnsl.so.1 (0x0000003686d00000)
        libutil.so.1 => /lib64/libutil.so.1 (0x0000003688600000)
        libm.so.6 => /lib64/tls/libm.so.6 (0x00000034d3600000)
        libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x00000034d3800000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000003687b00000)
        libpthread.so.0 => /lib64/tls/libpthread.so.0 (0x0000003684400000)
        libc.so.6 => /lib64/tls/libc.so.6 (0x0000003683b00000)
        libg2c.so.0 => /usr/lib64/libg2c.so.0 (0x00000039aa100000)
        libvapi.so => /usr/local/topspin/mpi/mpich/lib64/libvapi.so (0x0000002a97e0b000)
        libmosal.so => /usr/local/topspin/mpi/mpich/lib64/libmosal.so (0x0000002a97f2b000)
        /lib64/ld-linux-x86-64.so.2 (0x0000003683900000)
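
# Even with the OpenMPI paths set, Topspin's libmpich.so is still linked in,
# which suggests both stacks leaked into the link line; one sanity check is
# a clean rebuild with a single MPI first in PATH (a sketch, reusing the
# build commands from above):
export PATH=/share/apps/openmpi-1.2/bin:$PATH
cd /share/apps/NAMD/NAMD_2.6_Source/charm-5.9
rm -rf mpi-linux-amd64
./build charm++ mpi-linux-amd64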

# run pgm on an InfiniBand-enabled node; the machinefile is four lines of
# the node name 'compute-1-1'
# using OpenMPI

[root@swallowtail megatest]# /share/apps/openmpi-1.2/bin/mpirun_ssh -np 4 \
    -machinefile /share/apps/NAMD/NAMD_2.6_Source/charm-5.9/mpi-linux-amd64/tests/charm++/megatest/nodelist.txt \
    /share/apps/NAMD/NAMD_2.6_Source/charm-5.9/mpi-linux-amd64/tests/charm++/megatest/pgm
Can't read MPIRUN_HOST
Can't read MPIRUN_HOST
Can't read MPIRUN_HOST
Can't read MPIRUN_HOST
[root@swallowtail megatest]# cat /share/apps/NAMD/NAMD_2.6_Source/charm-5.9/mpi-linux-amd64/tests/charm++/megatest/nodelist.txt
compute-1-1
compute-1-1
compute-1-1
compute-1-1
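
# As far as I know, OpenMPI installs its launcher as plain mpirun (orterun)
# rather than mpirun_ssh, and "Can't read MPIRUN_HOST" looks like an
# MPICH/MVAPICH startup message, which would be consistent with the Topspin
# libmpich still being linked into pgm. For comparison, the equivalent
# launch with OpenMPI's own mpirun (same paths as above):
cd /share/apps/NAMD/NAMD_2.6_Source/charm-5.9/mpi-linux-amd64/tests/charm++/megatest
/share/apps/openmpi-1.2/bin/mpirun -np 4 -machinefile nodelist.txt ./pgm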
