CUDA on GPU Cluster

From: Arvind Kannan (arvind_at_caltech.edu)
Date: Fri Aug 19 2011 - 17:31:06 CDT

Hello all,

I'm having trouble getting the CUDA version of NAMD to run on my lab's cluster. In particular, despite adding the path to the libcudart.so.2 file in the environment variable LD_LIBRARY_PATH, namd is still unable to locate the file. Calling env tells me that

LD_LIBRARY_PATH=/home/arvind/NAMD-CUDA:/opt/intel/Compiler/11.1/069/lib/intel64:/opt/cuda/lib64

The user's guide suggests that the problem might be that charmrun does not preserve the value of LD_LIBRARY_PATH when run without the ++local option. However, the variable is unchanged after I run charmrun, so that doesn't seem to be the cause. Additionally, the other shared libraries that namd finds successfully don't seem to come from any of the directories in the library path:

        libdl.so.2 => /lib64/libdl.so.2 (0x000000312b400000)
        libcudart.so.2 => not found
        libm.so.6 => /lib64/libm.so.6 (0x000000312bc00000)
        libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x000000312c400000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x000000312d000000)
        libc.so.6 => /lib64/libc.so.6 (0x000000312b000000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x000000312b800000)
        /lib64/ld-linux-x86-64.so.2 (0x000000312ac00000)
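
A "not found" entry means the dynamic loader found no file with that exact name in any directory it searched. The libraries resolved from /lib64 and /usr/lib64 come from the loader's default system locations, which is normal even when LD_LIBRARY_PATH is set. One way to narrow the problem down is to check whether any directory on the path really contains a file named libcudart.so.2. The following is a rough sketch only; the helper name find_lib_in_path is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of NAMD or CUDA): print every location on a
# colon-separated search path that contains the named library file.
# The search path defaults to the current value of LD_LIBRARY_PATH.
find_lib_in_path() {
    lib=$1
    search_path=${2-${LD_LIBRARY_PATH-}}
    old_ifs=$IFS
    IFS=:                       # split the path on colons
    for dir in $search_path; do
        if [ -n "$dir" ] && [ -e "$dir/$lib" ]; then
            printf '%s\n' "$dir/$lib"
        fi
    done
    IFS=$old_ifs                # restore normal word splitting
}

# Example call (prints nothing if no directory contains the file):
#   find_lib_in_path libcudart.so.2
```

If this prints nothing, the directories on the path may hold a differently versioned libcudart file, in which case changing LD_LIBRARY_PATH alone would not be expected to help.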

I tried the user's guide's suggestion of putting

export LD_LIBRARY_PATH=/home/arvind/NAMD-CUDA:$LD_LIBRARY_PATH

in a runscript and running charmrun with the ++runscript option, but it didn't help. LD_LIBRARY_PATH is unchanged by the runscript when it is run by charmrun, even though the script works fine when sourced from the command line.
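
Sourcing a script changes the current shell, whereas a script run as a separate process can only affect programs that it launches itself. A runscript would therefore need to start the command it is handed after setting the variable. The following is a minimal sketch, assuming charmrun passes the program and its arguments to the script as positional parameters; the file name runscript.sh is hypothetical.

```shell
#!/bin/sh
# Write a hypothetical runscript to the current directory.
cat > runscript.sh <<'EOF'
#!/bin/sh
# Prepend the NAMD directory from the post, avoiding a trailing colon
# when LD_LIBRARY_PATH was previously empty or unset.
LD_LIBRARY_PATH="/home/arvind/NAMD-CUDA${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
export LD_LIBRARY_PATH
# Replace this shell with the command given as arguments so that the
# command inherits the modified environment.
exec "$@"
EOF
chmod +x runscript.sh

# Local sanity check without charmrun: show what a child process sees.
./runscript.sh sh -c 'echo "$LD_LIBRARY_PATH"'
```

The local check only confirms that the script passes its environment on to a child process; it says nothing about how charmrun starts processes on remote nodes.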

Any suggestions would be greatly appreciated.

Thanks,
Arvind
