From: Axel Kohlmeyer (akohlmey_at_gmail.com)
Date: Fri May 25 2012 - 05:39:37 CDT
On Thu, May 24, 2012 at 5:51 AM, Benjamin Merget
<benjamin.merget_at_uni-wuerzburg.de> wrote:
> Hi @all,
>
> We are running a cluster with four 24-core CPU-only nodes and recently bought
> a GPU/CPU node with 4 Tesla cards and 8 CPU cores. All machines are running
> the Precise Pangolin (64-bit Server) and our queueing system is Torque
> 3.0.4.
>
> Since I wanted to make use of the GPUs, I built an MPI-CUDA version of NAMD.
> My problem is, however, when I try to submit a job to all resources, it crashes
> with the fatal error:
>
> CUDA error in cudaGetDeviceCount on Pe XX (nodeXX): no CUDA-capable device
> is detected
>
> And this for each process on each CPU-only node...
>
> Is there a way to tell NAMD not to look for CUDA devices on the CPU nodes
> (since there obviously are none), but instead only use the GPUs of our
> CPU/GPU node, so that I could use all CPU nodes and the GPU node together?
no, and it is not worth it. just run one calculation on the GPU node
and a second on the rest and enjoy efficient utilization of your hardware.
anything else is just wasting your time.
axel.
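
To illustrate the two-job split suggested above, here is a minimal sketch (not part of the original reply) that divides a node list into the GPU node and the CPU-only nodes, so each group can be handed to its own run. It assumes a Torque-style node file with one host name per allocated core. The host names `cpu01` to `cpu04` and `gpu01` and the helper `split_hosts` are hypothetical and only mirror the hardware described in the question.

```python
from collections import Counter


def split_hosts(nodefile_lines, gpu_hosts):
    """Count core slots per host and separate GPU hosts from CPU-only hosts.

    nodefile_lines: iterable of host names, one per allocated core
                    (assumed layout of a Torque-style node file).
    gpu_hosts:      set of host names known to have GPUs (hypothetical names).
    """
    counts = Counter(line.strip() for line in nodefile_lines if line.strip())
    gpu = {host: n for host, n in counts.items() if host in gpu_hosts}
    cpu = {host: n for host, n in counts.items() if host not in gpu_hosts}
    return gpu, cpu


# Hypothetical layout: four 24-core CPU-only nodes and one 8-core GPU node.
example = (["cpu01"] * 24 + ["cpu02"] * 24 + ["cpu03"] * 24
           + ["cpu04"] * 24 + ["gpu01"] * 8)

gpu_slots, cpu_slots = split_hosts(example, gpu_hosts={"gpu01"})

# Slots available to the CUDA run and to the CPU-only run, respectively.
print(sum(gpu_slots.values()), sum(cpu_slots.values()))
```

The CUDA build would then be started only on the hosts in `gpu_slots`, and a separate non-CUDA build on the hosts in `cpu_slots`, as two independent jobs.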
>
> I recently read about remote CUDA (rCUDA). This way, each CPU-only node
> could utilize the 4 Tesla cards on the GPU node remotely as well through
> Ethernet or InfiniBand. Might this be a solution, or is there a much simpler
> way and I just don't see the forest for the trees?
>
> If rCUDA indeed is the solution for all this, are there any experiences with
> rCUDA and NAMD, because I have absolutely no clue how this could be
> implemented into the code.
>
>
> Thanks very much in advance!
> Benjamin
>
-- 
Dr. Axel Kohlmeyer  akohlmey_at_gmail.com  http://goo.gl/1wk0
College of Science and Technology
Temple University, Philadelphia PA, USA.