Re: Multinode NAMD CUDA GPU Selection

From: Norman Geist (norman.geist_at_uni-greifswald.de)
Date: Fri Aug 19 2011 - 00:53:31 CDT

Axel,

Sorry for replying twice. What I mean is that I don't understand why it
can't be implemented like this:

Let's say, and it's really likely, namd already generates an array in which
PEs and GPUs are assigned to each other. Why not generate this array like this:

- repeat the +devices string until the number of entries >= the number of PEs
- then use the PE-ID as the index into the +devices list
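
The two steps above can be sketched roughly as follows. This is only an illustration of the proposal, not NAMD code: the function name `assign_devices` is made up here, and the example assumes that global PE-IDs are handed out in nodelist order (four PEs on c35, then four on c33), which is an assumption and not something confirmed in this thread.

```python
from itertools import cycle, islice


def assign_devices(device_string, num_pes):
    """Return a list whose entry i is the GPU id for the PE with global id i.

    Hypothetical helper illustrating the proposal; not part of NAMD.
    """
    devices = [int(d) for d in device_string.split(",")]
    # Step 1: repeat the device list until it covers every PE
    # (cycle repeats it endlessly, islice cuts it at num_pes entries).
    # Step 2: the position in the resulting list is the global PE-ID.
    return list(islice(cycle(devices), num_pes))


# Nodelist from the original question; the PE-ID order is an assumption.
hosts = ["c35"] * 4 + ["c33"] * 4
mapping = assign_devices("1,1,1,1,0,0,0,0", len(hosts))

for pe_id, (host, gpu) in enumerate(zip(hosts, mapping)):
    print(f"PE {pe_id} on {host} -> GPU {gpu}")
```

With a short list such as `assign_devices("0,1", 5)` the entries simply wrap around, so a homogeneous setup would behave as it does with the per-node reading.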

This would be an absolutely clean solution, which works exactly like the
current implementation, but adds the possibility of special settings for
special cases. Don't get me wrong, I don't want to criticize the namd
developers, I just would have implemented it this way, because: Why not!
It's clean! It works for all possible cases! It doesn't matter whether the
environment is homogeneous or not.

What do you think about that?

Norman Geist.

-----Original Message-----
From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] On Behalf Of
Axel Kohlmeyer
Sent: Thursday, 18 August 2011 17:39
To: Norman Geist
Cc: Namd Mailing List
Subject: Re: namd-l: Multinode NAMD CUDA GPU Selection

norman,

On Thu, Aug 18, 2011 at 2:25 AM, Norman Geist
<norman.geist_at_uni-greifswald.de> wrote:
> Hi experts,
>
> yesterday I observed some, let's say, unfavorable behavior of namd cuda
> job spawning. I was testing multinode gpu runs when I found out that namd

that the nodes of a cluster are identical is a common and
very valid assumption made by many parallel applications. support
for inhomogeneous machines would make things _much_ more
complicated for very little gain.

> reads the +devices parameter from the beginning on every node. Not per
> process, but per node, namd starts reading the devices string from the start,
> which makes it impossible to work with different nodes. Even if I have a
> nodelist like:

you can try using nvidia-smi to set the GPU that you don't want to use
for namd to "compute disable" mode. never tried it with namd myself
(or needed to do it).

axel.

>
> host c35
> host c35
> host c35
> host c35
> host c33
> host c33
> host c33
> host c33
>
> and type a device string like +devices 1,1,1,1,0,0,0,0, it will try to use
> gpu:1 on all processes. Is there any way to influence this without
> hacking the namd source? There has to be a possibility for such things.
> Maybe I have two nodes, one with a quadro and a tesla and the other
> with only one tesla, and I only want to use the teslas. Or I just don't want
> all gpus, because I want to run multiple jobs on one machine. I already have
> a script that gives the right gpu id for every node and generates such a
> device string. But for that to work, the PEs must determine which gpu to
> bind to by their real PE-ID, which works fine for one node, but not with
> multiple nodes. That would be better and no big change to the current
> function.
>
> Please tell me your view.
>
> Thanks
>
> Norman Geist

-- 
Dr. Axel Kohlmeyer
akohlmey_at_gmail.com  http://goo.gl/1wk0
Institute for Computational Molecular Science
Temple University, Philadelphia PA, USA.

This archive was generated by hypermail 2.1.6 : Wed Feb 29 2012 - 15:57:36 CST