Re: REMD across gpus

From: Axel Kohlmeyer (akohlmey_at_gmail.com)
Date: Tue May 10 2011 - 15:52:42 CDT

On Tue, May 10, 2011 at 4:29 PM, Massimiliano Porrini
<M.Porrini_at_ed.ac.uk> wrote:
> Dear Axel,
>
> Thanks for the useful reply.
>
> Only one clarification:
>
> Did your word "mess" refer only to the hybrid (gpu/cpu) simulation,
> or also to REMD across multiple GPUs?

mostly.

> I mean, is REMD across GPUs easily doable?

i don't know. as far as i know, the NAMD implementation
is entirely based on Tcl script code using a socket interface.
that is an elegant way of implementing REMD without having
to rewrite the executable. i did something similar a while ago
in python using pypar and HOOMD and it was fun to see it
work, but somehow a bit unsatisfactory.

i know of other codes that have such features and other
multi-replica methods implemented on a lower level and
they appear to me more elegant to use. that may be
the price to pay.

cheers,
    axel.

> Thanks again,
> MP
>
> PS: by the way I am already running REMD on our cluster, but it is
> "slightly" slow.
>
>
> 2011/5/10 Axel Kohlmeyer <akohlmey_at_gmail.com>:
>> On Mon, May 9, 2011 at 10:28 AM, Massimiliano Porrini
>> <M.Porrini_at_ed.ac.uk> wrote:
>>> Dear all,
>>>
>>> I am attempting to run replica exchange MD across my three GPUs
>>> (two Teslas C2070 and one GTX 470).
>>>
>>> They are all inside my GPU workstation, hence I have one node with three GPUs.
>>>
>>> I am using the example files for the deca-alanine folding and I have
>>> set the number of replicas to 30 and have added the flags
>>> "+devices 0,1,2" and "+idlepoll" in the
>>
>> the problem is the devices flag. you will have to alternate between
>> "+devices 0", "+devices 1" and "+devices 2" to have the jobs scattered
>> across the GPUs.
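A minimal sketch of the alternating-device idea above, written in Python rather than the Tcl of the replica script. It only illustrates the round-robin assignment and the resulting command strings; the names assign_devices and namd_command are hypothetical, and how a per-replica command would be passed to spawn_namd_simple is not shown because it depends on the script.

```python
def assign_devices(num_replicas, device_ids):
    """Round-robin assignment: replica i gets device_ids[i % len(device_ids)]."""
    if not device_ids:
        raise ValueError("need at least one device id")
    return [device_ids[i % len(device_ids)] for i in range(num_replicas)]


def namd_command(namd_binary, device_id):
    """Build a command string with a single GPU per replica.

    +devices and +idlepoll are the flags quoted in the post; the binary
    name is whatever the caller supplies (an assumption here).
    """
    return f"{namd_binary} +devices {device_id} +idlepoll"


if __name__ == "__main__":
    # 30 replicas over GPUs 0, 1, 2 gives 10 replicas per GPU.
    devices = assign_devices(30, [0, 1, 2])
    for replica, dev in enumerate(devices):
        print(replica, namd_command("namd2", dev))
```

With 30 replicas and three devices, each GPU index is handed out 10 times, which is the split the original post was hoping for.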
>>
>> also, if you want to do hybrid (cpu/gpu) you'd have to have two sets
>> of namd binaries (with and without GPU support) and figure out how
>> many CPU cores you have to use to have a runtime comparable to
>> one CPU core plus GPU.
>>
>> i'd rather get time on a simple cluster and forget about this mess.
>>
>> axel.
>>
>>> fold_alanin.conf file, as you can see:
>>>
>>>  set spawn_namd_command [list spawn_namd_simple "[file join $namd_bin_dir namd2] +devices 0,1,2 +idlepoll +netpoll"]
>>>
>>> I expected to get 10 replicas per GPU, but unfortunately I did not.
>>> Apparently the above modifications are not enough.
>>>
>>> Indeed, when I checked which and how many GPUs are actually being used
>>> (via the command nvidia-smi -a) I saw that only the first GPU listed is
>>> used, in the above case number 0 (GTX480).
>>> Likewise, if I type "+devices 1,2" the script uses only number 1
>>> (i.e. one Tesla C2070).
>>>
>>> Any hint/suggestion would be really appreciated.
>>>
>>>
>>> At this point, another (most likely daft) question occurred to me:
>>>
>>> Is it possible to run a "hybrid" simulation across GPU and CPU cores?
>>> I mean: would NAMD be able to run a REMD simulation with e.g. 32
>>> replicas, where 8 replicas are run using the CPU, 8 on the GTX,
>>> 8 on one Tesla and the last 8 on the other Tesla?
>>>
>>> Thanks a lot in advance.
>>>
>>> All the best,
>>> MP
>>>
>>>
>>>>
>>>> --
>>>> Dr Massimiliano Porrini
>>>> P. E. Barran Research Group
>>>> Institute for Condensed Matter and Complex Systems
>>>> School of Physics & Astronomy
>>>> The University of Edinburgh
>>>> James Clerk Maxwell Building
>>>> The King's Buildings
>>>> Mayfield Road
>>>> Edinburgh EH9 3JZ
>>>>
>>>> Tel +44-(0)131-650-5229
>>>>
>>>> E-mails : M.Porrini_at_ed.ac.uk
>>>>              mozz76_at_gmail.com
>>>>              maxp_at_iesl.forth.gr
>>>>
>>>>
>>>>
>>>> The University of Edinburgh is a charitable body, registered in
>>>> Scotland, with registration number SC005336.
>>>>
>>
>>
>>
>> --
>> Dr. Axel Kohlmeyer
>> akohlmey_at_gmail.com  http://goo.gl/1wk0
>>
>> Institute for Computational Molecular Science
>> Temple University, Philadelphia PA, USA.
>>
>>

