Re: Replica exchange simulation with GPU Acceleration

From: Souvik Sinha (souvik.sinha893_at_gmail.com)
Date: Fri Jan 26 2018 - 12:24:27 CST

Thanks for the replies. I understand that, in the present scenario, it will be
hard to get GPU resources for my replica runs, because of the difficulty of
parallelising GPU jobs under an MPI execution scheme.

Is the replica exchange scheme for multiple-walker ABF implemented
differently from metadynamics or the other NAMD replica exchange
strategies? I am just curious, because my understanding in this area is
limited.
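
For reference, the Colvars documentation describes two different mechanisms:
multiple-walker ABF pools samples through NAMD's replica layer (so it needs a
build launched with +replicas, typically MPI), while multiple-replica
metadynamics exchanges hill data through files on a shared filesystem and
works with plain multicore builds. Below is a minimal sketch of the two
Colvars setups; the colvar definition, atom numbers, frequencies, and file
names are placeholders, and a real run would use only one of the two biases:

   # Placeholder collective variable; ABF additionally needs a bin
   # width and grid boundaries on each biased colvar.
   colvar {
     name dist
     width 0.1
     lowerBoundary  2.0
     upperBoundary 12.0
     distance {
       group1 { atomNumbers 1 }
       group2 { atomNumbers 100 }
     }
   }

   # (a) Multiple-walker ABF: samples are combined across replicas
   #     through NAMD's replica layer (launch with +replicas / MPI).
   abf {
     colvars    dist
     shared     on
     sharedFreq 1000   # how often the walkers pool their samples
   }

   # (b) Multiple-replica metadynamics: replicas read each other's
   #     hills from files on a shared filesystem; no MPI needed.
   metadynamics {
     colvars                dist
     multipleReplicas       on
     replicaID              walker1        # unique label per replica
     replicasRegistry       replicas.txt   # shared registry file
     replicaUpdateFrequency 1000           # how often to re-read other hills
   }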
On 26 Jan 2018 20:43, "Giacomo Fiorin" <giacomo.fiorin_at_gmail.com> wrote:

> In general the multicore version (i.e. SMP with no network) is the best
> approach for CUDA, provided that the system is small enough. With nearly
> everything offloaded to the GPUs in the recent version, the CPUs are mostly
> idle, and adding more CPU cores only clogs up the motherboard bus.
>
> Running CUDA jobs in parallel, particularly with MPI, is a whole other
> endeavor.
>
> In Souvik's case, the setup is difficult to run fast. You may consider
> using the multicore version for multiple-replica metadynamics runs,
> which communicate between replicas through files and do not need MPI.
>
> Giacomo
>
> On Thu, Jan 25, 2018 at 2:40 PM, Renfro, Michael <Renfro_at_tntech.edu>
> wrote:
>
>> I can’t speak for running replicas as such, but my usual way of running
>> on a single node with GPUs is to use the multicore-CUDA NAMD build, and to
>> run namd as:
>>
>> namd2 +setcpuaffinity +devices ${GPU_DEVICE_ORDINAL} +p${SLURM_NTASKS}
>> ${INPUT} >& ${OUTPUT}
>>
>> Where ${GPU_DEVICE_ORDINAL} is “0”, “1”, or “0,1” depending on which GPU
>> I reserve; ${SLURM_NTASKS} is the number of cores needed; and ${INPUT} and
>> ${OUTPUT} are the NAMD input file and the file to record standard output.
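
For anyone adapting this, here is a sketch of how that command might sit in a
Slurm batch script. The #SBATCH options, the fallback to CUDA_VISIBLE_DEVICES,
and the file names are assumptions; the --gres syntax in particular is
site-specific:

   #!/bin/bash
   #SBATCH --ntasks=8        # CPU cores, picked up below as SLURM_NTASKS
   #SBATCH --gres=gpu:1      # reserve one GPU on the node

   # GPU_DEVICE_ORDINAL is assumed to be exported by the scheduler or a
   # site prolog; fall back to CUDA_VISIBLE_DEVICES when it is not.
   GPU_DEVICE_ORDINAL=${GPU_DEVICE_ORDINAL:-$CUDA_VISIBLE_DEVICES}

   INPUT=run.namd            # placeholder NAMD input file
   OUTPUT=run.log            # placeholder log file

   namd2 +setcpuaffinity +devices ${GPU_DEVICE_ORDINAL} \
         +p${SLURM_NTASKS} ${INPUT} >& ${OUTPUT}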
