From: Siva Dasetty (sdasett_at_g.clemson.edu)
Date: Wed Sep 17 2014 - 09:36:03 CDT
Thank you all for the replies.
James- I am comparing with the following paper, where they used more
than 2 GPUs for a much smaller system.
David- I tried turning on twoAwayX, and it only raised the number of
processors I can use from 10 to 15.
Jim- No, the protein is not restrained, and the speeds I am reporting
come from the three benchmark lines in the log file.
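
To illustrate why a small system runs out of patches so quickly, here is a rough Python sketch of the "one patch per process" limit. It assumes a simplified model in which the number of patches along each axis is the system extent divided by a patch size, and in which a twoAway option halves the patch size along its axis. The function names, the 40 A extents, and the 18 A patch size are hypothetical and chosen only for illustration; NAMD's actual patch grid calculation has more inputs, so this should not be read as its real algorithm.

```python
import math


def patches_along_axis(extent, patch_size, two_away=False):
    """Estimate the patch count along one axis (simplified model, an assumption).

    With two_away set, patches are taken to be half-size along this axis,
    which roughly doubles the count.
    """
    split = 2 if two_away else 1
    return max(1, math.floor(split * extent / patch_size))


def max_processes(extents, patch_size, two_away=(False, False, False)):
    """Upper bound on process count if each process needs at least one patch."""
    counts = [
        patches_along_axis(extent, patch_size, flag)
        for extent, flag in zip(extents, two_away)
    ]
    return math.prod(counts)


if __name__ == "__main__":
    # Hypothetical numbers: a 40 x 40 x 40 A system and an 18 A patch size.
    box = (40.0, 40.0, 40.0)
    print("no twoAway:", max_processes(box, 18.0))
    print("twoAwayX  :", max_processes(box, 18.0, (True, False, False)))
```

Under this model, splitting along one axis raises the limit but only by a bounded factor, which is at least qualitatively in line with twoAwayX giving a modest increase rather than removing the limit.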
On Wed, Sep 10, 2014 at 5:32 PM, Jim Phillips <jim_at_ks.uiuc.edu> wrote:
> Does the simulation slow down during the run? Unless your protein is
> somehow restrained (e.g., by MDFF grid forces) it will likely drift off of
> the initial patch grid.
> On Tue, 9 Sep 2014, Siva Dasetty wrote:
>> Dear All,
>> I am running a simulation of a small protein (4K atoms) in implicit solvent
>> using CUDA-enabled NAMD, and I only get 2-3 ns/day when
>> I am using a single node with 10 (or fewer) processors and 2 GPUs (K20s).
>> However, if I try to increase the number of nodes, I get the following error:
>> "CUDA-enabled NAMD requires at least one patch per process."
>> and if I increase the number of processors beyond 10 on a single node, I also get
>> the following note "MPI_ABORT was invoked on rank 4 in communicator
>> with errorcode 1." along with the above fatal error.
>> I tried to follow previous threads in the list archive but couldn't
>> understand much about this (
>> Is this because my system size is too small? Or am I doing something
>> absurd while executing the job?
>> I have seen better benchmarks reported by NVIDIA (
>> http://www.nvidia.com/docs/IO/122634/NAMD-benchmark-report.pdf) and
>> others (