Re: run multiple walkers

From: Giacomo Fiorin (giacomo.fiorin_at_temple.edu)
Date: Fri Mar 12 2010 - 16:03:18 CST

Hi Evelyn, how to assign the replicaIDs is up to you. You can, for
example, write a single colvars configuration file, change the replicaID by
hand and save it to a new file each time. For example, save a copy with
"replicaID" set to "r01" as "meta.r01.colvars.in", a copy with
"replicaID" set to "r02" as "meta.r02.colvars.in", and so on.
You'll then use each of these files in a different NAMD configuration file.
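
If editing the copies by hand gets tedious, the same thing can be scripted.
What follows is only a sketch: the placeholder tokens "@REPLICA_ID@" and
"@REGISTRY@", the function name, the registry path, and the body of the
template are all made up for illustration. Only the keywords multipleReplicas,
replicaID and replicaFilesRegistry and the "meta.rNN.colvars.in" naming come
from this message, and the template should be replaced by your own working
configuration.

```python
from pathlib import Path

# Hypothetical template: put your own colvars configuration here and keep
# the two @...@ placeholders where the per-replica values should go.
TEMPLATE = """\
metadynamics {
    # ... your own metadynamics settings ...
    multipleReplicas      on
    replicaID             @REPLICA_ID@
    replicaFilesRegistry  @REGISTRY@
}
"""


def write_replica_configs(template, n_replicas, registry, out_dir="."):
    """Write meta.r01.colvars.in, meta.r02.colvars.in, ... into out_dir.

    Every file is a copy of `template` with the replica ID filled in and
    the same registry path, so that all replicas share one registry.
    """
    out_dir = Path(out_dir)
    written = []
    for i in range(1, n_replicas + 1):
        replica_id = f"r{i:02d}"
        text = template.replace("@REPLICA_ID@", replica_id)
        text = text.replace("@REGISTRY@", str(registry))
        path = out_dir / f"meta.{replica_id}.colvars.in"
        path.write_text(text)
        written.append(path)
    return written
```

Calling write_replica_configs(TEMPLATE, 10, "/full/path/to/replicas.registry.txt")
would write ten files, "meta.r01.colvars.in" through "meta.r10.colvars.in",
in the current directory (the registry path here is just an example).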

You should set replicaFilesRegistry to a file name with its full path,
identical across all replicas. This way it will be readable by all replicas,
especially if you run each replica in a separate directory (which is a good
idea, so that you have fewer files to handle in each one).

One good thing is that you don't need to run all replicas at exactly the
same time. You can even launch each replica as a separate job in the queue
(which is way simpler than "packing" them into a super-job), and each one
will take care of reading the other replicas' files by looking up their file
names from within replicaFilesRegistry.

Another good thing is that you can add more replicas as needed, later in the
simulation as you get more processors available, or stop running some
replicas if you get fewer.

The only bad thing is that all communication is done through files: I wrote
the code so that it stops reading from an incompletely written file, and
then checks again at the next update if more data has been written to the
same file. I tested it on different machines, and it always worked well. But
each cluster/supercomputer handles file buffering differently, so I invite
you to check that all files listed in replicaFilesRegistry are opened,
written to, and closed properly. For example, try "ls -l" to check that they
all have the same size after their jobs are completed. If NAMD ever complains
that it can't read properly from a file because it's corrupt, try to fix
that by removing its name from replicaFilesRegistry.
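
Removing a name from the registry can be done in any text editor, but here
is a small sketch in case you want to script it. It assumes that the registry
is a plain text file in which each entry sits on its own line and the file
name appears as a whitespace-separated word; please check that against your
actual registry before using it. The function name and the ".bak" backup are
my own choices, not something NAMD provides.

```python
from pathlib import Path


def drop_from_registry(registry_path, bad_file):
    """Remove every registry line that mentions bad_file.

    Assumes one entry per line, with the file name as a separate
    whitespace-delimited word. The original registry is saved next to
    it with a ".bak" suffix. Returns the number of lines removed.
    """
    registry = Path(registry_path)
    lines = registry.read_text().splitlines()
    kept = [ln for ln in lines if bad_file not in ln.split()]

    # Keep a backup, so that the entry can be restored later.
    backup = registry.with_name(registry.name + ".bak")
    backup.write_text("".join(ln + "\n" for ln in lines))

    registry.write_text("".join(ln + "\n" for ln in kept))
    return len(lines) - len(kept)
```

It is safest to do this while no replica is in the middle of updating the
registry.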

Just to be clear: there is no issue regarding file writing and reading
related to the multiple replicas scheme as opposed to the regular one, just
the normal issues regarding file management multiplied by the number of
replicas you chose (let's say, 10). If a certain machine has a certain
chance that a normal job will crash and write incomplete output files, that
chance will be multiplied by 10.
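
To put a number on that: if each job fails independently with probability p
(independence is an assumption here), the chance that at least one out of n
jobs fails is 1 - (1 - p)^n, which is close to n*p as long as p is small.
A tiny sketch, with a function name of my own choosing:

```python
def any_failure_probability(p, n):
    """Chance that at least one of n independent jobs fails,
    when each one fails with probability p."""
    return 1.0 - (1.0 - p) ** n
```

For example, with p = 0.01 and n = 10 this gives about 0.096, i.e. slightly
less than the "multiplied by 10" estimate of 0.1.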

Grids will unfortunately be unavailable, so NAMD may be a little slower than
with a single replica. Also, to visualize the total PMF, you'll have to
gather all of the hills from the files listed in replicaFilesRegistry,
insert them into a single state file, and restart a single-replica
calculation, this time with multipleReplicas disabled and useGrids enabled.
I just realized that this step is not properly documented, and will fix
that as soon as possible (probably by uploading a script that collects those
files automatically for post-processing). However, it's not too difficult:
you just have to open one of the state files with a text editor (their format
is very similar to the configuration file), and insert the contents of the
hills files into the "metadynamics" block, right after the "configuration"
block. If you're unsure, run a very short job with a single replica and
"useGrids" disabled to produce a "legal" state file to take inspiration from.
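
Until a proper script is available, the text manipulation itself could be
sketched as below. This is not the official tool, and it relies on an
assumption about the state file: that it uses brace-delimited blocks like the
configuration file, with a "configuration { ... }" block nested inside a
"metadynamics { ... }" block. Please compare with a "legal" state file from a
short test run before trusting it. The function names are hypothetical.

```python
import re


def _matching_brace(text, open_pos):
    """Return the index of the '}' that closes the '{' at open_pos."""
    depth = 0
    for i in range(open_pos, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                return i
    raise ValueError("unbalanced braces in state file")


def insert_hills(state_text, hills_texts):
    """Return state_text with all hills pasted into the metadynamics
    block, right after its configuration block.

    hills_texts is a list of strings, one per hills file, as read
    from the files listed in the registry.
    """
    meta = re.search(r"\bmetadynamics\s*\{", state_text)
    if meta is None:
        raise ValueError("no metadynamics block found")
    meta_close = _matching_brace(state_text, meta.end() - 1)

    conf = re.compile(r"\bconfiguration\s*\{").search(state_text, meta.end())
    if conf is None or conf.start() > meta_close:
        raise ValueError("no configuration block inside metadynamics")
    conf_close = _matching_brace(state_text, conf.end() - 1)

    # Paste the hills right after the '}' closing the configuration block.
    hills = "\n" + "\n".join(t.strip("\n") for t in hills_texts)
    return state_text[:conf_close + 1] + hills + state_text[conf_close + 1:]
```

You would read the state file and each hills file into strings yourself,
call insert_hills(), and save the result under a new name, so that the
original state file stays untouched.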

I hope that this information is enough to guide you. Luckily, it'll look a
little less complicated than I could describe in words once the files start
to be created.

Giacomo

---- ----
Giacomo Fiorin
  ICMS - Institute for Computational Molecular Science
    Temple University
    1900 N 12 th Street, Philadelphia, PA 19122
work phone: (+1)-215-204-4216
mobile: (+1)-267-324-7676
mail: giacomo.fiorin_at_gmail.com
---- ----

On Thu, Mar 11, 2010 at 5:15 PM, Naiyin Yu <blueyny_at_hotmail.com> wrote:

> Dear All,
>
> I am new to NAMD, and I am going to run multiple-walker metadynamics
> using NAMD. So in the script, I should turn on the "multipleReplicas"
> option. Is there anything else I should pay attention to, such as when I
> submit the job to the cluster? I.e., how do I assign the replica_id? And
> does anyone happen to have a sample script that I could use?
>
> thanks a lot!
>
> Evelyn
