Re: Restarting Multiple Walker/Well Tempered Metadynamics

From: Giacomo Fiorin (giacomo.fiorin_at_gmail.com)
Date: Thu Oct 25 2012 - 16:18:49 CDT

Hello Jeff, are you getting any warnings when you restart (i.e. at 5 ns)
saying that one replica can't read the state files generated by the other
replicas?

I assume that all replicas have access to the files of the other replicas,
generated both in the current job and in the previous one. Could you tell
me what kind of file system is hosting the files? When one replica starts,
the first thing it does is write an updated state file for the other
replicas to use: it may be that this file is still incomplete when the
other replicas try to access it, or that it is complete but the networked
file system is slow to catch up.

If you find a recurrent issue of this kind, I can send you a modified
version of the code that keeps retrying to read the other replicas' files
at the beginning of the job, and then falls back to the long update
interval that you set. Roughly, the logic would look like the sketch below.
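
This is a minimal standalone C++ sketch of that startup logic, not the
actual colvars code; the file names and the completeness check are just
for illustration:

---------------------------------------------------------------------------------------------
// Standalone sketch: at startup, retry reading each other replica's
// state file at a short interval; once every file has been read, fall
// back to the configured replicaUpdateFrequency.
#include <chrono>
#include <fstream>
#include <iostream>
#include <string>
#include <thread>
#include <vector>

// Heuristic completeness check: the file exists and its last non-empty
// line closes a colvars state block. Illustrative only.
bool state_file_complete(const std::string &path) {
  std::ifstream is(path);
  if (!is) return false;
  std::string line, last;
  while (std::getline(is, line)) {
    if (!line.empty()) last = line;
  }
  return !last.empty() && last[last.size() - 1] == '}';
}

int main() {
  // Hypothetical file names; in practice they come from the registry.
  const std::vector<std::string> others = {"rep01.colvars.state",
                                           "rep02.colvars.state"};
  for (const auto &path : others) {
    while (!state_file_complete(path)) {
      std::cerr << "waiting on " << path << ", retrying in 5 s\n";
      std::this_thread::sleep_for(std::chrono::seconds(5));
    }
    std::cerr << "read " << path << "\n";
    // ...the hills from this state file would be loaded here...
  }
  // After startup, polling resumes at the long replicaUpdateFrequency.
  return 0;
}
---------------------------------------------------------------------------------------------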

The well-tempered correction for the multiple-replica case is at line 1568.
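
For reference, here is a minimal standalone illustration of the scaling
you ask about below; it is not the colvarbias_meta.C source, the
temperatures are arbitrary, and the expression simply follows your reading
of lines 1512-1517:

---------------------------------------------------------------------------------------------
// Standalone illustration only, not the colvarbias_meta.C source.
#include <iostream>

int main() {
  const double T = 300.0;          // simulation temperature (K)
  const double DT = 1600.0;        // biasTemperature (K)
  const bool well_tempered = true;

  // The single-replica scaling factor as read from lines 1512-1517:
  const double multiply_constant =
      well_tempered ? -1.0 * (DT + T) / T : -1.0;

  const double bias_energy = 2.5;  // example bias value (kcal/mol)
  std::cout << "free energy estimate: "
            << multiply_constant * bias_energy << " kcal/mol\n";
  return 0;
}
---------------------------------------------------------------------------------------------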

Best,
Giacomo

On Thu, Oct 25, 2012 at 2:17 PM, Jeff Wereszczynski
<jmweresz_at_mccammon.ucsd.edu> wrote:

> Hi NAMD List,
>
> I have another question about using multiple walker/well tempered
> metadynamics in NAMD. Briefly, here is the metadynamics portion of my
> colvars file:
>
>
> ---------------------------------------------------------------------------------------------
> colvarsRestartFrequency 250000
>
> ....
> metadynamics {
>     name                       metad
>     colvars                    colvar1 colvar2
>     hillWeight                 0.3
>     hillWidth                  10
>     newHillFrequency           100
>     replicaUpdateFrequency     1000
>
>     wellTempered               on
>     biasTemperature            1600
>
>     multipleReplicas           on
>     replicaID                  00
>     replicasRegistry           /scratch/scratchdirs/jmweresz/srta/meta_1600/repfile
>
>     dumpFreeEnergyFile         on
>     useGrids                   on
>     saveFreeEnergyFile         on
>     dumpPartialFreeEnergyFile  on
> }
>
>
> ---------------------------------------------------------------------------------------------
>
> I have a total of 21 replicas running, all with identical colvars files
> except for the "replicaID" field. I have been running the simulations in
> 5 ns increments, with all of the replicas running at the same time: all
> 21 replicas run from 0-5 ns using their own "input1.inp" files, wait at
> the 5 ns point for any slow replicas to catch up (a short wait), and then
> run from 5-10 ns with "input2.inp" files. I also set
> colvarsRestartFrequency so that restart files and PMFs are only written
> every 500 ps.
>
> Everything proceeds normally for the first 5 ns, but my issue shows up
> when I start the second input file. From what I can tell, when input2.inp
> starts up, each replica reads its own colvars.state file properly, but it
> does not read the colvars.state files of the other replicas. As a result,
> between the start of the second input file and the first "colvars
> restart" interval, the sampling in each replica contains hills from that
> replica's initial run but not from the other replicas. As the simulation
> proceeds, new hills from the other replicas are added, but their full
> sampling history is not added until a colvars restart interval is
> reached.
>
> If that is confusing, maybe this example makes more sense. With each
> input file running for 5 ns, and a colvarsRestartFrequency of 500 ps, an
> individual replica sees hills from previous sampling times like this:
>
> Time     Hills from this replica     Hills from other replicas
> 5 ns     0-5 ns                      0-5 ns
> 5.5 ns   0-5.5 ns                    *5-5.5 ns*
> 6 ns     0-6 ns                      0-6 ns
>
> This problem is quite clear when I look at the PMFs at 5.5 ns: the PMFs
> should be nearly identical between replicas, but they are not. In fact,
> they mostly look like the "partial" PMFs (since that is where most of
> their sampling comes from). At 6 ns, though, the PMF files look nearly
> identical between replicas.
>
> Overall this creates a convergence problem, as there is a chunk of time
> during which metadynamics is not properly sampling the underlying PMF.
> One way to minimize this is to reduce colvarsRestartFrequency to a small
> value (say 500 steps), so that only a few hills are added on the
> incorrect metadynamics landscape, but this isn't ideal: I then get a very
> large number of PMF files and waste time unnecessarily writing restart
> and PMF files. So this seems like a bug to me, but perhaps there's
> something I'm doing wrong?
>
> Also, while I have your ear, I wanted to make sure that the PMFs are
> being properly computed. Looking in the colvarbias_meta.C file, I can see
> on lines 1512-1517 how the scaling factor for the well-tempered algorithm
> is applied in the single-replica case: the multiply_constant variable is
> set to -1*(DT+T)/T if well-tempered is used and to -1 if it isn't. But it
> looks to me like in the multiple-replica case it is only ever set to -1
> (line 1539), which would mean that the scaling factor is not being
> applied there. Of course, I could be reading this file wrong and there
> could be a correction elsewhere, so I just wanted to check.
>
> Thanks for your help!
>
> --
> Jeff Wereszczynski
> Postdoctoral Scholar
> University of California, San Diego
> http://mccammon.ucsd.edu/~jwereszc
>
>
