Re: problem restarting multiple walker metadynamics

From: Giacomo Fiorin (giacomo.fiorin_at_gmail.com)
Date: Wed Apr 01 2015 - 15:05:30 CDT

Hello Amy, I tested NAMD 2.9 and I couldn't reproduce the error that you
describe.

I recommend that you run the attached test suite. First, run both replicas
(configuration files test.rep1.namd and test.rep2.namd), then run them
again by restarting from their previous configurations (configuration files
testres.rep1.namd and testres.rep2.namd).
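
A minimal sketch of that two-stage sequence, in Python. The four configuration file names are the ones given above; the executable name "namd2", the choice to start both replicas of a stage at the same time, and the per-replica log file names are assumptions, not part of the test suite.

```python
import subprocess
from typing import List

# Configuration file names are taken from the message above.
FIRST_RUN = ["test.rep1.namd", "test.rep2.namd"]
RESTART_RUN = ["testres.rep1.namd", "testres.rep2.namd"]


def build_stages(namd: str = "namd2") -> List[List[List[str]]]:
    """Return two stages; each stage holds one command per replica.

    "namd2" is an assumed executable name; adjust it to your installation.
    """
    return [[[namd, conf] for conf in stage] for stage in (FIRST_RUN, RESTART_RUN)]


def run_stage(commands: List[List[str]], launch=subprocess.Popen) -> List[int]:
    """Start every replica of one stage together, then wait for all of them."""
    started = []
    for cmd in commands:
        # Hypothetical log name: the configuration file name plus ".log".
        log = open(cmd[-1] + ".log", "w")
        started.append((launch(cmd, stdout=log, stderr=subprocess.STDOUT), log))
    codes = []
    for proc, log in started:
        codes.append(proc.wait())
        log.close()
    return codes


def run_all(namd: str = "namd2") -> None:
    """Run the first stage, then the restart stage, stopping on any failure."""
    for commands in build_stages(namd):
        codes = run_stage(commands)
        if any(codes):
            raise RuntimeError(f"stage failed with exit codes {codes}")
```

Calling run_all() from the directory that holds the test suite would run the initial pair first and the restart pair only after both initial replicas have exited cleanly.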

You should keep the .txt files: these do not contain any physical
information, nor are they updated often, but they are necessary to keep
track of the communication between the two replicas.
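
Before restarting, it can help to confirm that every file these .txt files point to is still on disk. The sketch below is in Python and rests on an assumed layout that should be checked against your own files: the registry is read as whitespace-separated pairs of a replica ID and the path to that replica's .txt file, and each .txt file is read as pairs of a keyword and a file path. The function names are hypothetical.

```python
from pathlib import Path
from typing import List, Tuple


def read_pairs(path: Path) -> List[Tuple[str, str]]:
    """Read whitespace-separated tokens from a text file as consecutive pairs."""
    tokens = path.read_text().split()
    return list(zip(tokens[0::2], tokens[1::2]))


def check_registry(registry: str) -> List[Tuple[str, str, str, int]]:
    """Return (replica, keyword, file, size_in_bytes) rows.

    A size of -1 marks a file that is missing. The file layout is an
    assumption (see the text above), not taken from the colvars documentation.
    """
    rows = []
    for replica, list_file in read_pairs(Path(registry)):
        list_path = Path(list_file)
        if not list_path.is_file():
            rows.append((replica, "listFile", list_file, -1))
            continue
        for keyword, target in read_pairs(list_path):
            target_path = Path(target)
            size = target_path.stat().st_size if target_path.is_file() else -1
            rows.append((replica, keyword, target, size))
    return rows
```

Rows with size -1 are missing files, and rows with size 0 are empty files, such as the empty ".hills" files mentioned earlier in this thread. Relative paths are resolved against the current working directory, so the check should be run from the directory where the simulations were started.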

Giacomo

On Mon, Mar 30, 2015 at 2:23 PM, Amy Rice <arice3_at_hawk.iit.edu> wrote:

> Hi Giacomo,
> I am using NAMD 2.9 and version 2012-03-23 of the colvars module. Thanks!
>
> On Sat, Mar 28, 2015 at 5:50 PM, Giacomo Fiorin <giacomo.fiorin_at_gmail.com>
> wrote:
>
>> Hello Amy, before suggesting a solution I'd need the version of NAMD and
>> the version of the colvars module you were using.
>>
>> There have been several changes recently, both in the colvars module and
>> in how files are read and written by NAMD, which dramatically affect the
>> best solution to your problem.
>>
>> Can you please email the version numbers? Thanks!
>>
>>
>> Giacomo
>>
>>
>> On Fri, Mar 27, 2015 at 12:48 PM, Amy Rice <arice3_at_hawk.iit.edu> wrote:
>>
>>> Thank you, I appreciate the help. In the meantime, is there any sort of
>>> workaround that you are aware of? I've been considering trying to somehow
>>> combine the information from all of my .state files and restart the run
>>> from this (changing the colvar to the proper value for each replica), but
>>> I'm afraid this might affect the results obtained.
>>>
>>> On Thu, Mar 19, 2015 at 6:39 PM, Giacomo Fiorin <
>>> giacomo.fiorin_at_gmail.com> wrote:
>>>
>>>> Ok I'll look into the problem more closely.
>>>>
>>>> Giacomo
>>>> On Mar 19, 2015 11:44 PM, "Amy Rice" <arice3_at_hawk.iit.edu> wrote:
>>>>
>>>>> I have reproduced the restart error multiple times, on gordon (SDSC)
>>>>> and our group's local cluster, both with implicit and explicit solvent. The
>>>>> only time that a metadynamics run has been restarted correctly (that
>>>>> preserves the information from the initial run) was while using a single
>>>>> replica.
>>>>>
>>>>> On Thu, Mar 19, 2015 at 5:26 PM, Giacomo Fiorin <
>>>>> giacomo.fiorin_at_gmail.com> wrote:
>>>>>
>>>>>> You shouldn't have to enable keepHills to continue a simulation.
>>>>>> Have you tried to reproduce the error, i.e. have you tried to run jobs 1
>>>>>> and 2 consecutively? If they are too expensive to recalculate, try with a
>>>>>> short job of 10-20 ps to reproduce the problem.
>>>>>>
>>>>>> Giacomo
>>>>>>
>>>>>> On Thu, Mar 19, 2015 at 11:24 PM, Amy Rice <arice3_at_hawk.iit.edu>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Giacomo,
>>>>>>> My apologies for the delayed response. All of the replicas are
>>>>>>> accessing the same registry file, and the registry itself is giving the
>>>>>>> correct path to the state and hills files for each replica. The state files
>>>>>>> are all up-to-date and have the expected step number for the end of the
>>>>>>> original simulation. I noticed that the registry is also pointing to
>>>>>>> ".hills" files for each replica; however, I didn't use the "keepHills"
>>>>>>> option so these files are all empty. Is it possible this is the source of
>>>>>>> the problem?
>>>>>>> Thank you,
>>>>>>> - Amy
>>>>>>>
>>>>>>> On Sun, Mar 15, 2015 at 10:47 AM, Giacomo Fiorin <
>>>>>>> giacomo.fiorin_at_gmail.com> wrote:
>>>>>>>
>>>>>>>> Sorry for the incomplete email. The keyword replicasRegistry
>>>>>>>> indicates the path to a text file where the latest version of the state
>>>>>>>> file for each replica can be found.
>>>>>>>>
>>>>>>>> Can you check if all replicas see the same registry file, and that
>>>>>>>> the contents of it are up-to-date with the final snapshot of your
>>>>>>>> simulation?
>>>>>>>>
>>>>>>>> Giacomo
>>>>>>>>
>>>>>>>> On Sun, Mar 15, 2015 at 4:45 PM, Giacomo Fiorin <
>>>>>>>> giacomo.fiorin_at_gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hello Amy, the replicas all share the same bias: from the point of
>>>>>>>>> view of each replica, there are 300 kcal/mol of biasing energy in 7.5 ns.
>>>>>>>>> You have not simulated any of those systems longer than 7.5 ns, so that is
>>>>>>>>> the time during which the bias was added.
>>>>>>>>>
>>>>>>>>> You can find the documentation for multiple-replicas metadynamics
>>>>>>>>> here:
>>>>>>>>>
>>>>>>>>> http://www.ks.uiuc.edu/Research/namd/2.10/ug/node58.html#SECTION000135240000000000000
>>>>>>>>>
>>>>>>>>> Note that this feature doesn't yet use the replica-exchange syntax
>>>>>>>>> that is also used in scripts in the lib/replicas folder. It uses temporary
>>>>>>>>> files that are exchanged asynchronously at regular, not too frequent,
>>>>>>>>> intervals.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Mar 12, 2015 at 11:26 PM, Amy Rice <arice3_at_hawk.iit.edu>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> To clarify: the initial simulation was done for 7.5 ns per
>>>>>>>>>> replica, so this corresponds to ~300 kcal/mol of external potential being
>>>>>>>>>> added over a total of 210 ns (28 replicas x 7.5 ns). My apologies for not
>>>>>>>>>> making this clearer in the original message!
>>>>>>>>>> - Amy
>>>>>>>>>>
>>>>>>>>>> On Wed, Mar 11, 2015 at 10:49 PM, Amy Rice <arice3_at_hawk.iit.edu>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> As far as I can tell, there were no instabilities along the way.
>>>>>>>>>>> There were no error messages reported, and the pmfs generated by each
>>>>>>>>>>> replica appear to align. Additionally, all 28 replicas ended normally after
>>>>>>>>>>> 7.5 ns and generated the expected files (coor, vel, dcd, colvars.state,
>>>>>>>>>>> colvars.traj, etc.). Is there anything else I can check to verify that
>>>>>>>>>>> there were no instabilities in the initial run?
>>>>>>>>>>> Thank you for the response,
>>>>>>>>>>> - Amy Rice
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Mar 11, 2015 at 6:29 PM, Giacomo Fiorin <
>>>>>>>>>>> giacomo.fiorin_at_gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Did all the replicas in the first job conclude gracefully? I
>>>>>>>>>>>> want to draw your attention to the fact that you have added 300 kcal/mol
>>>>>>>>>>>> of external potential in 7.5 ns, and I'm not sure there weren't any
>>>>>>>>>>>> instabilities along the way.
>>>>>>>>>>>>
>>>>>>>>>>>> Giacomo
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Mar 11, 2015 at 8:59 PM, Amy Rice <arice3_at_hawk.iit.edu>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>> I am running multiple walker/well tempered metadynamics; I
>>>>>>>>>>>>> have 28 walkers and would ultimately like to run for 15ns each. Due to
>>>>>>>>>>>>> walltime limitations, I have to restart the simulation after 7.5ns, which
>>>>>>>>>>>>> is where the problem is occurring. After inspecting the pmfs generated
>>>>>>>>>>>>> after restarting, it seems to me that the colvar state information from
>>>>>>>>>>>>> before the restart is not being included, and that a new pmf is being
>>>>>>>>>>>>> generated instead. However, the log file shows that the colvars.state file
>>>>>>>>>>>>> is being read:
>>>>>>>>>>>>>
>>>>>>>>>>>>> colvars: ----------------------------------------------------------------------
>>>>>>>>>>>>> colvars: Collective variables biases initialized, 1 in total.
>>>>>>>>>>>>> colvars: ----------------------------------------------------------------------
>>>>>>>>>>>>> colvars: Restarting from file "01/meta.KR12.colvars.state".
>>>>>>>>>>>>> colvars: Restarting collective variable "alpha" from value: 0.050172
>>>>>>>>>>>>> colvars: ----------------------------------------------------------------------
>>>>>>>>>>>>> colvars: Collective variables module initialized.
>>>>>>>>>>>>> colvars: ----------------------------------------------------------------------
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Here is the relevant region of one of the restart
>>>>>>>>>>>>> configuration files:
>>>>>>>>>>>>>
>>>>>>>>>>>>> ------------------------------------------------------------------------------------
>>>>>>>>>>>>> structure KR12.ionized.psf
>>>>>>>>>>>>> coordinates KR12.ionized.pdb
>>>>>>>>>>>>> bincoordinates 01/meta.KR12.coor
>>>>>>>>>>>>> extendedsystem 01/meta.KR12.xsc
>>>>>>>>>>>>> binvelocities 01/meta.KR12.vel
>>>>>>>>>>>>>
>>>>>>>>>>>>> [....]
>>>>>>>>>>>>>
>>>>>>>>>>>>> #colvars
>>>>>>>>>>>>> colvars on
>>>>>>>>>>>>> colvarsConfig alpha01.in
>>>>>>>>>>>>> colvarsInput 01/meta.KR12
>>>>>>>>>>>>>
>>>>>>>>>>>>> -------------------------------------------------------------------------------------
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> and one of the colvar configuration files:
>>>>>>>>>>>>>
>>>>>>>>>>>>> -------------------------------------------------------------------------------------
>>>>>>>>>>>>> colvarsTrajFrequency 5000
>>>>>>>>>>>>>
>>>>>>>>>>>>> colvar {
>>>>>>>>>>>>> name alpha
>>>>>>>>>>>>> width 0.005
>>>>>>>>>>>>>
>>>>>>>>>>>>> lowerboundary 0.0
>>>>>>>>>>>>> upperboundary 1.0
>>>>>>>>>>>>>
>>>>>>>>>>>>> lowerwallconstant 10
>>>>>>>>>>>>> upperwallconstant 10
>>>>>>>>>>>>>
>>>>>>>>>>>>> alpha {
>>>>>>>>>>>>> residueRange 19-28
>>>>>>>>>>>>> psfSegID P1
>>>>>>>>>>>>> }
>>>>>>>>>>>>> }
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> metadynamics {
>>>>>>>>>>>>> colvars alpha
>>>>>>>>>>>>> hillWeight 0.5
>>>>>>>>>>>>> newHillFrequency 100
>>>>>>>>>>>>> hillwidth 2.5066
>>>>>>>>>>>>> wellTempered on
>>>>>>>>>>>>> biasTemperature 3000
>>>>>>>>>>>>> saveFreeEnergyFile on
>>>>>>>>>>>>> writeHillsTrajectory on
>>>>>>>>>>>>> multipleReplicas on
>>>>>>>>>>>>> ReplicaID 1
>>>>>>>>>>>>> replicasRegistry /oasis/scratch/arice3/temp_project/first/registry
>>>>>>>>>>>>> replicaUpdatefrequency 1000
>>>>>>>>>>>>> dumpPartialFreeEnergyFile on
>>>>>>>>>>>>>
>>>>>>>>>>>>> -------------------------------------------------------------------------------------
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Here is the final pmf generated before the restart (after
>>>>>>>>>>>>> 7.5ns):
>>>>>>>>>>>>>
>>>>>>>>>>>>> https://drive.google.com/file/d/0B2-4_f9dh-l2cllJa2QxaDZzQWc/view?usp=sharing
>>>>>>>>>>>>>
>>>>>>>>>>>>> The first three pmfs generated post-restart, same scale:
>>>>>>>>>>>>> (red is after 0.1ns of the restarted run, green is 0.2ns, and
>>>>>>>>>>>>> blue is 0.3ns)
>>>>>>>>>>>>>
>>>>>>>>>>>>> https://drive.google.com/file/d/0B2-4_f9dh-l2NzljbVZITzdlQ0U/view?usp=sharing
>>>>>>>>>>>>>
>>>>>>>>>>>>> Last pmf generated before the restart (pink/purple) and the
>>>>>>>>>>>>> first two after restarting (red and green):
>>>>>>>>>>>>>
>>>>>>>>>>>>> https://drive.google.com/file/d/0B2-4_f9dh-l2b3VDTDhDQ0Jsdjg/view?usp=sharing
>>>>>>>>>>>>>
>>>>>>>>>>>>> As I said, it seems to me that the information from before the
>>>>>>>>>>>>> restart is not being included. Is there a different way to restart multiple
>>>>>>>>>>>>> walker metadynamics runs, or perhaps an option that I neglected to include
>>>>>>>>>>>>> in my configuration files?
>>>>>>>>>>>>> Thank you for your help!
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> Amy Rice
>>>>>>>>>>>>> Ph.D. Student
>>>>>>>>>>>>> Physics Department
>>>>>>>>>>>>> Illinois Institute of Technology
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Giacomo Fiorin
>>>>>>>>>>>> Assistant Professor of Research
>>>>>>>>>>>> Institute for Computational Molecular Science (ICMS)
>>>>>>>>>>>> College of Science and Technology, Temple University
>>>>>>>>>>>> 1925 North 12th Street (035-07), Room 704D
>>>>>>>>>>>> Philadelphia, PA 19122-1801
>>>>>>>>>>>> Phone: +1-215-204-4213
>>>>>>>>>>>> https://icms.cst.temple.edu/members.html
>>>>>>>>>>>> http://giacomofiorin.github.io/

-- 
Giacomo Fiorin
Assistant Professor of Research
Institute for Computational Molecular Science (ICMS)
College of Science and Technology, Temple University
1925 North 12th Street (035-07), Room 704D
Philadelphia, PA 19122-1801
Phone: +1-215-204-4213
https://icms.cst.temple.edu/members.html
http://giacomofiorin.github.io/

This archive was generated by hypermail 2.1.6 : Thu Dec 31 2015 - 23:21:46 CST