From: Brian Radak (brian.radak_at_gmail.com)
Date: Mon Mar 26 2018 - 11:57:28 CDT

Hi again Francesco,

It is definitely the intent of the developers to provide a straightforward
FEP platform that is usable by experts and non-experts alike - your
feedback will help us better achieve that, so thank you. I certainly had
some of the same issues you describe even before joining the development
team.

You should definitely be able to divide the lambda range into multiple
segments and straightforwardly concatenate them in order for use with
ParseFEP (just as you described).

I just tested the following with the latest versions of ParseFEP and NAMD
(but the 2.12 release should work the same):

runFEP 0.0 0.5 0.25 <numsteps> ; # produces forw1.fep
runFEP 0.5 1.0 0.25 <numsteps> ; # produces forw2.fep
cat forw1.fep > forw.fep && cat forw2.fep >> forw.fep

runFEP 1.0 0.5 0.25 <numsteps> ; # produces back1.fep
runFEP 0.5 0.0 0.25 <numsteps> ; # produces back2.fep
cat back1.fep > back.fep && cat back2.fep >> back.fep

and then loaded forw.fep and back.fep into ParseFEP.

This still does not, to my knowledge, permit restarts, but it will let you
run longer per lambda and make better use of parallel resources when faced
with limited wall times.
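
If you end up with more than two segments per direction, the concatenation
can be scripted. The following is only a minimal sketch: the function name
concat_segments and the file naming (forw1.fep, forw2.fep, ... numbered in
the order the segments were run) are assumptions for illustration, not
anything NAMD or ParseFEP prescribes.

```shell
# Sketch: join FEP output segments in the order they were run.
# Assumed (hypothetical) naming: <prefix>1.fep ... <prefix>N.fep,
# e.g. forw1.fep + forw2.fep -> forw.fep
concat_segments() {
    prefix=$1      # "forw" or "back"
    nseg=$2        # number of segments for this direction
    out="${prefix}.fep"
    tmp="${out}.tmp"
    : > "$tmp"                         # start from an empty temporary file
    i=1
    while [ "$i" -le "$nseg" ]
    do
        seg="${prefix}${i}.fep"
        if [ ! -f "$seg" ]; then
            echo "missing segment: $seg" >&2
            rm -f "$tmp"
            return 1
        fi
        cat "$seg" >> "$tmp"           # append in run order
        i=$((i + 1))
    done
    mv "$tmp" "$out"                   # only replace the output on success
}
```

For the two-segment example above this would be called as
"concat_segments forw 2" and "concat_segments back 2". For the backward
direction the numbering follows the run order too, so back1.fep is the
1.0 to 0.5 segment and back2.fep is the 0.5 to 0.0 segment.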

HTH,
BKR

On Sun, Mar 25, 2018 at 5:24 AM, Francesco Pietra <chiendarret_at_gmail.com>
wrote:

> In retrospect, I must admit that my question "why 50 lambda windows were
> not enough to reach quasi convergence for the Unbound case?" was a silly
> question. Reaching quasi convergence is a serious problem.
>
> I have now tried further with the Unbound case by increasing both
> pre-equilibration and FEP steps, while reducing the number of windows in
> order to keep the calculations within the allowed 24 hr at the cluster (more
> than one node of 36 cores is attended by poorer performance). I got better
> results for free energy: red line at -5.5, black line at -10.0. Again, the
> system evidently responds positively when given more opportunity to
> converge.
>
> In my view, all that shows that present analysis tools do not allow
> carrying out FEP simulations with protein-ligand complexes of current real
> interest. It should be possible to carry out FEP simulations in steps,
> say lambda 0.0-0.5 and 0.5-1.0, or, very likely, in even smaller fractions,
> and finally concatenate the .fepout files. The alternative requires
> unlimited execution time at the cluster, which is not my case, and which
> would anyway carry the risk of easily losing everything.
>
> One could object that "organic" ligands pose problems of compatibility
> with protein FF. However, I noticed that quasi convergence was reported by
> merely parameterizing the organic ligands with GAFF FF at semiempirical
> level (Chem. Sci., 2016, 7, 207). I believe that by fitting dihedrals and
> water interaction for CHARMM at the HF/6-31G* level, I did better.
>
> I appreciated very much receiving from Brian: "You can indeed, under certain
> circumstances, just concatenate output from multiple runs, but things must
> be in the right format for ParseFEP to detect it. That means you would have
> to sort and interleave the data at the same lambda value, probably strip
> out some of the comments, and then re-concatenate everything into a single
> file."
> I am willing to undertake such a job, because the alternative is abandoning
> the project for which I received a grant (which has never happened to me
> before). However, I would need some more detailed instructions, perhaps a
> sketch of the procedure, or something similar.
>
> Thanks
>
> francesco
>
>
> ---------- Forwarded message ----------
> From: Francesco Pietra <chiendarret_at_gmail.com>
> Date: Fri, Mar 23, 2018 at 9:31 PM
> Subject: Re: vmd-l: Fwd: ParseFEP for restating FEP
> To: Brian Radak <brian.radak_at_gmail.com>
> Cc: NAMD <namd-l_at_ks.uiuc.edu>, VMD Mailing List <vmd-l_at_ks.uiuc.edu>
>
>
> Hi Brian:
>
> (1)
>
> running each lambda as its own NAMD job,
>>
>
> As the very reason for that is to have more lambda windows than 24 hr of
> cluster time would allow with a standard execution, that strategy is simply
> impossible with the organization of the cluster I have access to: there are
> no short queues.
>
>
> (2)
>
> concatenate output from multiple runs, but things must be in the right
>> format for ParseFEP to detect it.
>>
>
> NAMD has quite a varied audience, including people who, like
> myself, have an experimental background (biochemistry) and are short on
> programming skills (and short of time to learn programming at the level
> these matters require). Could the NAMD team consider devoting some time
> (probably not much, in the hands of experts) to allowing semi-automatic
> concatenation of .fepout outputs from FEP simulations as currently carried out?
>
>
> (3)
>
> Another, probably much more serious problem, is why 50 lambda windows were
> not enough to reach quasi convergence for the Unbound case? The ligand is
> not small and quite complex, like diterpenoids are. However, I spent
> several weeks in parameterizing it with dihedral and water-interaction
> fitting. Both the ligand alone, and the complex with the protein, allow
> hundreds of ns of MD without any problem. I am aware that the FEP tutorial
> for ligand-protein was for a peptide ligand, declaring the wish to avoid
> the problems with "natural products". However, should the latter not be
> workable with FEP, a whole area of interest (think of pharmaceutical
> interests) would be ruled out.
>
>
> I hope that namd team will take (2) above into serious consideration,
> without having to wait for namd 2.13.
>
> Thanks a lot for all your advice.
>
> francesco
>
>
>
>
> On Fri, Mar 23, 2018 at 4:48 PM, Brian Radak <brian.radak_at_gmail.com>
> wrote:
>
>> Unfortunately there is not currently an elegant solution for this. You
>> can indeed, under certain circumstances, just concatenate output from
>> multiple runs, but things must be in the right format for ParseFEP to
>> detect it. That means you would have to sort and interleave the data at the
>> same lambda value, probably strip out some of the comments, and then
>> re-concatenate everything into a single file.
>>
>> If you are running each lambda as its own NAMD job, then this should not
>> be too hard - just "grep -v #" the fepout files after the first one:
>>
>> Here's a bash example for N jobs (N-1 restarts) and M lambda values --
>> I've assumed a specific file naming scheme that should be pretty
>> transparent and mappable onto whatever you've done.
>>
>> rm -f all_lambda.fepout 2> /dev/null
>> for ((m=0; m<$M; m++))
>> do
>>   cat lambda${m}_job1.fepout >> all_lambda.fepout
>>   for ((n=2; n<=$N; n++))
>>   do
>>     grep -v "#" lambda${m}_job${n}.fepout >> all_lambda.fepout
>>   done
>> done
>>
>> all_lambda.fepout *should* work in ParseFEP, but I've never actually
>> tried something like this.
>>
>> I know this is not pretty and we're trying to improve things for NAMD
>> 2.13, especially with regards to automating parallel runs.
>>
>> Brian
>>
>>
>>
>> On Fri, Mar 23, 2018 at 10:00 AM, Francesco Pietra <chiendarret_at_gmail.com
>> > wrote:
>>
>>> Otherwise, is it possible to run portions of lambda and then concatenate
>>> the .fepout results?
>>> ---------- Forwarded message ----------
>>> From: Francesco Pietra <chiendarret_at_gmail.com>
>>> Date: Fri, Mar 23, 2018 at 8:05 AM
>>> Subject: ParseFEP for restating FEP
>>> To: NAMD <namd-l_at_ks.uiuc.edu>, VMD Mailing List <vmd-l_at_ks.uiuc.edu>
>>>
>>>
>>> Hello:
>>> May I ask whether there is any plan to make ParseFEP plugin capable of
>>> dealing with restarted FEP?
>>>
>>> With receptor-ligand, I found it difficult to get matching free energy
>>> for frwd/back in the 24 hr allowed on the cluster, taking into account
>>> that I used the max number of nodes for the given size of the system.
>>>
>>> Thus, for the UNBOUND simulations, with either SOS or BAR estimator,
>>> Probability Distribution was found to improve from 1 to 4 until good
>>> overlapping, while Free Energy also improves correspondingly, albeit not
>>> doing better than red at -2.50, black at -10.0. By restarting until 100
>>> windows, probably I should have acceptable convergence. Curiously
>>> Enthalpy/Entropy match better (artifact).
>>>
>>> I used fifty windows with 50,000 pre-equilibration and 300,000 FEP steps
>>> only (1 node), in order to stay within the 24 hr (a smaller number of
>>> windows, even with much pre-equilibration and FEP, performs worse).
>>> With the ligand-protein I could use a max of 4 nodes (beyond which no
>>> higher speed) with perhaps a max of 10 windows. Bad prospects for
>>> convergence.
>>>
>>> Thanks for your attention
>>>
>>> francesco pietra
>>>
>>>
>>
>
>