From: Brian Radak (brian.radak_at_gmail.com)
Date: Mon Mar 26 2018 - 16:13:19 CDT

There are a number of ways to do this - although perhaps the hasty approach
I gave was not particularly transparent.

"cat" just dumps the contents of a file and the redirect operators ">" and
">>" just point to where the data goes and indicate whether it ought to
overwrite or append, respectively. You can thus perform multiple append
operations (">>") and accomplish what you want without resorting to more
intermediate files. The && notation is just an easy way to run multiple
commands in sequence on the same terminal line (each command runs only if
the previous one succeeded):

cat frwd01.fepout > frwd.fepout
cat frwd02.fepout >> frwd.fepout
cat frwd03.fepout >> frwd.fepout
cat frwd04.fepout >> frwd.fepout
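
The same result can be had in one command, since cat accepts several file
names and writes them out in the order given. A minimal sketch, assuming the
frwdNN.fepout names used above; the files created here are placeholders in a
scratch directory, not real NAMD output:

```shell
#!/bin/bash
# Placeholder files standing in for real fepout segments
# (assumption: names follow the frwdNN.fepout pattern used above).
workdir=$(mktemp -d)
for i in 01 02 03 04; do
  echo "segment $i" > "$workdir/frwd$i.fepout"
done

# ">" truncates the target first, so data left over from an earlier attempt
# cannot survive; cat then writes the segments in the order listed.
cat "$workdir/frwd01.fepout" "$workdir/frwd02.fepout" \
    "$workdir/frwd03.fepout" "$workdir/frwd04.fepout" > "$workdir/frwd.fepout"

# "&&" runs the second command only if the first one succeeded, so a missing
# first segment stops the chain instead of appending to a bad file.
cat "$workdir/frwd01.fepout" > "$workdir/check.fepout" &&
  cat "$workdir/frwd02.fepout" >> "$workdir/check.fepout"
```

Either form yields the same frwd.fepout; the order of the file names is what
matters for the lambda windows.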

Or, for N files numbered 0 to N-1:

cat frwd0.fepout > frwd.fepout
for((n=1; n<$N; n++))
do
  cat frwd$n.fepout >> frwd.fepout
done
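
The loop above can also be wrapped in a small function that refuses to write
anything if a segment is missing. This is only a sketch: the function name
concat_segments and its arguments are hypothetical, and it assumes the
unpadded 0 to N-1 numbering shown above. An explicit counter is used rather
than a glob such as frwd*.fepout, because a glob sorts lexicographically
(frwd10 before frwd2) and would also match the output file itself.

```shell
#!/bin/bash
# Hypothetical helper: join PREFIX0.fepout ... PREFIX(N-1).fepout,
# in numeric order, into OUT.
concat_segments() {
  local prefix=$1 count=$2 out=$3 n
  # Check every segment first so a missing file does not leave
  # a truncated result behind.
  for ((n = 0; n < count; n++)); do
    if [[ ! -f "${prefix}${n}.fepout" ]]; then
      echo "missing segment: ${prefix}${n}.fepout" >&2
      return 1
    fi
  done
  : > "$out"   # start from an empty output file
  for ((n = 0; n < count; n++)); do
    cat "${prefix}${n}.fepout" >> "$out"
  done
}
```

For example, "concat_segments frwd 4 frwd.fepout" would join frwd0.fepout
through frwd3.fepout.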

Unfortunately, the GUI alternative to this would be to manually load all of
the files by clicking a button each time and I don't see many users wanting
to do that (especially those who are comfortable with scripting).

Cheers,
BKR

On Mon, Mar 26, 2018 at 4:57 PM, Francesco Pietra <chiendarret_at_gmail.com>
wrote:

> Hi Brian:
> Great!
>
> As my feeling is that with my system the lambda range should be divided
> into many segments,
> could you please check whether I expanded correctly (for the FEP purpose)
> your concatenation
> in case of four lambda segments
>
> 0.00 0.25
> 0.25 0.50
> 0.50 0.75
> 0.75 1.00
>
> cat frwd01.fepout > frwdA.fepout && cat frwd02.fepout >> frwdA.fepout
> cat frwd03.fepout > frwdB.fepout && cat frwd04.fepout >> frwdB.fepout
> cat frwdA.fepout > frwd.fepout && cat frwdB.fepout >> frwd.fepout
>
> Thanks
> francesco
>
>
> On Mon, Mar 26, 2018 at 6:57 PM, Brian Radak <brian.radak_at_gmail.com>
> wrote:
>
>> Hi again Francesco,
>>
>> It is definitely the intent of the developers to provide a
>> straightforward FEP platform that is usable by experts and non-experts
>> alike - your feedback will help us better achieve that, so thank you. I
>> certainly had some of the same issues you describe even before joining the
>> development team.
>>
>> You should definitely be able to divide the lambda range into multiple
>> segments and straightforwardly concatenate them in order for use with
>> ParseFEP (just as you described).
>>
>> I just tested the following with the latest versions of ParseFEP and NAMD
>> (but the 2.12 release should work the same):
>>
>> runFEP 0.0 0.5 0.25 <numsteps> ; # produces forw1.fep
>> runFEP 0.5 1.0 0.25 <numsteps> ; # produces forw2.fep
>> cat forw1.fep > forw.fep && cat forw2.fep >> forw.fep
>>
>> runFEP 1.0 0.5 0.25 <numsteps> ; # produces back1.fep
>> runFEP 0.5 0.0 0.25 <numsteps> ; # produces back2.fep
>> cat back1.fep > back.fep && cat back2.fep >> back.fep
>>
>> and then loaded forw.fep and back.fep into ParseFEP.
>>
>> This still does not, to my knowledge, permit restarts, but it will let
>> you run longer per lambda and make better use of parallel resources when
>> faced with limited wall times.
>>
>> HTH,
>> BKR
>>
>>
>> On Sun, Mar 25, 2018 at 5:24 AM, Francesco Pietra <chiendarret_at_gmail.com>
>> wrote:
>>
>>> In retrospect, I must admit that my question "why 50 lambda windows were
>>> not enough to reach quasi convergence for the Unbound case?" was a silly
>>> question. Reaching quasi convergence is a serious problem.
>>>
>>> I have now further tried with the Unbound by increasing both
>>> pre-equilibration and FEP steps, while shortening the number of windows in
>>> order to keep the calculations within the allowed 24hr at the cluster (more
>>> than one node of 36 cores is attended by poorer performance). I got better
>>> results for free energy: red line at -5.5, black line at -10.0. Again, it
>>> could be noticed that the system answers positively to more chances toward
>>> convergence.
>>>
>>> In my view, all that shows that present analysis tools do not allow
>>> carrying out FEP simulations with protein-ligand complexes of current real
>>> interest. Chance should be given of carrying out FEP simulations in steps,
>>> say lambda 0.0-0.5 0.5-1.0, or, very likely, even more fractionated. Finally
>>> concatenating the .fepout files. Unless one has unlimited time of execution
>>> at the cluster, which is not my case, but would be risky anyway of easily
>>> losing everything.
>>>
>>> One could object that "organic" ligands pose problems of compatibility
>>> with protein FF. However, I noticed that quasi convergence was reported by
>>> merely parameterizing the organic ligands with GAFF FF at semiempirical
>>> level (Chem. Sci., 2016, 7, 207). I believe that by fitting dihedrals and
>>> water interaction for charmm at HF/6-31G* level, I did better.
>>>
>>> I appreciated very much receiving by Brian "You can indeed, under
>>> certain circumstances, just concatenate output from multiple runs, but
>>> things must be in the right format for ParseFEP to detect it. That means
>>> you would have to sort and interleave the data at the same lambda value,
>>> probably strip out some of the comments, and then re-concatenate everything
>>> into a single file."
>>> I am prone to undertake such a job, because the alternative is
>>> abandoning the project for which I received a grant (what never happened to
>>> me before). However, I would need some more detailed instructions, perhaps
>>> a sketch of execution, or what else.
>>>
>>> Thanks
>>>
>>> francesco
>>>
>>>
>>> ---------- Forwarded message ----------
>>> From: Francesco Pietra <chiendarret_at_gmail.com>
>>> Date: Fri, Mar 23, 2018 at 9:31 PM
>>> Subject: Re: vmd-l: Fwd: ParseFEP for restating FEP
>>> To: Brian Radak <brian.radak_at_gmail.com>
>>> Cc: NAMD <namd-l_at_ks.uiuc.edu>, VMD Mailing List <vmd-l_at_ks.uiuc.edu>
>>>
>>>
>>> Hi Brian:
>>>
>>> (1)
>>>
>>>> running each lambda as its own NAMD job,
>>>
>>> As the very reason for that is to have more lambda windows than 24hr of
>>> cluster would allow with a standard execution, that strategy is simply
>>> impossible in the organization of the cluster I have access to. No short
>>> queues.
>>>
>>>
>>> (2)
>>>
>>>> concatenate output from multiple runs, but things must be in the right
>>>> format for ParseFEP to detect it.
>>>
>>> namd has a quite multiform audience, comprising also people that, like
>>> myself, have an experimental formation (biochemistry), while short in
>>> programming (and short of time to learn programming at a level for these
>>> affairs). Could namd consider to devote some time (probably no long time in
>>> the hands of experts) to allow semi-automatic concatenation of .fepout
>>> outputs from FEP simulations as currently carried out?
>>>
>>>
>>> (3)
>>>
>>> Another, probably much more serious problem, is why 50 lambda windows
>>> were not enough to reach quasi convergence for the Unbound case? The ligand
>>> is not small and quite complex, like diterpenoids are. However, I spent
>>> several weeks in parameterizing it with dihedral and water-interaction
>>> fitting. Both the ligand alone, and the complex with the protein, allow
>>> hundreds of ns of MD without any problem. I am aware that the FEP tutorial
>>> for ligand-protein was for a peptide ligand, declaring the wish to avoid
>>> the problems with "natural products". However, should the latter be not
>>> FEP-workable, a whole area of interest (think of pharmaceutical interests)
>>> would be out.
>>>
>>>
>>> I hope that namd team will take (2) above into serious consideration,
>>> without having to wait for namd 2.13.
>>>
>>> Thanks a lot for all your advice.
>>>
>>> francesco
>>>
>>>
>>>
>>>
>>> On Fri, Mar 23, 2018 at 4:48 PM, Brian Radak <brian.radak_at_gmail.com>
>>> wrote:
>>>
>>>> Unfortunately there is not currently an elegant solution for this. You
>>>> can indeed, under certain circumstances, just concatenate output from
>>>> multiple runs, but things must be in the right format for ParseFEP to
>>>> detect it. That means you would have to sort and interleave the data at the
>>>> same lambda value, probably strip out some of the comments, and then
>>>> re-concatenate everything into a single file.
>>>>
>>>> If you are running each lambda as its own NAMD job, then this should
>>>> not be too hard - just "grep -v #" the fepout files after the first one:
>>>>
>>>> Here's a bash example for N jobs (N-1 restarts) and M lambda values --
>>>> I've assumed a specific file naming scheme that should be pretty
>>>> transparent and mappable onto whatever you've done.
>>>>
>>>> rm -f all_lambda.fepout 2> /dev/null
>>>> for ((m=0; m<$M; m++))
>>>> do
>>>>   cat lambda${m}_job1.fepout >> all_lambda.fepout
>>>>   for ((n=2; n<=$N; n++))
>>>>   do
>>>>     grep -v "#" lambda${m}_job${n}.fepout >> all_lambda.fepout
>>>>   done
>>>> done
>>>>
>>>> all_lambda.fepout *should* work in ParseFEP, but I've never actually
>>>> tried something like this.
>>>>
>>>> I know this is not pretty and we're trying to improve things for NAMD
>>>> 2.13, especially with regards to automating parallel runs.
>>>>
>>>> Brian
>>>>
>>>>
>>>>
>>>> On Fri, Mar 23, 2018 at 10:00 AM, Francesco Pietra <
>>>> chiendarret_at_gmail.com> wrote:
>>>>
>>>>> Otherwise, is it possible to run portions of lambda and then
>>>>> concatenate the .fepout results?
>>>>> ---------- Forwarded message ----------
>>>>> From: Francesco Pietra <chiendarret_at_gmail.com>
>>>>> Date: Fri, Mar 23, 2018 at 8:05 AM
>>>>> Subject: ParseFEP for restating FEP
>>>>> To: NAMD <namd-l_at_ks.uiuc.edu>, VMD Mailing List <vmd-l_at_ks.uiuc.edu>
>>>>>
>>>>>
>>>>> Hello:
>>>>> May I ask whether there is any plan to make ParseFEP plugin capable of
>>>>> dealing with restarted FEP.
>>>>>
>>>>> With receptor-ligand, I found it difficult to get matching free energy
>>>>> for frwd/back in the 24 hr allowed on the cluster, taking into account
>>>>> that I used the max number of nodes for the given size of the system.
>>>>>
>>>>> Thus, for the UNBOUND simulations, with either SOS or BAR estimator,
>>>>> Probability Distribution was found to improve from 1 to 4 until good
>>>>> overlapping, while Free Energy also improves correspondingly, albeit not
>>>>> doing better than red at -2.50, black at -10.0. By restarting until 100
>>>>> windows, probably I should have acceptable convergence. Curiously
>>>>> Enthalpy/Entropy match better (artifact).
>>>>>
>>>>> I used fifty windows with 50,000 pre-equilibration and 300,000 FEP
>>>>> only (1 node), in order to stay within the 24hr. (a smaller number of
>>>>> windows, even with much pre-equilibration and FEP, performs worse).
>>>>> With the ligand-protein I could use a max of 4 nodes (beyond which no
>>>>> higher speed) with perhaps a max of 10 windows. Bad prospects for
>>>>> convergence.
>>>>>
>>>>> Thanks for your attention
>>>>>
>>>>> francesco pietra
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>