From: Tristan Croll (tristan.croll_at_qut.edu.au)
Date: Sun Feb 08 2015 - 01:57:33 CST

Hi all,

I've just finished making a bunch of modifications to AutoPSF (and squishing a few bugs along the way). Mostly for my own purposes, but I'm happy to share with anyone who's interested. While it's too big to share on-list, I've tried to keep a fairly comprehensive list of changes in the header (copied below). Please drop me a line if you'd like to try it out. I think it's all working correctly now, but

Cheers,

Tristan

## 1.41 Modifications by Tristan Croll, 8 Feb 2015: tristan.croll_at_qut.edu.au
## - Segment naming system modified so that chain names are maintained. The new scheme is
## {chain}{type}nn, where {type} can be P for protein, W for water, G for glycan,
## N for nucleic or O for other.
## - Will now correctly handle non-linearities in protein chain numbering
## (insertions/deletions). If the resID takes a non-unit step while
## working through a protein chain, an explicit geometric check is
## carried out to determine whether this is a real break.
## - Topologies updated to CHARMM-36. You will need to download the CHARMM-36 topology
## package from http://mackerell.umaryland.edu/charmm_ff.shtml, unpack to a
## directory of your choice, and add the following line either to your .vmdrc file
## or pkgIndex.tcl:
##
## set env(CHARMMFFDIR) "/path/to/CHARMM"
##
## Note: depending on your version of VMD and NAMD you may need to replace
## toppar_water_ions.str with the toppar_water_ions_namd.str available from the
## MacKerell lab.
##
## At present I load the following topology files:
## - top_all36_prot.rtf
## - top_all36_lipid.rtf
## - top_all36_na.rtf
## - top_all36_carb.rtf
## - top_all36_cgenff.rtf
## - stream/toppar_all36_carb_glycopeptide.str
## - toppar_water_ions.str
##
## This is sufficient to allow all the NBFIX terms at the end of toppar_water_ions.str
## to be uncommented without causing a fatal psfgen error. I highly recommend you do this.
## - Regenerate angles dihedrals (variable regenall) is now on by default. This is needed
## because CHARMM-36 no longer defines these in its patches.
## - Will now handle the basic N-linked glycans (NAG, MAN, BMA, FUC), and addition of other
## glycan residues should be straightforward. This was surprisingly challenging, since unlike
## protein chains, there is no requirement for glycan residues to be grouped in any logical
## arrangement (or even grouped at all!) in a PDB file. Since the main autopsf proc requires
## each segment to be a contiguous block of indices, I had to implement a new proc,
## ::autopsf::preformat_pdb, that breaks the input pdb file down into protein, water, ions,
## glycans and others, sorts and names the glycans into bonded groups, and writes a new input
## PDB file. All sugar residues that are bonded to each other should now fall into their own
## segment.
## Some caveats:
## - You will of course need to be using the CHARMM-36 or later forcefield,
## and have both top_all36_carb.rtf and stream/toppar_all36_carb_glycopeptide.str
## in your topology list.
## - For the time being I have stuck with the traditional 4-character resname limit. This
## means some editing of top_all36_carb.rtf is needed since it's done away with this restriction.
## MAN, BMA and FUC residues are fine as-is (AMAN, BMAN and AFUC in the CHARMM scheme),
## but NAG has become BGLCNAC. I've renamed it to BGLN, which doesn't seem to conflict
## with anything.
## - You will need a new atomselect macro. You could define it here, but I find it far more
## useful to define it in the .vmdrc:
##
## atomselect macro glycan {resname NAG BGLN FUC AFUC MAN AMAN BMA BMAN}
##
## This will of course need to be expanded if you wish to add other glycan residues. Adding others
## will also require updating the pdbalias list in ::autopsf::psfaliases, the regexp entry in
## ::autopsf::split_protein_and_water_pdb, and entries in ::autopsf::make_glycan_patches. In
## the latter, you need to determine which oxygens sit in the alpha (axial) arrangement on your
## residue, and add your resname to the regexp list for those oxygens.
##
##
## - A few bugs have also been caught and fixed:
##
## - The ::autopsf::find_ssbonds proc had an error in its atomselection that would lead to it failing
## to generate disulfides between cysteine residues with the same resid on different chains.
## - Concatenation of the topology files into a temporary topology was incompatible with the use of
## multiple stream files. Since each stream file contains a return statement, everything past that point
## would become invisible. This was simply fixed by stripping out all return lines from the concatenated
## topology file.
## - In the situation where the input PDB file contains unrecognized residues leading to activation of
## the Autopsf Component chooser, clicking the "Rerun psfgen" button would always lead to the final
## structure being stripped back to only protein or nucleic acid residues. The culprit here is the
## ::autopsf::runpsfgen proc, which contains:
##
## if {$oseltext != ""} {
## set ofrag true
## set allfrag false
## }
##
## Since the AutoPSF GUI sets oseltext to "protein or nucleic" on startup, this would always
## be triggered. The simple workaround was to include
##
## if {!$ofrag} {
## set oseltext ""
## }
##
## in ::autopsf::rerunpsfgen just prior to calling runpsfgen.
## - Going through rerunpsfgen was also doubling up on all automatically generated patches
## (which would include disulfides, CYSD entries and now glycosidic linkages), leading to
## an immediate crash in NAMD when trying to run a simulation. This occurs because
## the set of patches generated by the first GUI run-through is stored in $patchlist (these
## of course have to be saved to carry through any user-defined patches). These are applied in
## ::autopsf::psfmain, but psfmain also has to be able to run the automatic patch generation
## for situations where it's run from the command line. To get around this I defined a new
## boolean variable, ispatched, which is false by default and set to true the first time
## patches are auto-generated, so that later passes skip automatic patch generation.
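
For anyone curious how the ispatched guard behaves, here is a minimal stand-alone sketch in plain Tcl. It is not the AutoPSF code: the ::sketch namespace, the autogen_patches proc and the DISU entry with its AP1 segment name are all made up for illustration, and only the names ispatched, patchlist and psfmain come from the changelog above.

```tcl
namespace eval ::sketch {
    variable ispatched false   ;# false until patches have been auto-generated
    variable patchlist {}      ;# patches carried over between psfgen runs
}

# Hypothetical stand-in for the automatic patch detection; the real plugin
# would look for disulfides, glycosidic linkages and so on.
proc ::sketch::autogen_patches {} {
    return {{DISU AP1:22 AP1:87}}
}

proc ::sketch::psfmain {} {
    variable ispatched
    variable patchlist
    if {!$ispatched} {
        # First pass only: add the automatically detected patches.
        foreach p [autogen_patches] {
            lappend patchlist $p
        }
        set ispatched true
    }
    # Every pass works from patchlist (automatic plus user-defined entries).
    return $patchlist
}

::sketch::psfmain   ;# first run, e.g. from the GUI
::sketch::psfmain   ;# second run, e.g. via "Rerun psfgen"
puts [llength $::sketch::patchlist]
```

After two passes the list still holds a single entry, so no patch is applied twice. The same proc also works for a single command-line run, since the flag starts out false.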
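
The stream-file fix mentioned in the bug list (stripping return lines from the concatenated topology) can be sketched roughly as follows. This is only an illustration under my own assumptions: the proc name strip_return_lines is hypothetical, a "return statement" is taken to be a line holding nothing but the word return in any case, and the sample text is invented rather than taken from a real topology file.

```tcl
# Drop stand-alone "return" lines so that anything concatenated after a
# stream file remains visible to the topology reader.
proc strip_return_lines {text} {
    set kept {}
    foreach line [split $text "\n"] {
        # Assumption: a return statement is a line containing only the
        # word "return" (any case), optionally padded with whitespace.
        if {[regexp -nocase {^\s*return\s*$} $line]} {
            continue
        }
        lappend kept $line
    }
    return [join $kept "\n"]
}

# Invented sample: two "files" joined together, each ending in return.
set combined "RESI AAA 0.00\nRETURN\nRESI BBB 0.00\nreturn\nEND"
puts [strip_return_lines $combined]
```

Matching whole lines only means a residue or atom name that merely contains the letters "return" is left alone.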