AutoPSF Technical Documentation

AutoPSF provides a streamlined interface for building complete structures for molecular dynamics simulations. To save work on the part of the user, many assumptions are made by the program to allow structure building with a minimum of user input. The procedures used by AutoPSF to determine how the structure should be built are detailed below.

Steps in an AutoPSF run

AutoPSF runs roughly the same sequence of procedures in GUI or command line mode; in GUI mode, though, there is an opportunity for user input after stages 1 and 3.

  1. Check for valid input files
  2. Run checkpsf to produce a valid composite topology file
  3. Split the input structure into separate chains using AutoPSF's built-in criteria
  4. Load the temporary topology file
  5. Apply AutoPSF's default aliases
  6. Properly alter the protonation state of histidine residues near metal ions
  7. Build the identified chains using psfgen
  8. Modify cysteine residues with disulfide patches or hydrogen removal patches if they are close enough to another cysteine or a metal ion
  9. Ensure that there are no unparameterized residues in the structure, or provide the user with steps to deal with them if there are
  10. Write the final output psf and pdb files

Details of the AutoPSF procedure

Checkpsf

Prior to building a structure, psfgen will run the internal program checkpsf on the input structure and topology files. This program first combines all of the input topology files into one (temporary) monolithic topology file, and then ensures that every residue present in the input structure is defined in the specified topology files. If there are residues missing, a set of dummy entries is added to the temporary topology file that will allow psfgen to run. These dummy entries are flagged so that these unparameterized fragments will be detected later.

Criteria for splitting chains

AutoPSF attempts to properly guess the chain boundaries in the input structure; note that this is critical both for distinguishing protein chains from each other (so that they are properly named and terminated) and for separating protein, nucleic acid, and other components of the system. The first residue of the structure (obviously) begins a new chain; after that point, chain breaks are placed at any position where one of the following occurs:

  • The segid of a residue is different from the segid of the previous residue
  • The previous residue was a protein residue (identified by one of the three letter designations of the 20 standard amino acids) and the current residue is not
  • The previous residue was a nucleic acid residue (identified by having resname GUA,CYT,THY,ADE, or URA) and the current residue is not
  • The current residue is protein or nucleic acid and the previous residue was not of the same type
  • The resid of the current residue is more than one greater than the resid of the previous residue

In addition, AutoPSF will assign start and end patches to each chain. Nucleic acid chains are assigned 5PHO and 3TER patches, protein chains are assigned NTER and CTER patches (with GLYP or PROP used as the first patch if the N terminal residue is glycine or proline, respectively), and other chains are assigned NONE for their first and last patches.

Although these steps will correctly predict the chain breakdown for many common proteins, they are not foolproof; most notably, the presence of nonstandard amino acids or bases, missing residues, or unusual numbering can lead to errors. It is generally preferrable to use the AutoPSF gui in any case where this might be an issue, as the gui provides an interface for editing the chain boundaries and patches set by AutoPSF.

Default Aliases

AutoPSF will apply a number of psfgen aliases to correct common clashes between PDB and CHARMM nomenclature. This is generally safe, although it could in theory cause a problem if you want to use a residue with a name that clashes with the PDB name for something fairly common. The alias lines applied by AutoPSF are as follows:

  pdbalias residue G GUA
  pdbalias residue C CYT
  pdbalias residue A ADE
  pdbalias residue T THY
  pdbalias residue U URA

  foreach bp { GUA CYT ADE THY URA } {
     pdbalias atom $bp "O5\*" O5'
     pdbalias atom $bp "C5\*" C5'
     pdbalias atom $bp "O4\*" O4'
     pdbalias atom $bp "C4\*" C4'
     pdbalias atom $bp "C3\*" C3'
     pdbalias atom $bp "O3\*" O3'
     pdbalias atom $bp "C2\*" C2'
     pdbalias atom $bp "O2\*" O2'
     pdbalias atom $bp "C1\*" C1'
  }

  pdbalias atom ILE CD1 CD
  pdbalias atom SER HG HG1
  pdbalias residue HIS HSD

# Heme aliases
  pdbalias residue HEM HEME
  pdbalias atom HEME "N A" NA
  pdbalias atom HEME "N B" NB
  pdbalias atom HEME "N C" NC
  pdbalias atom HEME "N D" ND

# Water aliases
  pdbalias residue HOH TIP3
  pdbalias atom TIP3 O OH2

# Ion aliases
  pdbalias residue K POT
  pdbalias atom K K POT
  pdbalias residue ICL CLA
  pdbalias atom ICL CL CLA
  pdbalias residue INA SOD
  pdbalias atom INA NA SOD
  pdbalias residue CA CAL
  pdbalias atom CA CA CAL

# Other aliases
  pdbalias atom LYS 1HZ HZ1
  pdbalias atom LYS 2HZ HZ2
  pdbalias atom LYS 3HZ HZ3

  pdbalias atom ARG 1HH1 HH11
  pdbalias atom ARG 2HH1 HH12
  pdbalias atom ARG 1HH2 HH21
  pdbalias atom ARG 2HH2 HH22

  pdbalias atom ASN 1HD2 HD21
  pdbalias atom ASN 2HD2 HD22

Treatment of Histidines

Histidine residues tend to require special treatment because of their ability to take on any of three protonation states (delta protonated, epsilon protonated, or the charged form with both nitrogens protonated). In the current version of AutoPSF, all histidines are assigned delta protonation, although an interface to change this will be added in the near future. Any histidines bonded to metal ions will have their protonation state adjusted to keep the protion away from the metal.

Treatment of Cysteines

As with histidine, cysteine residues require special treatment because they can form disulfide bonds with other cysteines, and may also lost their sulfur hydrogen if they are involved in ligation of a metal. AutoPSF will automatically form a disulfide bond between any cysteine pair where the sulfur atoms are within three angstroms of each other (this is accomplished by putting a DISU patch on these residues). In addition, any cysteine bonded to a metal will be deprotonated, since it is likely involved as a ligand.

Unparameterized residues

AutoPSF will assume that any residue not defined in the input topology files is unparameterized. When a structure containing such residue is built, AutoPSF will detect the unparameterized fragment and display a dialogue box detailing possible solutions.