From: P.-L. Chau (pc104_at_pasteur.fr)
Date: Thu May 17 2012 - 03:19:50 CDT

Could I ask about using VMD to split a protein and associated HETATM into
different PDB files to be used by psfgen to generate files for NAMD
simulations, please?

I adapted the following script from section 2.1 of the psfgen user's guide
to split 2bg9.pdb into its constituent subunits, and it worked well:

package require psfgen
mol load pdb test.pdb
set protein [atomselect top protein]
set chains [lsort -unique [$protein get pfrag]]
foreach chain $chains {
    set sel [atomselect top "pfrag $chain"]
    $sel writepdb nachrchol_${chain}.pdb
}
exit

I would like to extend this work. I obtained a PDB file from a colleague
where there are cholesterol molecules in addition to the 2bg9 protein. I
would like to split the file into the protein bits and cholesterol
molecules. I used this same script by executing "vmd -dispdev text -e" on
the protein+cholesterol system.

However, the procedure crashed with this result:

[...]
Info) Atoms: 15608
Info) Bonds: 16145
Info) Angles: 0 Dihedrals: 0 Impropers: 0 Cross-terms: 0
Info) Bondtypes: 0 Angletypes: 0 Dihedraltypes: 0 Impropertypes: 0
Info) Residues: 1895
Info) Waters: 0
Info) Segments: 1
Info) Fragments: 22 Protein: 14 Nucleic: 0
0
atomselect0
-1 0 1 10 11 12 13 2 3 4 5 6 7 8 9
ERROR) syntax error
atomselect: cannot parse selection text: pfrag -1
Info) VMD for MACOSXX86, version 1.9 (March 14, 2011)
Info) Exiting normally.

Here one pfrag has been assigned the value -1, and the selection text
"pfrag -1" could not be parsed. To test the script on a simpler, standard
system, I downloaded 3dfr.pdb, which is dihydrofolate reductase with NADPH
and methotrexate bound. I used the same script to split the protein and the
two HETATM molecules, and I obtained this:

Info) Atoms: 1639
Info) Bonds: 1416
Info) Angles: 0 Dihedrals: 0 Impropers: 0 Cross-terms: 0
Info) Bondtypes: 0 Angletypes: 0 Dihedraltypes: 0 Impropertypes: 0
Info) Residues: 428
Info) Waters: 264
Info) Segments: 1
Info) Fragments: 267 Protein: 2 Nucleic: 0
0
atomselect0
0 1
Info) Opened coordinate file nachrchol_0.pdb for writing.
Info) Finished with coordinate file nachrchol_0.pdb.
Info) Opened coordinate file nachrchol_1.pdb for writing.
Info) Finished with coordinate file nachrchol_1.pdb.
Info) VMD for MACOSXX86, version 1.9 (March 14, 2011)
Info) Exiting normally.

Only two molecules were written out, when there should be three. I found
that nachrchol_0.pdb was methotrexate and nachrchol_1.pdb was the protein.
NADPH was nowhere to be found.
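
One possible way to capture the non-protein molecules as well, offered as a
sketch rather than a tested solution: the log above reports 267 fragments and
264 waters, which would leave three non-water fragments if each water is its
own fragment. The script below therefore loops over VMD's "fragment" keyword
instead of "pfrag". The proc name split_by_fragment, its parameters, and the
frag_N.pdb output names are hypothetical, and it assumes VMD's guessed bonds
keep each ligand in a single fragment.

```tcl
# Sketch only: write every non-water fragment of a molecule to its own PDB.
# molid and prefix are hypothetical parameters chosen for this example.
proc split_by_fragment {molid prefix} {
    set all [atomselect $molid "not water"]
    set frags [lsort -unique -integer [$all get fragment]]
    $all delete
    foreach frag $frags {
        set sel [atomselect $molid "fragment $frag"]
        # Report which residue names ended up in this fragment
        puts "fragment $frag: [lsort -unique [$sel get resname]]"
        $sel writepdb ${prefix}_${frag}.pdb
        $sel delete
    }
    return $frags
}

# Run the VMD-specific part only when VMD's atomselect command is available
if {[llength [info commands atomselect]] > 0} {
    set molid [mol new test.pdb type pdb waitfor all]
    split_by_fragment $molid frag
}
```

The printed residue names should make it easy to see which file holds the
protein and which hold the ligands.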

I also ran the protein+cholesterol system on the Tk console interactively,
and found that the stumbling block is this line:

  set chains [lsort -unique [$protein get pfrag]]
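
A minimal sketch of one way to get past the "pfrag -1" error, assuming that
-1 marks atoms matched by "protein" that are not assigned to any protein
fragment (this is an assumption, not something confirmed here): drop negative
values before building the selection text. The helper name valid_pfrags is
hypothetical. Atoms with pfrag -1 are simply skipped, so they would still
need to be written out some other way.

```tcl
# Keep only non-negative fragment indices, sorted numerically.
# valid_pfrags is a hypothetical helper introduced for this sketch.
proc valid_pfrags {values} {
    set keep {}
    foreach v [lsort -unique -integer $values] {
        if {$v >= 0} { lappend keep $v }
    }
    return $keep
}

# Run the VMD-specific part only when VMD's atomselect command is available
if {[llength [info commands atomselect]] > 0} {
    mol new test.pdb type pdb waitfor all
    set protein [atomselect top protein]
    set chains [valid_pfrags [$protein get pfrag]]
    $protein delete
    foreach chain $chains {
        set sel [atomselect top "pfrag $chain"]
        $sel writepdb nachrchol_${chain}.pdb
        $sel delete
    }
}
```

Sorting with -integer also puts the indices in numeric order (0 1 2 ... 13)
rather than the string order seen in the log above.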

Could I ask if more experienced users would be able to advise me on how to
solve this problem, please?

Thank you very much!

P-L Chau

email: pc104_at_pasteur.fr
Bioinformatique Structurale
CNRS URA 2185
Institut Pasteur
75724 Paris
France
tel: +33 1 45688546
fax: +33 1 45688719