From: Ashar Malik (asharjm_at_gmail.com)
Date: Tue Feb 02 2021 - 19:22:51 CST

> I have a TCL script that creates a PDB file meant to list all protein
> residues within 4 angstroms of the ligand.
>

> set filelist [glob *.pdb]
> set und " Partial Pocket"
> foreach i $filelist {
> set PDBID [lindex [split $i .] 0]
> mol load pdb $i
> set sel [atomselect top "protein and within 4 of resname UNK"]
> $sel writepdb $PDBID$und.pdb
> }

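To write out all atoms of each residue that has any atom within the cutoff,
rather than only the atoms that fall inside it, modify the selection

"protein and within 4 of resname UNK"

by wrapping the distance test in "same ? as", where ? is a keyword such as
residue, for example

"protein and same residue as (within 4 of resname UNK)"

See here for details: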

https://www.ks.uiuc.edu/Research/vmd/vmd-1.3/ug/node142.html
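
As a rough sketch, your loop with the modified selection could look like the
following. The output name "_pocket" is only a placeholder I picked (it also
avoids the space in the file name), and I use "mol new" and "mol delete" so
that molecules do not pile up between iterations. I have not run this on
your files.

```tcl
# Sketch: write whole residues near the ligand for every PDB in the directory.
# Assumes the ligand residue is named UNK in every file.
set filelist [glob *.pdb]
foreach i $filelist {
    # Strip the extension to get the base name, e.g. 1abc.pdb -> 1abc
    set PDBID [file rootname $i]
    mol new $i type pdb waitfor all
    # "same residue as" expands the distance hit to complete residues
    set sel [atomselect top "protein and same residue as (within 4 of resname UNK)"]
    # Output file name is a placeholder choice
    $sel writepdb "${PDBID}_pocket.pdb"
    $sel delete
    mol delete top
}
```

Note that if you rerun this in the same directory, glob will also pick up the
pocket files written by the previous run.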

> However, I noticed that the PDB files generated by VMD typically lists the
> protein atoms first and then the ligand atoms.
>

The input PDB that you give VMD lists the protein first and the ligand later.
The output, I assume, follows the atom index: if the ligand atoms had lower
index values than the protein atoms, the output file would have the
arrangement you are looking for.
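
A quick way to check the ordering in a loaded molecule is to compare the
first index of each selection. This is only a sketch, assuming the ligand is
named UNK and the molecule of interest is "top".

```tcl
# Sketch: compare where the ligand and the protein start in the index order.
set lig  [atomselect top "resname UNK"]
set prot [atomselect top "protein"]
puts "first ligand index:  [lindex [$lig get index] 0]"
puts "first protein index: [lindex [$prot get index] 0]"
$lig delete
$prot delete
```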

> Is there a way to generate a PDB file that lists the ligand atoms first,
> and then the protein atoms?
> What can I modify to my TCL script to accomplish this?
>

1. Use "get" on your resname UNK selection to get the index values of all
atoms in UNK.
2. Use "get" on your binding pocket selection to get the index values of all
amino acid atoms in that selection.
3. Use tcl commands to modify the two lists of index values so that the UNK
atoms have smaller index values than the protein atoms.
4. Use "set" and a for loop to update the index values in your selections to
the values you calculated in step 3.

Then write out.

This is how I would consider doing it, although I don't understand why you
would want to do this in the first place.
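
A different route, in case "set" does not let you overwrite index values
(I am not certain that it does), is to write the ligand and the pocket to
separate files and then join them with plain tcl file commands, ligand
first. The sketch below assumes a molecule is already loaded as "top" and
that the ligand is named UNK. All file names are placeholders.

```tcl
# Sketch: ligand-first PDB built by concatenating two writepdb outputs.
set lig    [atomselect top "resname UNK"]
set pocket [atomselect top "protein and same residue as (within 4 of resname UNK)"]

# Write each selection to its own temporary file (placeholder names)
$lig    writepdb lig_tmp.pdb
$pocket writepdb pocket_tmp.pdb

# Copy only the atom records, ligand file first, into the combined file
set out [open combined.pdb w]
foreach part {lig_tmp.pdb pocket_tmp.pdb} {
    set in [open $part r]
    while {[gets $in line] >= 0} {
        if {[string match "ATOM*" $line] || [string match "HETATM*" $line]} {
            puts $out $line
        }
    }
    close $in
}
puts $out "END"
close $out

# Clean up selections and temporary files
$lig delete
$pocket delete
file delete lig_tmp.pdb pocket_tmp.pdb
```

The atom serial numbers are copied as written, so they are not renumbered
across the two parts. If a downstream program cares about unique serials,
that would need an extra step.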

Also note that there is an underlying assumption in your code that there is
only 1 residue named UNK. You may get a scenario where there are two or
more UNK in different places around the protein, in which case you will get
two or more distinct sets of binding pockets merged into one file with no
way to tell them apart, except maybe by using the distance criterion again. To
avoid this, add the specific resid of the UNK to your selection so that it
selects only one UNK at a time when more than one is present, and change the
name of the output file to reflect that.
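
Something along these lines might do it. This is a sketch that assumes
PDBID is set as in your script, that the UNK resids are non-negative, and
that no two UNK copies share a resid (for example on different chains). The
output naming is only a suggestion.

```tcl
# Sketch: one pocket file per UNK copy, keyed by resid.
set lig [atomselect top "resname UNK"]
# One entry per distinct ligand resid
set resids [lsort -unique -integer [$lig get resid]]
$lig delete

foreach r $resids {
    set sel [atomselect top \
        "protein and same residue as (within 4 of (resname UNK and resid $r))"]
    # Put the ligand resid in the file name so the pockets stay distinguishable
    $sel writepdb "${PDBID}_UNK${r}_pocket.pdb"
    $sel delete
}
```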

Note: I haven't done this myself. This is just a guess. You can try this on
a small test case PDB with say 2 protein residues listed on top and then 1
ligand residue listed at the bottom and see if it works.