From: Ashar Malik (asharjm_at_gmail.com)
Date: Mon Jul 20 2020 - 16:06:03 CDT

Assuming all PDBs you want to loop over have a .pdb extension

set filelist [glob *.pdb]
set und _
foreach i $filelist {
        set PDBID [lindex [split $i .] 0]
        mol load pdb $i
        set all [atomselect top all]
        set chains [lsort -dict -unique [$all get chain]]
        foreach j $chains {
                set sel [atomselect top "protein and chain $j and same
residue as within 5 of resname LIG"]
                # You may have many HET codes for many ligands present in
you structure - and may have to have another nested loop to iterate over
all ligands HET codes.
                $sel writepdb $PDBID$und$j.pdb
                delete $sel
        }
        delete $all
}

glob should make a list of file names in filelist which you can then
iterate over using "foreach"
Within each PDB there might be 1 or more chains - so find them by making a
selection of everything and then from that calling chain and storing that
list in the chains variable using: set chains [lsort -dict -unique
[[atomselect top all] get chain]]

Then you can iterate over each chain ...
If you already know the hetcode ID for ligand - you can use that in set sel
[atomselect top "protein and chain $j and same residue as within 5 of
resname LIG"]

but if you don't you can find those out using ...

set HET [lsort -dict -unique [[atomselect top all] get resname]]

The HET list will have a list of HET codes --- ~20 of them would be
standard amino acids - you might get a few modified amino acids like MSE
etc and the rest can be assumed to be HET codes for ligands.

Another way to get HET codes of ligands is by using a shell command
something like

grep for the pattern "^HETATM" over all PDBs -- this output can then be
parsed by python or tcl and a file can be written which can be read by
the vmd TCL script to quickly work out the ligands codes for each chain ...

I see that you are on windows so you can either get a shell emulator or use
the ubuntu shell that now comes bundled in windows or go with the earlier
option of finding the HET codes of non-amino acids - which will require 1
line I write above and some post processing to shorten that list to contain
only ligands HET codes and not amino acid HET codes.

Unless you have infinite memory - don't forget to delete the selections you
have made.

The output would not be:
> Receptor1.pdb --> Receptor1 Partial Pocket.pdb
> Receptor2.pdb --> Receptor2 Partial Pocket.pdb
> Receptor3.pdb --> Receptor3 Partial Pocket.pdb

but instead would be (Assuming 2 PDBs Receptor1 and Receptor2 with
Receptor1 having 2 chains A and B both with 1 ligand only and Receptor2
having only one chain A with one ligand only)

> Receptor1.pdb --> Receptor1 Chain A Partial Pocket.pdb
> Receptor1.pdb --> Receptor1 Chain B Partial Pocket.pdb
> Receptor2.pdb --> Receptor3 Chain A Partial Pocket.pdb

or if you add another nested loop it will have to be something like
(Assuming 2 PDBs Receptor1 and Receptor2 with Receptor1 having 2 chains A
and B both with chain A having 2 ligands LA and LB and chain B having only
one ligand LA and Receptor2 having only one chain A with one ligand LC
only)

> Receptor1.pdb --> Receptor1 Chain A Ligand LA Partial Pocket.pdb
> Receptor1.pdb --> Receptor1 Chain A Ligand LB Partial Pocket.pdb
> Receptor1.pdb --> Receptor1 Chain B Ligand LA Partial Pocket.pdb
> Receptor2.pdb --> Receptor2 Chain A Ligand LC Partial Pocket.pdb
(You would have to put the code line in for this - if you LIG is not your
only ligand and you have 1 or more ligands per chain - but it should be
straightforward as you can use the rest of the code as a template to
construct this loop).

Hopefully you can work out the rest from here. If you have difficulty write
back.

On Tue, Jul 21, 2020 at 12:41 AM William Howe <howew_at_mail.gvsu.edu> wrote:
>
> Hi, I'm new to TCL language, but have some experience with Python. I
> am wondering what would the TCL script be to iterate over a directory
> containing PDB files and create a binding pocket for each file
> iterated over.
>
> I know in VMD Graphical Representation I could use the atomselect
> command 'protein and within 4 of resname LIG', and then save
> coordinates to create a pdb file of the pocket but I would like to
> automate this process for several dozen pdb files I am working with.
>
> Say I have a Folder of PDB Files, here is the input and the desired
> output(output name should be in the same format):
> Receptor1.pdb --> Receptor1 Partial Pocket.pdb
> Receptor2.pdb --> Receptor2 Partial Pocket.pdb
> Receptor3.pdb --> Receptor3 Partial Pocket.pdb
>
> Heres the directory I'm Working From:
> C:\Users\Will\Desktop\School Junk\GABA Project\GABA Structures
>
> Within this folder are several subfolders and each subfolder has
> several other subfolders.
>
>
> Sorry for the long post, this is my first one, any help is greatly
appreciated.
>

--
Best,
/A