From: Josh Vermaas (vermaas2_at_illinois.edu)
Date: Tue Aug 19 2014 - 19:30:45 CDT

Hi Diego,

Basically, you need a loop, but you'll also need to be a bit careful,
because some of the operations can run you out of memory if you don't
clean up after them. Here is the basic flow:

set pdblist ... ;# This would be a command to put all 5200 PDB IDs into a
# list that we'll iterate through. You probably have them in a file
# somewhere, so look up how to read a file into a list.
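
For the "set pdblist ..." step, here is a minimal sketch, assuming the IDs sit one per line in a plain text file. The proc name read_id_list and the file name pdbids.txt are made up for illustration, so adjust them to whatever you actually have.

```tcl
# Hypothetical helper: read a text file with one PDB ID per line
# and return the IDs as a Tcl list. Blank lines are skipped and
# surrounding whitespace is trimmed.
proc read_id_list {path} {
    set fin [open $path r]
    set data [read $fin]
    close $fin

    set ids {}
    foreach line [split $data "\n"] {
        set id [string trim $line]
        if {$id ne ""} {
            lappend ids $id
        }
    }
    return $ids
}

# Example use (pdbids.txt is an assumed file name):
#   set pdblist [read_id_list "pdbids.txt"]
```

This reads the whole file at once, which is fine for a few thousand short lines.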

set fout [open "outputfile.dat" "w"]
#Now loop through all of them
foreach pdb $pdblist {
     mol new $pdb
     mol ssrecalc top; #Calculate the secondary structure.
     # Make an atomselection that will contain what you want: prolines
     # that have structure E (a beta strand).
     set sel [atomselect top "resname PRO and structure E"]
     # Only true if the atomselection isn't empty.
     if {[$sel num] > 0} {
         puts $fout $pdb
     }
     $sel delete ;# Delete the atomselection
     mol delete top ; #Delete the molecule when you are done
}
close $fout

Run it, and you should find an outputfile.dat in your current working
directory that lists the PDBs that have a proline in a beta strand. Or
it might crash terribly when STRIDE (which is what assigns secondary
structure) fails. Your mileage may vary. :)
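
If a crash partway through 5200 structures is a worry, one option is to wrap the per-structure work in Tcl's catch so a single failure gets recorded instead of stopping the loop. This is only a sketch: scan_items and has_strand_proline are names I made up, it relies on {*} expansion (Tcl 8.5 or newer), and it assumes the failure surfaces as a Tcl error rather than taking VMD down with it.

```tcl
# Hypothetical helper: call testCmd on every item and sort the items into
# those where it returned true (hits) and those where it raised an error.
proc scan_items {items testCmd} {
    set hits {}
    set failed {}
    foreach item $items {
        if {[catch {{*}$testCmd $item} result]} {
            lappend failed $item   ;# Remember the failure and keep going
        } elseif {$result} {
            lappend hits $item
        }
    }
    return [list $hits $failed]
}

# Hypothetical VMD-side test: 1 if the structure has a proline in a beta
# strand, 0 otherwise. The molecule is deleted even if something fails.
proc has_strand_proline {pdb} {
    set molid [mol new $pdb]
    set found 0
    set failed [catch {
        mol ssrecalc $molid
        set sel [atomselect $molid "resname PRO and structure E"]
        set found [expr {[$sel num] > 0}]
        $sel delete
    } msg]
    mol delete $molid
    if {$failed} {
        error $msg
    }
    return $found
}

# Example use inside VMD:
#   lassign [scan_items $pdblist has_strand_proline] hits failed
```

The failed list tells you which IDs to look at by hand afterwards.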

-Josh Vermaas

On 8/19/14, 7:21 PM, Diego Granados wrote:
> Hi! I'm trying to do something very basic, but I'm clueless about it.
> At the moment I have a list of ~5200 proteins (I have their PDB
> IDs) and I would like to know which of them have at least one proline
> in their beta sheets. I guess a little script in Tcl could work, but I
> can't figure out how to start.
>
> Do you have any ideas about this? Do you think it is feasible, or do I
> need to try another tool? Which one?
>
> Thanks a lot!
>
> Diego.