• ## Outreach

From: Josh Vermaas (vermaas2_at_illinois.edu)
Date: Wed Apr 01 2015 - 15:51:35 CDT

Hi Charles,

One optimization I would make is to only consider contacts under some
threshold for analysis. "measure contacts 10 \$sel1 \$sel2" would get you
two lists per frame of things that are even remotely close together,
which is liable to be much faster than your current approach, at the
cost of no longer having "nearest" distances that are far away from one
another.

If you want a full matrix, another optimization would be to exploit the
fact that the matrix is symmetric to reduce the number measure bond
calls by 2 by saying that the ij'th element is the same as the ji'th
element. Alternatively, you could also decide that the distance between
the center of masses is much more interesting, which would save you alot
of permutations that you no longer need to check.

-Josh Vermaas

On 04/01/2015 02:42 PM, Charles McAnany wrote:
> Friends,
> I'm analyzing a set of trajectories based on nearest-atom contact
> maps: for each pair of residues (the x and y axes), what is the
> distance between the closest atoms on those residues? (hydrogens
> excluded, of course) I'm doing this because I'm interested in the
> side-chain interactions, so just looking at CA distances might miss
> something cool.
>
> My contact map has two parts: one part shows the closest two residues
> get during the entire trajectory, essentially representing what parts
> of the protein can interact. This is easy to calculate quickly by
> using 'measure bond' and specifying the entire trajectory for the
> frame range, then getting the minimum value of the returned list.
>
> The other part shows the average closest-atom distance over the
> trajectory, representing the parts of the protein that spend lots of
> time interacting.
>
> For the life of me, I can't figure out how to efficiently write the
> second analysis. The awkward part is that the outer loop needs to be
> the frame number, since I'm averaging a property of each frame, not of
> each atom. This means lots and lots of calls to 'measure bond' and
> painfully long analysis times.
>
> Here's the problem area of what I have so far:
> #Just a very large number, this will go away when compared with a
> real distance.
> set closestAtomsThisFrame 1E30
> foreach atom1 [\$residue1 list] {
> foreach atom2 [\$residue2 list] {
> set thisDist [measure bond [list \$atom1 \$atom2 ] frame
> if {\$thisDist < \$closestAtomsThisFrame} {
> set closestAtomsThisFrame \$thisDist
> }
> }
> }
> set averageContact [expr { \$averageContact +
> \$closestAtomsThisFrame / \$numFrames }]
> if {\$closestAtomsThisFrame < \$minContact} {
> set minContact \$closestAtomsThisFrame
> }
> }
>
> Does anyone have advice on getting tcl to perform better on this sort
> of analysis?
>
> Cheers,
> Charles.