From: Patrick Charchar (patrick.charchar_at_rmit.edu.au)
Date: Tue Jul 26 2016 - 23:01:01 CDT

Hello,
I have an analysis script that takes a very long time to run, and I was
wondering whether anyone has advice on making it more efficient or
running it in parallel.

Essentially, in bash I use the following command to load a pdb structure
and xtc trajectory (with approx 10,000 - 200,000 frames) into VMD and
execute my tcl script:

vmd -pdb ${pdb}.pdb -xtc ${xtc}.xtc -dispdev text < ${script}.tcl

The tcl script creates a large number of atomselections (in some cases up
to about 10,000) that are based on user-supplied criteria.

Then it loops through each frame of the trajectory, updates each
atomselection and outputs the number of atoms in that selection. Without
going into too much detail, this is primarily done using the following loop:

set frames [molinfo top get numframes]
for {set i 0} {${i} < ${frames}} {incr i} {
  foreach name ${names} {
    foreach j $k {
      if {$i == 0} {
        set out [open ${name}_${j}.dat w]
      } else {
        set out [open ${name}_${j}.dat a]
      }
      # setting and incrementing r is omitted here
      while {$r <= $r_final} {
        eval $${name}_${j}($r) frame ${i}
        eval $${name}_${j}($r) update
        puts -nonewline ${out} "[eval $${name}_${j}($r) num] "
      }
      puts $out ""
      close $out
    }
  }
}

I've left out setting some variables etc. but hopefully you get the gist.
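
One variation I can think of is to open every output file once before the
frame loop instead of reopening it for every frame. The following is only a
minimal sketch of that idea and I have not measured whether it helps. The
proc name write_counts and the dict sels (output file name -> list of
atomselection commands) are hypothetical and not part of my real script, and
it assumes the number of output files stays below the operating system's
open-file limit.

```tcl
# Sketch only: write per-frame atom counts, opening each output file once.
# `sels` is a hypothetical dict: output file name -> list of selection
# commands (e.g. the values returned by VMD's `atomselect`).
proc write_counts {sels nframes} {
    # Open every output file a single time, before the frame loop.
    set handles [dict create]
    dict for {fname sel_list} $sels {
        dict set handles $fname [open $fname w]
    }

    for {set i 0} {$i < $nframes} {incr i} {
        dict for {fname sel_list} $sels {
            set counts {}
            foreach sel $sel_list {
                $sel frame $i   ;# point the selection at frame i
                $sel update     ;# re-evaluate the selection for that frame
                lappend counts [$sel num]
            }
            # One space-separated line of counts per frame.
            puts [dict get $handles $fname] [join $counts " "]
        }
    }

    # Close all files once the trajectory has been processed.
    dict for {fname fh} $handles {
        close $fh
    }
}
```

Inside VMD this would be called as, for example,
write_counts $sels [molinfo top get numframes]
after building the dict from the existing selections.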

So all up I have up to 10,000 selections x 100,000 frames, so I expect
this to take a very long time to run in serial. Is there a simple way to
make it run faster without, say, running multiple instances of VMD with
fewer selections/frames and then combining the results at the end?

My end goal is to also output some other properties in the same script that
depend on the dynamic atom selections (such as measure hbonds and measure
sasa).

Any advice is welcome.

Kind regards,
Patrick