Re: Trouble loading frames from large DCD files

From: Maxim Belkin (mbelkin_at_ks.uiuc.edu)
Date: Thu Dec 19 2013 - 16:40:27 CST

Hi Alex,

In your script you create an atom selection (“All”) but never use it. This is where your problem comes from: a new selection is created on every pass through the loop and none of them is ever freed. Either don’t create it, or delete it once you no longer need it: $All delete
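
A minimal sketch of the second option, assuming the chunk loop from the script below. It is written as plain Tcl, without the "\$" escaping the shell heredoc needs, and the dihedral measurements are left as a placeholder comment:

```tcl
# Sketch only: the chunk loop from the quoted script, as plain Tcl.
# stepsize, startdih, framestotal, DcdFile and VmdOut are assumed to be
# set exactly as in the original script.
for {set startframe $startdih} {$startframe < $framestotal} {incr startframe $stepsize} {
    set lastframe [expr {$startframe + $stepsize - 1}]
    mol new $DcdFile type dcd first $startframe last $lastframe step 1 waitfor all
    set NumFrm [molinfo top get numframes]
    set All [atomselect top all]

    # ... measure the dihedrals for frames 0 .. NumFrm-1 and write them to $VmdOut ...

    # Free the selection before removing the molecule; otherwise one
    # selection per chunk stays allocated for the rest of the VMD session.
    $All delete
    mol delete all
}
```

Since the selection is never used for the dihedral measurements, simply dropping the "set All [atomselect top all]" line should have the same effect.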

Maxim

On Dec 19, 2013, at 4:23 PM, Alex Utev (CMP) <A.Utev_at_uea.ac.uk> wrote:

> I'm having problems extracting the dihedral angles from a very large .dcd file (500M steps, about 50 GB).
> I can't use the 'bigdcd' script because I am not running this on my personal machine but on a University Cluster.
>
> Essentially, the script below splits the .dcd into a number of sections, and the dihedrals from each section are written
> to the same .txt file (so below I have 1M steps per .txt file, corresponding to a total of 500 .txt files).
>
> Each 1M-step section is then further split into smaller chunks of about 1K frames to lower the memory usage (so only 1K frames are loaded at a time,
> and they are deleted before the next 1K are loaded).
>
> Everything seems to be working fine but when the script gets to loading frames past ~50M it becomes so slow that it practically just freezes.
> I've double-checked the script and it does indeed seem to only be loading 1K frames at a time and the "mol delete all" command seems to be working.
> I don't know if this has something to do with VMD simply having a problem with such a vast number of frames or if perhaps I am somehow using the "waitfor all"
> command incorrectly?
>
> Any help would be much appreciated!
>
> The script I am using is below:
>
>
> dih_to_calc=500000000
> dih_per_file=1000000
> num_dih_files=$((dih_to_calc/dih_per_file))
> step_size=1000
>
> for ((idihfile=1; idihfile <=$num_dih_files; idihfile++))
>
> do
>
> DcdFile=protein_output1.dcd
> OutFile="Dih"$ifile"_"$idihfile".txt"
> startdih=$(((idihfile-1) * dih_per_file))
> enddih=$((startdih + dih_per_file))
>
> cat << eof > dih.tcl
>
> set stepsize $step_size
> set framestotal $enddih
> set VmdOut [open "$OutFile" w]
> mol addfile protein1.psf
>
> for {set startframe $startdih} {\$startframe < \$framestotal} {incr startframe \$stepsize} {
>
> set lastframe [expr \$startframe + \$stepsize - 1]
> mol new $DcdFile type dcd first \$startframe last \$lastframe step 1 waitfor all
> set NumFrm [molinfo top get numframes]
> set NumMol [mol list all]
> set All [atomselect top all]
>
> for {set ifrm 0 } {\$ifrm < \$NumFrm } {incr ifrm} {
> set Dih1 [measure dihed {${DihIndex[1]}} frame \$ifrm]
> set Dih2 [measure dihed {${DihIndex[2]}} frame \$ifrm]
> set Dih3 [measure dihed {${DihIndex[3]}} frame \$ifrm]
> set i [expr \$ifrm + 1 + \$startframe - $startdih]
> puts \$VmdOut "\$i \$Dih1 \$Dih2 \$Dih3"
> }
>
> mol delete all
>
> }
>
> quit
> eof
>
> 'vmd' -dispdev text -e dih.tcl > dih.log
>
> done
