From: Vermaas, Joshua (Joshua.Vermaas_at_nrel.gov)
Date: Fri Dec 13 2019 - 12:34:23 CST

Hi Catherine,

The numbers it gives you are the RMSF per atom, since RMSF is measured per atom. You can check by checking the length of the returned list with the number of atoms in the selection:

puts "[$sel num] [llength [measure rmsf $sel]] [molinfo top numframes]"

The first two numbers will match, and the third will be different, indicating that the rmsf returns a per-atom quantity. To get an overall RMSF, people tend to average these atomic quantities together, and I left that part out since I forgot that this wasn't automatic (its been a while!). Alpha versions of VMD 1.9.4 have "measure rmsfperresidue" commands that make this easy, but if you don't want to upgrade, what you'd do is to do the averaging yourself, or only select CA atoms. If you select the CA atoms, you need to be ok with the assumption that the CA atom can stand in for the whole residue. If you want to do a more thorough job, you'd use the "protein" selection instead of "protein and backbone", and do something like this:

set ref [atomselect top "protein and noh" frame 0]
set sel [atomselect top "protein and noh"]
set resids [$sel get resid]
set uniqueresids [lsort -unique -integer $resids]
for { set f 0 } { $f < [molinfo top get numframes] } { incr f } {
$sel frame $f
$sel move [measure fit $sel $ref]
}
foreach resid $uniqueresids {
    set ressel [atomselect top "protein and resid $resid"]
    puts "RMSF for residue $resid: [vecmean [measure rmsf $ressel]]"
    $ressel delete
}

-Josh

On 12/13/19 10:56 AM, crockett c.h. (chc2g16) wrote:
Hi Josh,

I have run the script and it’s given a lot of numbers- the RMSF per each frame I assume. Is there a way to modify this for each residue- I’m assuming it’s something to do with the for loop? Sorry I maybe should have been more specific- my apologies.
Maybe something like this- its very primitive but have tried to use the end of the RMSD_per_residue.tcl provided in the tutorial.

set mol [mol new 2gbu_wbn.psf type psf waitfor all]
mol addfile 2gbu_6.dcd type dcd step 100 waitfor all ; #Note the step 100 to only read in every hundredth frame
#Usually you pre-align the frames before calculating an RMSF.
set ref [atomselect top "protein and backbone and noh" frame 0]
set sel [atomselect top "protein and backbone and noh"]
for { set f 0 } { $f < [molinfo top get numframes] } { incr f } {
$sel frame $f
$sel move [measure fit $sel $ref]
}
#With everything aligned, now measure the RMSF.
#puts "This is the RMSF for the whole protein over [molinfo top get numframes] frames: [measure rmsf $sel]"
#loop over all frames in the trajectory
    for {set frame 0} {$frame < $num_steps} {incr frame} {
                puts "Calculating rmsd for frame $frame ..."
                # get the correct frame
                $compare frame $frame
        $all frame $frame
                # compute the transformation
                set trans_mat [measure fit $compare $reference]
                # do the alignment
                $all move $trans_mat

                # compute the RMSD
                #loop through all residues
                foreach r $res {
                    set ref [atomselect $mol "protein and resid $r and noh" frame 0]
                    set comp [atomselect $mol "protein and resid $r and noh" frame $frame]
                    set rmsd($r) [expr $rmsd($r) + [measure rmsd $comp $ref]]
                    $comp delete
                    $ref delete
                }
    }

    set ave 0
                foreach r $res {
                    set rmsd($r) [expr $rmsd($r)/$num_steps]
                    # print the RMSD
                    puts "RMSD of residue $r is $rmsd($r)"
                    puts $fil " $r \t $rmsd($r)"
                    set res_b [atomselect $mol "resid $r"]
            $res_b set user $rmsd($r)
            $res_b delete
                    set ave [expr $ave + $rmsd($r)]
                }

    set ave [expr $ave/[llength $res]]
    puts " Average rmsd per residue: $ave"
    close $fil
}

Honestly, cannot thank you enough!

Catherine

Sent from Mail<https://go.microsoft.com/fwlink/?LinkId=550986> for Windows 10

________________________________
From: Vermaas, Joshua <Joshua.Vermaas_at_nrel.gov><mailto:Joshua.Vermaas_at_nrel.gov>
Sent: Friday, December 13, 2019 5:00:44 PM
To: crockett c.h. (chc2g16) <chc2g16_at_soton.ac.uk><mailto:chc2g16_at_soton.ac.uk>; vmd-l_at_ks.uiuc.edu<mailto:vmd-l_at_ks.uiuc.edu> <vmd-l_at_ks.uiuc.edu><mailto:vmd-l_at_ks.uiuc.edu>; jribeiro_at_ks.uiuc.edu<mailto:jribeiro_at_ks.uiuc.edu> <jribeiro_at_ks.uiuc.edu><mailto:jribeiro_at_ks.uiuc.edu>
Subject: RE: vmd-l: Running RMSF script on compute canada

Hi Catherine,

I think I missed the part that you were doing RMSF calculations. RMSF usually cannot be done in a bigdcd framework. Unlike RMSD, which looks at things frame by frame, RMSF is an average quantity taken relative to the average structure of the whole trajectory. Since bigdcd looks at things one frame at a time, RMSF measurements taken will be meaningless. What I'd do instead is only load in every hundredth frame, and take the RMSF over the 2100ish frames you'd be loading into VMD. This should fit into memory, and you won't need to use bigdcd. So the way I'd modify the script is as follows:

set mol [mol new 2gbu_wbn.psf type psf waitfor all]

mol addfile 2gbu_6.dcd type dcd step 100 waitfor all ; #Note the step 100 to only read in every hundredth frame

#Usually you pre-align the frames before calculating an RMSF.
set ref [atomselect top "protein and backbone and noh" frame 0]
set sel [atomselect top "protein and backbone and noh"]
for { set f 0 } { $f < [molinfo top get numframes] } { incr f } {
$sel frame $f
$sel move [measure fit $sel $ref]
}
#With everything aligned, now measure the RMSF.
puts "This is the RMSF for the whole protein over [molinfo top get numframes] frames: [measure rmsf $sel]"

-Josh

On 2019-12-12 20:21:58-07:00 owner-vmd-l_at_ks.uiuc.edu<mailto:owner-vmd-l_at_ks.uiuc.edu> wrote:

Thank you João. I seem to have fixed the rmsf error encountered before with your suggestion (which I hope I managed to implement correctly!). Now having issues with ‘invalid command name “file 6”’ – I have encountered this before but was unable to fix even with the help of Josh. Before, I just deleted the outfile variable as it would be printed in the slurm-*.out file from using sbatch on compute canada however not sure this will work since it is included in the for loop this time.

A second is ‘segmentation fault (core dumped)’ which I think is a memory issue which means somehow bigdcd is not working properly.

Any advice appreciated!

Many thanks,

Catherine

proc rmsf_resid { frame } {

        global outfile sel frame0 nf

        $sel move [measure fit $sel $frame0]

        puts "$frame: [measure rmsf $sel first 1 last 210021 step]"

puts $outfile "$frame [measure rmsf $sel first 1 last 210021 step]"

}

source bigdcd.tcl

set outfile [open rmsf.dat w]

set mol [mol new 2gbu_wbn.psf type psf waitfor all]

mol addfile 2gbu_6.dcd type dcd waitfor all

set nf [molinfo top get numframes]

set frame0 [atomselect top "protein and backbone and noh" frame 0]

set sel [atomselect top "protein and backbone and noh"]

$frame0 global

$sel global

$outfile global

for {set i 0} {$i < [$sel num]} {incr i} {

set rmsf [measure $sel first 1 last 210021 step]

  puts $outfile "[expr {$i+1}] [lindex $rmsf $i]"

}

bigdcd rmsf_resid 2gbu_6.dcd

bigdcd_wait

puts $outfile "[measure rmsf $sel first 1 last 210021 step]"

close $outfile

quit

Sent from Mail<https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgo.microsoft.com%2Ffwlink%2F%3FLinkId%3D550986&data=01%7C01%7Cchc2g16%40soton.ac.uk%7Cc3539fd949384195993108d77fee0065%7C4a5378f929f44d3ebe89669d03ada9d8%7C0&sdata=PByWVargUcP0zGPYVx3CTPUKGfOa7Ala6E6Dt2zYmy4%3D&reserved=0> for Windows 10

________________________________
From: Joao Ribeiro <jribeiro_at_ks.uiuc.edu><mailto:jribeiro_at_ks.uiuc.edu>
Sent: Thursday, December 12, 2019 10:18:58 PM
To: crockett c.h. (chc2g16) <chc2g16_at_soton.ac.uk><mailto:chc2g16_at_soton.ac.uk>; vmd-l_at_ks.uiuc.edu<mailto:vmd-l_at_ks.uiuc.edu> <vmd-l_at_ks.uiuc.edu><mailto:vmd-l_at_ks.uiuc.edu>
Subject: Re: vmd-l: Running RMSF script on compute canada

Hi Catherine,

The error message appears because the list rmsf was not defined before the for loop. Also, the sel variable inside the function rmsf_sel was never defined.

Best,

João

From: <owner-vmd-l_at_ks.uiuc.edu><mailto:owner-vmd-l_at_ks.uiuc.edu> on behalf of "crockett c.h. (chc2g16)" <chc2g16_at_soton.ac.uk><mailto:chc2g16_at_soton.ac.uk>
Date: Thursday, December 12, 2019 at 12:44 PM
To: "<vmd-l_at_ks.uiuc.edu><mailto:vmd-l_at_ks.uiuc.edu>" <vmd-l_at_ks.uiuc.edu><mailto:vmd-l_at_ks.uiuc.edu>
Subject: vmd-l: Running RMSF script on compute canada

Hi,

I’m trying to use a script (like the one below) to measure the RMSF (for all residues) of my dcd trajectory file. I know that there are lots of errors in this but the most concerning is the ‘can’t read “rmsf”: no such variable’ and then subsequently ‘measure rmsf: invalid syntax, no such keyword: atomselect0’ When this ‘measure rmsf’ command is done in the tcl window of VMD is there a source script which I would need to manually source in this case?

Many thanks, and any input appreciated as always,

Catherine

proc rmsf_resid { frame } {

    global outfile sel frame0 nf sel

    $sel mve [measure fit $sel $frame0]

    puts "$frame: [measure rmsf $sel first 1 last 210021 step]"

puts $outfile "$frame [measure rmsf $sel first 1 last 210021 step]"

}

source bigdcd.tcl

set outfile [open rmsf.dat w]

set mol [mol new 2gbu_wbn.psf type psf waitfor all]

mol addfile 2gbu_6.dcd type dcd waitfor all

set nf [molinfo top get numframes]

set frame0 [atomselect top "protein and backbone and noh" frame 0]

set sel [atomselect top "protein and backbone and noh"]

$frame0 global

$sel global

$outfile global

for {set i 0} {$i < [$sel num]} {incr i} {

   puts $outfile "[expr {$i+1}] [lindex $rmsf $i]"

bigdcd rmsf_resid 2gbu_6.dcd

bigdcd_wait

puts $outfile "[measure rmsf $sel first 1 last 210021 step]"

close $outfile

quit

Sent from Mail<https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgo.microsoft.com%2Ffwlink%2F%3FLinkId%3D550986&data=01%7C01%7Cchc2g16%40soton.ac.uk%7Cc3539fd949384195993108d77fee0065%7C4a5378f929f44d3ebe89669d03ada9d8%7C0&sdata=PByWVargUcP0zGPYVx3CTPUKGfOa7Ala6E6Dt2zYmy4%3D&reserved=0> for Windows 10