From: Vermaas, Josh (vermaasj_at_msu.edu)
Date: Tue Jun 28 2022 - 16:03:06 CDT

Hi Christian,

Any reason you don’t want to use the clustering algorithm available from the measure command? https://www.ks.uiuc.edu/Research/vmd/current/ug/node138.html

If you have everything loaded up already, you can increase the number of desired clusters sequentially until every structure is classified.

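set psel [atomselect top "protein and backbone"]
set numclusters 5
set clustering [measure cluster $psel num $numclusters distfunc fitrmsd cutoff 2.0]
# The last element of the result holds the frames that were not assigned to any cluster.
while { [llength [lindex $clustering end]] > 0 } {
    # Add more clusters, estimated from the size of the last cluster found.
    set nleft [llength [lindex $clustering end]]
    set nlast [llength [lindex $clustering end-1]]
    set extra 1
    # Integer division can give 0 (or divide by zero), so always add at least one cluster.
    if { $nlast > 0 && $nleft / $nlast > 1 } { set extra [expr {$nleft / $nlast}] }
    set numclusters [expr {$numclusters + $extra}]
    # Cluster again
    set clustering [measure cluster $psel num $numclusters distfunc fitrmsd cutoff 2.0]
}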
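
Once every frame is classified, you would probably want one structure per cluster. Here is a minimal sketch of that step, not something taken from multiseq. The proc name cluster_representatives and the output file names are made up for illustration, and using the first frame listed in each cluster as its representative is an assumption of this sketch (a proper medoid would be more rigorous). The VMD-specific part only runs inside VMD and expects $clustering to hold the result of the loop above.

```tcl
# Return one frame index per non-empty cluster.
# The last element of the clustering result (the unclustered frames) is skipped.
proc cluster_representatives {clustering} {
    set reps {}
    foreach cluster [lrange $clustering 0 end-1] {
        if {[llength $cluster] > 0} {
            # Assumption: the first listed frame stands in for the whole cluster.
            lappend reps [lindex $cluster 0]
        }
    }
    return $reps
}

# Inside VMD only: write each representative frame to its own PDB file.
if {[llength [info commands atomselect]] > 0 && [info exists clustering]} {
    set all [atomselect top "protein"]
    foreach frame [cluster_representatives $clustering] {
        $all frame $frame
        $all writepdb "representative_$frame.pdb"
    }
    $all delete
}
```

The proc is plain Tcl, so it can be checked outside VMD with a hand-written list such as {{0 3 4} {1 2} {5}}, where the final sublist plays the role of the unclustered frames.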

Otherwise, it looks like multiseq is using libbiokit under the hood to do its QH, clustering. In multiseq/multiseq.tcl (it's in the plugins tree of a normal VMD installation), I think you are looking for the “getNonRedundantStructures” calls. libbiokit can also do QR factorization, but based on the code, I think it only does this for sequences.

-Josh

From: <owner-vmd-l_at_ks.uiuc.edu> on behalf of Christian Seitz <cseitz_at_ucsd.edu>
Date: Tuesday, June 28, 2022 at 4:12 PM
To: "vmd-l_at_ks.uiuc.edu" <vmd-l_at_ks.uiuc.edu>
Subject: vmd-l: MultiSeq QR structure factorization on the command line

Hello,

I am trying to use MultiSeq's structure QR factorization to select structurally distinct protein conformations out of a trajectory. This can be done in the VMD GUI with a limited number of pdb files, but can it be done on the command line? I have very long trajectories (100,000 frames) and loading in all these frames to the MultiSeq GUI would take over a week at my current rate. Considering I have multiple systems, I'm looking for a faster way to do this. I see that years ago someone asked the same question (https://www.ks.uiuc.edu/Research/vmd/mailing_list/vmd-l/17207.html); the multiseq.tcl referenced in that answer is not included in the tutorial, and the tcl script that is included does not use factorization. Has there been any progress on making MultiSeq scriptable? Or is there a better way to accomplish this? Thanks for your help!

Best,
Christian

--
Christian Seitz
PhD Candidate, Biochemistry & Biophysics | UC-San Diego
NSF GRFP Fellow, Amgen Scholar
McCammon lab <https://mccammon.ucsd.edu/> and Amaro lab <https://amarolab.ucsd.edu/>
cseitz_at_ucsd.edu
<http://www.linkedin.com/in/christianseitz21>