From: Axel Kohlmeyer (akohlmey_at_gmail.com)
Date: Thu Jun 21 2018 - 22:18:58 CDT

if this is on a reasonably recent linux box, i'd look into mounting the
compressed archive as a (read-only, if needed) fuse file system.

e.g. for .zip format files this would work with:

mkdir /tmp/pdb-archive
fuse-zip -r $HOME/pdb-archive.zip /tmp/pdb-archive

now all files archived in $HOME/pdb-archive.zip can be accessed
transparently in /tmp/pdb-archive

to stop this:
fusermount -u /tmp/pdb-archive
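
as a rough sketch, the mount / use / unmount cycle could be wrapped in a small
script so that the archive is always unmounted again. the function names
with_zip_mounted and count_atoms are made up for illustration; only the
fuse-zip and fusermount calls are taken from the commands shown here, and
counting ATOM/HETATM records just stands in for whatever analysis is needed:

```shell
#!/bin/sh
# Hypothetical helpers; names are illustrative, not part of any tool.

# Print "<file name> <number of ATOM/HETATM records>" for each .pdb file
# in the given directory, using nothing but regular file i/o.
count_atoms() {
    dir=$1
    for f in "$dir"/*.pdb; do
        [ -e "$f" ] || continue          # no .pdb files: glob stays literal
        n=$(grep -c -E '^(ATOM|HETATM)' "$f" || true)
        printf '%s %s\n' "$(basename "$f")" "$n"
    done
}

# Mount a zip archive read-only, run a command with the mount point as its
# last argument, then unmount again whether or not the command succeeded.
with_zip_mounted() {
    archive=$1
    mnt=$2
    shift 2
    mkdir -p "$mnt" || return 1
    fuse-zip -r "$archive" "$mnt" || return 1
    "$@" "$mnt"
    status=$?
    fusermount -u "$mnt"
    return $status
}
```

this would be called as, e.g.:
with_zip_mounted $HOME/pdb-archive.zip /tmp/pdb-archive count_atoms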

no more complex trickery needed, only regular file system i/o
  axel.

(BTW: in a similarly elegant fashion, you can "mount" remote files over ssh
with fuse-sshfs)

On Thu, Jun 21, 2018 at 10:56 PM John Stone <johns_at_ks.uiuc.edu> wrote:

> Brian,
> The PDB molfile plugin in VMD ends up having to make multiple passes
> through the file in order to determine things like the final atom count,
> so it wouldn't be possible to "stream" it per se, although one could make
> a modification to the PDB plugin akin to what we did in the "webpdb"
> plugin that would effectively pull the PDB of interest out of the
> compressed archive.
> My feeling however is that all of these schemes will end up losing
> compared to a trivial brute force approach where one pulls all of
> the relevant PDB files out of the compressed archive at once, and
> then loads them all into VMD on-demand. I'm guessing that the size
> of the 1,000 PDB files is insignificant, and that there would be no
> real reason not to do it this way other than inelegance. I do think
> it would likely perform faster since the decompression and unarchiving
> step would run at much closer to peak performance than it would using
> any of the mechanisms I'm aware of for doing streaming, regardless of
> the details of the particular Tcl approach mentioned below.
>
> Best,
> John Stone
> vmd_at_ks.uiuc.edu
>
> On Fri, Jun 15, 2018 at 03:08:50PM +0000, Bennion, Brian wrote:
> > Hello,
> >
> >    I have a couple thousand pdb files that I need to analyze with a
> >    script that VMD calls.
> >
> >    These files are, however, bundled in a gzipped tar archive file.
> >
> >    I may have missed something, but tcl has a package tar that allows
> >    one to do the following:
> >
> > package require tar
> > set chan [open myfile.tar.gz]
> > zlib push gunzip $chan
> >
> > set data [::tar::get $chan Com_min.pdb -chan]
> >
> >    now the pdb file is in the data variable and not a file pointer, if
> >    I am correct.
> >
> > Can vmd "stream" pdb file data this way?
> >
> > Thanks for your thoughts.
> >
> > Brian Bennion
>
> --
> NIH Center for Macromolecular Modeling and Bioinformatics
> Beckman Institute for Advanced Science and Technology
> University of Illinois, 405 N. Mathews Ave, Urbana, IL 61801
> http://www.ks.uiuc.edu/~johns/ Phone: 217-244-3349
> http://www.ks.uiuc.edu/Research/vmd/
>
>
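
for comparison, the brute force route described in the quoted message
(unpack everything once, then load files on demand) might look roughly like
the sketch below. the function name extract_pdbs and the destination
directory are assumptions for illustration, and the archive is assumed to be
a gzipped tar file as in the original question:

```shell
#!/bin/sh
# Hypothetical helper; the name and arguments are illustrative only.

# Unpack a gzipped tar archive into a directory in a single pass and print
# the paths of all .pdb files it contained, one per line, sorted.
extract_pdbs() {
    archive=$1
    dest=$2
    mkdir -p "$dest" || return 1
    tar -xzf "$archive" -C "$dest" || return 1
    find "$dest" -type f -name '*.pdb' | sort
}
```

e.g. "extract_pdbs myfile.tar.gz /tmp/pdb-extracted > pdb-list.txt" leaves a
list of plain files that a VMD script can then load one at a time with
"mol new".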

-- 
Dr. Axel Kohlmeyer  akohlmey_at_gmail.com  http://goo.gl/1wk0
College of Science & Technology, Temple University, Philadelphia PA, USA
International Centre for Theoretical Physics, Trieste. Italy.