From: L. Michel Espinoza-Fonseca (mef_at_ddt.biochem.umn.edu)
Date: Tue Nov 20 2007 - 18:39:03 CST

John,

Thank you very much for the explanation, it was very enlightening!

Cheers,
Michel

2007/11/21, John Stone <johns_at_ks.uiuc.edu>:
>
> Michel,
> To answer your question, in short, VMD has evolved tremendously from
> where it started...
> The main reason VMD keeps timesteps in physical memory is that it
> was originally more of a visualization tool and less of an analysis tool;
> specifically, it was first developed to run in the CAVE,
> an immersive VR environment that requires screen redraws at 20-30 fps in
> order to be usable at all. In that type of environment, it's typically
> not practical to load things from disk during visualization unless you have
> a huge RAID array and use separate I/O threads for read-ahead while the
> main thread is doing visualization.
>
> For the purposes of desktop visualization, loading timesteps from disk
> on-the-fly is typically annoyingly slow for anything but batch movie
> making, unless you have a very small simulation. That's the reason
> why it was originally written to load timesteps directly into memory.
> As the program has become much more powerful on the analysis side,
> we've progressively taken steps to make it possible to run batch
> analyses, and added significant functionality in support of
> out-of-core scripting, e.g. helpers like BigDCD.
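>
> For anyone unfamiliar with it: BigDCD is a Tcl helper script from the
> VMD script library that reads a trajectory one frame at a time and
> calls a user-supplied analysis procedure on each frame, so only a
> single timestep is resident in memory. A minimal sketch follows (the
> file names and atom selection are hypothetical, and the exact bigdcd
> arguments depend on the version of bigdcd.tcl you have):
>
>   source bigdcd.tcl
>
>   # Callback invoked by bigdcd once per frame; here it just prints
>   # the radius of gyration of the C-alpha atoms.
>   proc myanalysis { frame } {
>       set sel [atomselect top "name CA"]
>       puts "$frame: [measure rgyr $sel]"
>       $sel delete
>   }
>
>   mol new protein.psf type psf waitfor all
>   bigdcd myanalysis protein.dcd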
>
> The next step is to teach the VMD internals and the molfile plugins to
> directly support the use of arbitrarily large out-of-core data and use
> program-managed I/O to load and evict timesteps as necessary. The
> script-based approach we've used for the last five years has worked well for
> the types of analyses people did in VMD up to this point, but with
> the large number of plugins and extensions that VMD now enjoys,
> it makes much more sense to do this sort of thing in VMD itself
> and not require script and/or plugin writers to worry
> about this in most cases.
>
> Cheers,
> John Stone
> vmd_at_ks.uiuc.edu
>
> On Wed, Nov 21, 2007 at 01:03:25AM +0100, L. Michel Espinoza-Fonseca wrote:
> > Hi John,
> >
> > That would be a great improvement in VMD. I know it sounds a bit
> > ignorant, but I always wondered why VMD uses physical memory to
> > store trajectories instead of, for example, temporarily storing
> > them on the hard disk. I also have huge trajectories (like 10 GB)
> > and indeed, I always need to use the big machines to perform the
> > analysis and/or to reduce the size of the files by cutting them down to
> > just the parts of the system I'm interested in. (Un)fortunately, it seems
> > that we're reaching a point where we can easily get tens of
> > nanoseconds for systems with more than 200K atoms. That creates a lot
> > of trouble when analyzing the huge trajectories created by NAMD :).
> >
> > Cheers,
> > Michel
> >
> > 2007/11/21, John Stone <johns_at_ks.uiuc.edu>:
> > >
> > > Hi Marcos, Oliver,
> > > While inconvenient due to the way the authors of PMEPot and VolMap
> > > wrote their code, the analysis can still be done using BigDCD by changing the
> > > BigDCD script to load batches of frames before triggering an execution
> > > of VolMap or PMEPot. To work around the limitation of these two
> > > codes, you'd have to do the averaging over the batches in your own script,
> > > as neither of these two tools knows how to let the user "continue"
> > > a partial calculation. This is something that would be best solved in
> > > a future rev of VMD by fixing both PMEPot and VolMap to allow an
> > > existing calculation to be done in stages, or to allow continuation
> > > by incorporating more frames, etc.
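> > >
> > > As a rough illustration of that batching idea (a sketch only, with
> > > hypothetical file names, batch size, selection, and resolution; the
> > > volmap options are as documented in the VMD User's Guide):
> > >
> > >   set mol [mol new system.psf type psf waitfor all]
> > >   set framesPerBatch 500
> > >   set nbatches 10
> > >   for {set i 0} {$i < $nbatches} {incr i} {
> > >       set beg [expr {$i * $framesPerBatch}]
> > >       set end [expr {$beg + $framesPerBatch - 1}]
> > >       animate delete all   ;# evict the previous batch from memory
> > >       mol addfile traj.dcd type dcd first $beg last $end waitfor all
> > >       set sel [atomselect $mol "water"]
> > >       volmap density $sel -res 1.0 -allframes -combine avg -o batch_$i.dx
> > >       $sel delete
> > >   }
> > >
> > > The per-batch .dx maps then still have to be averaged by your own
> > > script, since neither tool can continue a partial calculation.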
> > >
> > > I've already been planning to change the internals of VMD to allow
> > > out-of-core data processing for huge datasets that can't possibly fit
> > > into the physical memory of the host machine, but it will probably be
> > > at least a couple more months before I have time to work on that seriously
> > > due to various ongoing efforts with NAMD and other projects.
> > >
> > > When I implement that feature, it will (almost) entirely eliminate the
> > > need for scripts like BigDCD, as VMD will handle this
> > > automatically.
> > >
> > > Cheers,
> > > John Stone
> > > vmd_at_ks.uiuc.edu
> > >
> > > On Tue, Nov 20, 2007 at 02:47:21PM -0600, Marcos Sotomayor wrote:
> > > >
> > > > Hi John,
> > > >
> > > > I have had the same problem that Oliver mentioned. It would indeed be
> > > > great and very useful if one could analyze big trajectories without using
> > > > all the RAM of the most powerful computer in the lab...
> > > >
> > > > I know about and have used bigdcd before, but so far I don't see any easy
> > > > way to use it along with volmap and pmepot (Am I missing something?).
> > > >
> > > > Regards,
> > > > Marcos.
> > > >
> > > > ---------- Forwarded message ----------
> > > > Date: Tue, 20 Nov 2007 15:28:45 -0500
> > > > From: Oliver Beckstein <orbeckst_at_jhmi.edu>
> > > > To: vmd-l_at_ks.uiuc.edu
> > > > Subject: vmd-l: analysing big trajectories
> > > >
> > > > Hi,
> > > >
> > > > Is there a way to analyse trajectories that are bigger than the available
> > > > RAM? For instance, I have trajectories > 5GiB in size that I would like to
> > > > analyze with VolMap but they can't be loaded because VMD insists on keeping
> > > > the whole trajectory in memory.
> > > >
> > > > A cumbersome work-around would be to split the trajectory into smaller
> > > > chunks, run volmap on each chunk, then average the resulting dx files.
> > > > However, I can think of situations in which a simple average is not enough (for
> > > > instance, time correlation functions), and it would be very convenient if
> > > > one could just have a (python-style) iterator over a trajectory (similar to
> > > > the 'for timestep in universe.dcd: ....' idiom in
> > > > http://code.google.com/p/mdanalysis/ ).
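> > > >
> > > > A crude approximation of such an iterator is already possible in VMD's
> > > > Tcl interface, at the cost of reopening the trajectory for every frame.
> > > > A sketch only, with hypothetical file names and frame count:
> > > >
> > > >   set mol [mol new system.psf type psf waitfor all]
> > > >   set nframes 10000
> > > >   for {set i 0} {$i < $nframes} {incr i} {
> > > >       animate delete all   ;# keep at most one timestep in memory
> > > >       mol addfile traj.dcd type dcd first $i last $i waitfor all
> > > >       # ... per-frame analysis on frame 0 of $mol goes here ...
> > > >   }
> > > >
> > > > (BigDCD does essentially this without the repeated reopening.)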
> > > >
> > > > (Note: I don't think that increasing swap space is a solution because that
> > > > leads to the computer almost grinding to a halt when the trajectory is
> > > > loaded.)
> > > >
> > > > Thanks,
> > > > Oliver
> > > >
> > > > --
> > > > Oliver Beckstein * orbeckst_at_jhmi.edu
> > > >
> > > > Johns Hopkins University, School of Medicine
> > > > Dept. of Physiology, Biophysics 206
> > > > 725 N. Wolfe St
> > > > Baltimore, MD 21205, USA
> > > >
> > > > Tel.: +1 (410) 614-4435
> > >
> > > --
> > > NIH Resource for Macromolecular Modeling and Bioinformatics
> > > Beckman Institute for Advanced Science and Technology
> > > University of Illinois, 405 N. Mathews Ave, Urbana, IL 61801
> > > Email: johns_at_ks.uiuc.edu Phone: 217-244-3349
> > > WWW: http://www.ks.uiuc.edu/~johns/ Fax: 217-244-6078
> > >
>
> --
> NIH Resource for Macromolecular Modeling and Bioinformatics
> Beckman Institute for Advanced Science and Technology
> University of Illinois, 405 N. Mathews Ave, Urbana, IL 61801
> Email: johns_at_ks.uiuc.edu Phone: 217-244-3349
> WWW: http://www.ks.uiuc.edu/~johns/ Fax: 217-244-6078
>