From: John Stone (johns_at_ks.uiuc.edu)
Date: Tue Nov 20 2007 - 18:32:23 CST

Michel,
  To answer your question, in short, VMD has evolved tremendously from
where it started...
  The main reason VMD keeps timesteps in physical memory is that it
was originally more of a visualization tool than an analysis tool;
specifically, it was developed to run in the CAVE, an immersive VR
environment that requires screen redraws at 20-30 fps in order to be
usable at all. In that type of environment, it's typically not
practical to load frames from disk during visualization unless you have
a huge RAID array and use separate I/O threads for read-ahead while the
main thread is doing the visualization.

For the purposes of desktop visualization, loading timesteps from disk
on-the-fly is typically annoyingly slow for anything but batch movie
making, unless you have a very small simulation. That's the reason
why it was originally written to load timesteps directly into memory.
As the program has become much more powerful on the analysis side,
we've progressively taken steps to make it possible to run batch
analyses, and added significant functionality in support of
out-of-core scripting, such as BigDCD.
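
For anyone who hasn't used it, a typical BigDCD analysis looks
something like the sketch below. (This is only a sketch: the file
names are placeholders, and the exact bigdcd argument list and the
name of the wait helper vary between versions of the script.)

    # bigdcd calls the named proc once per frame, so only a single
    # frame is resident in memory at any time
    source bigdcd.tcl

    proc perframe { frame } {
        set sel [atomselect top "protein"]
        puts "frame $frame: rgyr = [measure rgyr $sel]"
        $sel delete
    }

    mol new mysystem.psf waitfor all
    bigdcd perframe mytrajectory.dcd
    bigdcd_wait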

The next step is to teach the VMD internals and the molfile plugins to
directly support arbitrarily large out-of-core data and to use
program-managed I/O to load and evict timesteps as necessary. The
script-based approach we've used for the last five years has worked well
for the types of analyses people have done in VMD up to this point, but
with the large number of plugins and extensions that VMD now enjoys,
it makes much more sense to do this sort of thing in VMD itself
rather than requiring script and/or plugin writers to worry
about it in most cases.
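
In the meantime, that load-and-evict pattern can be approximated at
the script level. A rough sketch (the batch size, total frame count,
and file names here are all made up for illustration):

    # process a long trajectory in fixed-size windows, evicting each
    # window from memory before loading the next
    set batch   100
    set nframes 10000   ;# total frame count, assumed known in advance

    mol new mysystem.psf waitfor all
    for {set first 0} {$first < $nframes} {incr first $batch} {
        set last [expr {$first + $batch - 1}]
        mol addfile mytrajectory.dcd first $first last $last waitfor all
        # ... analyze the frames currently in memory ...
        animate delete all   ;# evict this window
    }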

Cheers,
  John Stone
  vmd_at_ks.uiuc.edu

On Wed, Nov 21, 2007 at 01:03:25AM +0100, L. Michel Espinoza-Fonseca wrote:
> Hi John,
>
> That would be a great improvement in VMD. I know it sounds a bit
> ignorant, but I've always wondered why VMD uses physical memory to
> store trajectories instead of, for example, temporarily storing them
> on the hard disk. I also have huge trajectories (around 10 GB), and
> indeed I always need to use the big machines to perform the analysis
> and/or to reduce the size of the files by keeping only the parts of
> the system I'm interested in. (Un)fortunately, it seems that we're
> reaching a point where we can easily get tens of nanoseconds for
> systems with more than 200K atoms. That creates a lot of trouble when
> analyzing the huge trajectories created by NAMD :).
>
> Cheers,
> Michel
>
> 2007/11/21, John Stone <johns_at_ks.uiuc.edu>:
> >
> > Hi Marcos, Oliver,
> > While inconvenient due to the way the authors of PMEPot and VolMap
> > wrote their code, it can still be done using BigDCD, by changing the
> > BigDCD script to load batches of frames before triggering an execution
> > of VolMap or PMEPot. To work around the limitation of these two codes,
> > you'd have to do the averaging of the batches in your own script, as
> > neither of these tools knows how to let the user "continue" a partial
> > calculation. This is something that would be best solved in a future
> > rev of VMD by fixing both PMEPot and VolMap to allow an existing
> > calculation to be done in stages, or to allow continuation by
> > incorporating more frames, etc.
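> >
> > Concretely, once a batch of frames is loaded, you'd run something
> > like the following for VolMap (a sketch only; here $i stands for a
> > batch counter your script maintains, and the selection text is just
> > an example):
> >
> >   # average density map over the frames currently in memory,
> >   # written out as one map per batch
> >   set sel [atomselect top "water"]
> >   volmap density $sel -res 1.0 -allframes -combine avg -o batch$i.dx
> >   $sel delete
> >
> > If the batches aren't all the same size, remember to weight each
> > per-batch map by its frame count when you combine them.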
> >
> > I've already been planning to change the internals of VMD to allow
> > out-of-core data processing for huge datasets that can't possibly fit
> > into the physical memory of the host machine, but it will probably be
> > at least a couple more months before I have time to work on that seriously
> > due to various ongoing efforts with NAMD and other projects.
> >
> > When I implement that feature, it will (almost) entirely eliminate the
> > need for scripts like BigDCD, as VMD will handle this automatically.
> >
> > Cheers,
> > John Stone
> > vmd_at_ks.uiuc.edu
> >
> > On Tue, Nov 20, 2007 at 02:47:21PM -0600, Marcos Sotomayor wrote:
> > >
> > > Hi John,
> > >
> > > I have had the same problem that Oliver mentioned. It would indeed be
> > > great and very useful if one could analyze big trajectories without using
> > > all the RAM of the most powerful computer in the lab...
> > >
> > > I know about and have used bigdcd before, but so far I don't see any easy
> > > way to use it along with volmap and pmepot (Am I missing something?).
> > >
> > > Regards,
> > > Marcos.
> > >
> > > ---------- Forwarded message ----------
> > > Date: Tue, 20 Nov 2007 15:28:45 -0500
> > > From: Oliver Beckstein <orbeckst_at_jhmi.edu>
> > > To: vmd-l_at_ks.uiuc.edu
> > > Subject: vmd-l: analysing big trajectories
> > >
> > > Hi,
> > >
> > > is there a way to analyze trajectories that are bigger than the available
> > > RAM? For instance, I have trajectories over 5 GiB in size that I would
> > > like to analyze with VolMap, but they can't be loaded because VMD insists
> > > on keeping the whole trajectory in memory.
> > >
> > > A cumbersome workaround would be to split the trajectory into smaller
> > > chunks, run volmap on each chunk, then average the resulting dx files.
> > > However, I can think of situations where a simple average is not enough
> > > (for instance, for time correlation functions), and it would be very
> > > convenient if one could just have a (Python-style) iterator over a
> > > trajectory (similar to the 'for timestep in universe.dcd: ....' idiom in
> > > http://code.google.com/p/mdanalysis/ ).
> > >
> > > (Note: I don't think that increasing swap space is a solution, because
> > > that leads to the computer almost grinding to a halt when the trajectory
> > > is loaded.)
> > >
> > > Thanks,
> > > Oliver
> > >
> > > --
> > > Oliver Beckstein * orbeckst_at_jhmi.edu
> > >
> > > Johns Hopkins University, School of Medicine
> > > Dept. of Physiology, Biophysics 206
> > > 725 N. Wolfe St
> > > Baltimore, MD 21205, USA
> > >
> > > Tel.: +1 (410) 614-4435
> >
> > --
> > NIH Resource for Macromolecular Modeling and Bioinformatics
> > Beckman Institute for Advanced Science and Technology
> > University of Illinois, 405 N. Mathews Ave, Urbana, IL 61801
> > Email: johns_at_ks.uiuc.edu Phone: 217-244-3349
> > WWW: http://www.ks.uiuc.edu/~johns/ Fax: 217-244-6078
> >

-- 
NIH Resource for Macromolecular Modeling and Bioinformatics
Beckman Institute for Advanced Science and Technology
University of Illinois, 405 N. Mathews Ave, Urbana, IL 61801
Email: johns_at_ks.uiuc.edu                 Phone: 217-244-3349
  WWW: http://www.ks.uiuc.edu/~johns/      Fax: 217-244-6078