From: Leonardo Palmieri (leopalmieri1_at_gmail.com)
Date: Wed Apr 27 2022 - 11:38:26 CDT

No argument against using BigDCD.

No argument against using catDCD to remove water molecules.

Both work very well.
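
For the catDCD route, the index file can be generated with a short script.
The following is only a minimal sketch under stated assumptions: the
structure is available as a standard PDB file, water residues carry one of
the names listed in WATER_RESNAMES (an illustrative set, adjust it to your
force field), and catdcd expects zero-based atom indices in file order.
The function and file names are hypothetical.

```python
# Illustrative set of water residue names; extend it for your force field.
WATER_RESNAMES = {"HOH", "WAT", "TIP3", "TIP", "SOL", "H2O"}


def non_water_indices(pdb_lines):
    """Return zero-based indices (in file order) of atoms not in water."""
    indices = []
    atom_index = 0
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            # Columns 18-21 hold the residue name (3 or 4 characters).
            resname = line[17:21].strip()
            if resname not in WATER_RESNAMES:
                indices.append(atom_index)
            atom_index += 1
    return indices


def write_index_file(pdb_path, index_path):
    """Write one atom index per line and return how many atoms were kept."""
    with open(pdb_path) as pdb:
        indices = non_water_indices(pdb)
    with open(index_path, "w") as out:
        out.write("\n".join(str(i) for i in indices) + "\n")
    return len(indices)
```

The resulting file could then be passed to catdcd with something like
"catdcd -o nowater.dcd -i nowater.ind full.dcd". A structure file containing
the same subset of atoms is needed to load the stripped trajectory.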

Using Extensions such as Timeline is a straightforward, easy, and graphical
way to do analysis. I was just thinking of another paradigm for using VMD's
analysis capabilities. On-demand loading of the trajectory could also
enable on-the-fly analysis.
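
To make the on-demand idea concrete, here is a rough sketch of a bounded
"working set" of frames that are loaded only when requested. It is not VMD
code and not a DCD reader: it assumes a hypothetical raw file in which every
frame is natoms * 3 float32 values stored back to back, and the class name
FrameWindow is made up for illustration.

```python
import array
from collections import OrderedDict


class FrameWindow:
    """Keep at most max_frames frames in memory, loading others on demand.

    Assumed (hypothetical) layout: each frame is natoms * 3 float32 values
    (x, y, z per atom), with frames stored back to back and no header.
    """

    def __init__(self, path, natoms, max_frames=8):
        self.path = path
        self.natoms = natoms
        self.frame_bytes = natoms * 3 * 4  # 4 bytes per float32
        self.max_frames = max_frames
        self._cache = OrderedDict()  # frame index -> coordinates

    def frame(self, i):
        """Return frame i, reading it from disk only if it is not cached."""
        if i in self._cache:
            self._cache.move_to_end(i)  # mark as most recently used
            return self._cache[i]
        coords = array.array("f")
        with open(self.path, "rb") as fh:
            fh.seek(i * self.frame_bytes)
            coords.fromfile(fh, self.natoms * 3)
        self._cache[i] = coords
        if len(self._cache) > self.max_frames:
            self._cache.popitem(last=False)  # evict least recently used
        return coords

    def cached_frames(self):
        """Return the indices currently held in memory, oldest first."""
        return list(self._cache)
```

An analysis loop would call frame(i) for whichever frames it needs, and
memory use stays bounded by max_frames regardless of trajectory length.
Seeking by index like this works only because the assumed layout has
fixed-size frames.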

Obviously, I know that it's easy to say, but hard to code and implement.

Thanks!

2022-04-27 1:00 GMT-03:00, Geist, Norman <norman.geist_at_uni-greifswald.de>:
> In case most of your system is made of water and you are not
> interested in it, consider extracting a DCD without water using
> catdcd and index files. Without solvent, trajectories often become
> comfortably small.
>
> Bests
> Norman Geist
>
> On Tuesday, 26-04-2022 at 22:41, John Stone wrote:
>
> Hi,
>   So what is described there isn't what we ended up implementing.
> There is functionality along these lines in developmental parts of
> VMD now, but not in the general case yet for a few reasons.
> 1) I ended up deciding against mmap(), and instead we use another API
>    known as direct I/O, which is more portable among operating systems
>    rather than Unix-only.
>
> 2) To get the performance we want for out-of-core I/O (via any API),
>    it ultimately requires a more purpose-designed trajectory format,
>    which is what I ended up doing in the so-called ".js" file format,
>    an early version of which is described here:
>      https://link.springer.com/chapter/10.1007/978-3-642-24031-7_1
>
>
> 3) As we've implemented an increasing number of analytical and
>    visualization features with GPU acceleration, ensuring that we
>    had a way of supporting GPUs with out-of-core I/O became an
>    increasingly important consideration that was not met by any
>    existing approach.  There is now a prototype implementation in
>    VMD using a combination of 2) and 3) here, that can achieve
>    massive I/O rates (over 70GB/sec for example, to network
>    attached storage from a single DGX-2 compute node).  This
>    requires further specialization of the trajectory I/O code,
>    and I've done it for some early cases, but it needs to become
>    pervasive through VMD, and this is something that will be done
>    using modern C++ constructs that require C++14 or later.
>    Again, here we will still need special file formats to do it well,
>    so at the outset, it will only be the ".js" format that is supported.
>
> For the time being, using DCD files or other legacy file formats,
> the bigDCD script or your own "for" loop script is going to be the
> best way to go, because a lot of the readers for these existing
> trajectory formats can't do full random access as currently implemented,
> so to get reasonable performance they'll have to be processed "in-order"
> at present.
>
> If you have specific needs that require random access, let me know more
> of the details.  So far I haven't heard anything that would be an argument
> against using BigDCD or similar methods with simple scripting approaches.
>
> Best,
>   John
>
> On Tue, Apr 26, 2022 at 04:41:29PM -0300, Leonardo Palmieri wrote:
>> Well, I also found this:
>>
>> I think it was from you, John.
>>
>> From: John Stone (johns_at_ks.uiuc.edu)
>> Date: Fri Feb 15 2008 - 17:03:15 CST
>>
>> "...I'm also working on a future design change for the VMD internals that will
>> enable it to work with trajectories that are far larger than the amount
>> of physical memory in the machine through a new out-of-core trajectory
>> plugin API. I will likely implement this first using my own special
>> trajectory format and use mmap() and related kernel VM calls to allow
>> VMD to map monstrously huge MD trajectories into virtual memory.
>> The trick will be to add code for pre-fetching threads during trajectory
>> analysis and playback, and to give the OS kernel "hints" about which
>> timesteps need to be in-core and which ones can optionally be paged out.
>> Later on, I hope to have a more general implementation that can work with
>> any reasonable trajectory format (and without the need for mmap()), where
>> VMD will keep a working set of frames in-core, and will dynamically
>> load/free frames as analysis/visualization operations demand. This too
>> will attempt to use scout threads to prefetch frames on-the-fly before
>> they are needed so that the user "feels" like they were already in memory.
>> I don't have a timeline for these developments yet, I'll know more once
>> my experiments with my initial Unix-specific mmap() based implementation
>> have made significant progress."
>>
>> That's what I'm talking about...
>>
>> 2022-04-26 16:35 GMT-03:00, Leonardo Palmieri :
>> > BigDCD and scripts work well, with some problems sometimes...
>> >
>> > but the point is:
>> >
>> > I'm interested in using extensions from Extensions > Analysis, in
>> > graphical mode, while remotely and interactively accessing a node of
>> > the computer where the trajectory is stored.
>> >
>> > I'm using compressed X11 forwarding to have the graphical VMD working
>> > remotely, but the memory available per node cannot hold the entire
>> > trajectory, and VMD crashes when it runs out of memory.
>> >
>> > That's the reason.
>> >
>> > Thanks!
>> >
>> > 2022-04-26 16:17 GMT-03:00, John Stone :
>> >> Can you tell us why the bigDCD script isn't a choice for you?
>> >>
>> >> Best,
>> >>   John
>> >>
>> >> On Tue, Apr 26, 2022 at 04:12:18PM -0300, Leonardo Palmieri wrote:
>> >>> Hi everybody,
>> >>>
>> >>> Is there a way to analyse a trajectory without loading the entire
>> >>> trajectory file into the computer's RAM?
>> >>>
>> >>> I know that it is possible by choosing a subset of frames, choosing
>> >>> a larger stride, or scripting with BigDCD, but none of those is an
>> >>> option for me. Is there another way?
>> >>>
>> >>> Thanks in advance!
>> >>>
>> >>>
>> >>> --
>> >>> att
>> >>>
>> >>> Leonardo Palmieri
>> >>
>> >> --
>> >> NIH Center for Macromolecular Modeling and Bioinformatics
>> >> Beckman Institute for Advanced Science and Technology
>> >> University of Illinois, 405 N. Mathews Ave, Urbana, IL 61801
>> >> http://www.ks.uiuc.edu/~johns/           Phone: 217-244-3349
>> >> http://www.ks.uiuc.edu/Research/vmd/
>> >>

-- 
att
Leonardo Palmieri
Pai de gente
Pai de planta