From: John Stone (johns_at_ks.uiuc.edu)
Date: Fri Aug 08 2008 - 14:45:52 CDT

Hi,
  CatDCD can't append files at the present time. There's no reason
it couldn't be taught to do this, but we'd also need to modify the VMD
plugins (which CatDCD uses for all of its I/O) to do this as well,
adding new open/update interfaces, and adding all of the necessary
consistency checks so that you don't unintentionally concatenate
incompatible DCD files together (e.g. mismatched endianness, differing
atom counts, different parameters for fixed/unfixed atoms, etc.).
Adding append support would require a fair amount of work to do safely,
but you could hack something together that would get you by for the short
term without much trouble. Truthfully, for your own purposes, if you know
that all of your DCD files have the exact same format and parameters, you
could use the Unix "cat" and "dd" commands to do the append. You'd just
need to skip the header of each file you append when concatenating. Keep
in mind that the frame count stored in the first file's header would then
be stale.
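
To make the "skip the header" idea concrete, here is a rough Python sketch
of the same hack. It is not how CatDCD or the VMD plugins work; it assumes
the common CHARMM/NAMD-style DCD layout (32-bit Fortran record markers,
three header records, frame count stored at byte offset 8), and it assumes
there are no fixed atoms, since files with fixed atoms store their first
frame differently. The function names are made up for illustration.

```python
import shutil
import struct


def read_header_info(path):
    """Return (byte_order, header_size, nset, natoms) for a DCD file.

    Assumes 32-bit record markers and a three-record header
    (control block, title block, atom count).
    """
    with open(path, "rb") as f:
        first = f.read(4)
        # The first record marker should be 84 in one of the two byte orders.
        for order in ("<", ">"):
            if len(first) == 4 and struct.unpack(order + "i", first)[0] == 84:
                break
        else:
            raise ValueError(f"{path}: unrecognized DCD header")
        records = []
        offset = 0
        for _ in range(3):
            f.seek(offset)
            (size,) = struct.unpack(order + "i", f.read(4))
            payload = f.read(size)
            (tail,) = struct.unpack(order + "i", f.read(4))
            if tail != size:
                raise ValueError(f"{path}: corrupt header record")
            records.append(payload)
            offset += size + 8  # payload plus leading and trailing markers
    if records[0][:4] != b"CORD":
        raise ValueError(f"{path}: missing CORD signature")
    (nset,) = struct.unpack(order + "i", records[0][4:8])
    (natoms,) = struct.unpack(order + "i", records[2][:4])
    return order, offset, nset, natoms


def append_dcd(target, extra):
    """Append the frames of `extra` to `target` in place (no new file)."""
    t_order, _, t_nset, t_natoms = read_header_info(target)
    e_order, e_hdr, e_nset, e_natoms = read_header_info(extra)
    # Minimal consistency checks; a safe tool would need many more.
    if t_order != e_order:
        raise ValueError("byte order differs between files")
    if t_natoms != e_natoms:
        raise ValueError("atom count differs between files")
    with open(target, "r+b") as out, open(extra, "rb") as src:
        src.seek(e_hdr)              # skip the header of the appended file
        out.seek(0, 2)               # go to the end of the target
        shutil.copyfileobj(src, out)
        out.seek(8)                  # frame count lives right after "CORD"
        out.write(struct.pack(t_order + "i", t_nset + e_nset))
```

Calling append_dcd("all.dcd", "latest.dcd") after each run would grow
all.dcd without rewriting it. Try it on copies first; it does not check
the timestep, unit cell flags, or fixed atom settings.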

Just out of curiosity, is there a particular reason you need to assemble
your individual files into one large contiguous file? It would be pretty
easy to teach analysis scripts like BigDCD to chew through an arbitrary
list of files in sequence one after the other, without having to assemble
them in a single contiguous file.
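
As a rough illustration of that idea (in Python rather than BigDCD's Tcl,
and not taken from BigDCD itself), the sketch below walks a list of
trajectory files in order and hands each frame to the analysis with a
running frame index, so nothing is ever merged on disk. The read_frames
argument is a placeholder for whatever per-file frame reader you already
have; its name and behavior are assumptions.

```python
from typing import Callable, Iterable, Iterator, Tuple, TypeVar

Frame = TypeVar("Frame")


def frames_in_sequence(
    paths: Iterable[str],
    read_frames: Callable[[str], Iterable[Frame]],
) -> Iterator[Tuple[int, str, Frame]]:
    """Yield (global_index, path, frame) across all files, in order.

    Only one file is read at a time, so memory use does not grow
    with the number of files.
    """
    index = 0
    for path in paths:
        for frame in read_frames(path):
            yield index, path, frame
            index += 1


def count_frames(paths, read_frames):
    """Example analysis: count the total number of frames in all files."""
    return sum(1 for _ in frames_in_sequence(paths, read_frames))
```

An analysis loop would then look like
"for i, path, frame in frames_in_sequence(sorted_paths, my_reader): ...",
with the file list built however suits your run naming scheme.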

Cheers,
  John Stone
  vmd_at_ks.uiuc.edu

On Fri, Aug 08, 2008 at 12:15:46PM -0700, Andrew Jewett wrote:
> >>On Jul 24, 2008, at 3:17 AM, pellegrini wrote:
> >> Is there a way or an option (I did not find it in the manual) to
> >> get a single dcd file of
> >> 1ns using the first 500ps of my simulation that was interrupted
> >> and the 500 other ps
> >> coming from a restart ?
>
> >From: Eric H. Lee (ericlee_at_ks.uiuc.edu)
> >Date: Thu Jul 24 2008 - 08:58:21 CDT
> >Yes, you can use the tool catdcd to join dcd files.
> >http://www.ks.uiuc.edu/Development/MDTools/catdcd/
>
> This is a helpful answer, but it doesn't quite resolve our issues.
> I have a feature request for both catdcd and NAMD. (I hesitate to
> cross-post this question to the NAMD mailing list, but perhaps I
> should.)
>
> In our lab, we run thousands of short simulations, one after
> another. (There are several reasons we are forced to do this.) We
> don't know in advance how many there will be. After each one, we
> append the most recent trajectory to the end of the existing
> trajectory. We are using catdcd to take a pair of input trajectories
> and merge them into a new output trajectory. As the trajectory gets
> longer, the time required to do this gets longer, because each time,
> catdcd must create a new file and we delete the old trajectory. As
> the file gets larger, we end up spending more time waiting for catdcd
> than we do waiting for the simulation to run.
>
> I know there are smarter ways to use catdcd. (Catdcd can merge
> more than two files together. We could wait until the end, and use
> catdcd to merge all of the files together, but it's not a very pretty
> script. We don't know how many of them there will be, but there are
> typically more than 10000. Considering that we have multiple
> replicas, we have hundreds of thousands of them. This has brought
> down our file-system before.)
>
> How hard would it be to modify catdcd so that it appends trajectory
> data to existing files (instead of creating a new file)? If this is a
> feature that already exists in catdcd, please change the documentation
> to clarify this. (Assuming catdcd is written in C or C++, I'd be
> happy to try adding this feature. I was thinking of using "-a
> outputfile" instead of "-o outputfile". Is that okay? Not sure if
> I'm up for digging into the NAMD source code yet.)
>
> -cheers
> Andrew

-- 
NIH Resource for Macromolecular Modeling and Bioinformatics
Beckman Institute for Advanced Science and Technology
University of Illinois, 405 N. Mathews Ave, Urbana, IL 61801
Email: johns_at_ks.uiuc.edu                 Phone: 217-244-3349
  WWW: http://www.ks.uiuc.edu/~johns/      Fax: 217-244-6078