Re: Get number of timesteps from DCD via script

From: Bjoern Olausson (namdlist_at_googlemail.com)
Date: Wed Feb 09 2011 - 08:32:13 CST

On Wednesday 09 February 2011 14:43:17 Axel Kohlmeyer wrote:
> the situation of reading dcd is fairly complicated, as you could see
> from reading the dcd plugin code from VMD. there are essentially
> three issues to worry about:
>
> - there are different variants of the .dcd file format: coordinates,
> velocities, accelerations, and the coordinates can be with or
> without cell dimensions. also different codes output different
> properties in different locations, or just set them to some
> random or useless values.
>
> - the "width" of some of the entries, particularly the bytes that you call
> HDR, i.e. the record length marker, can change depending on what
> compiler and platform you were on. most fortran compilers use 32-bit
> record length markers, even on 64-bit platforms. some 64-bit, some
> have a flag to set this. also the size of all the integers can be
> doubled, if the fortran code outputting the .dcd file was compiled with
> 64-bit integer support.
>
> - finally it makes a difference whether you write binary data on a
> big endian or little endian machine (or rather if you are reading the
> data on a machine with the same or a different byte order)
>
> now, if you only care about files from one specific machine for you
> personally, you can just go along and forget the rest, but if you want
> to write something that is portable across multiple MD packages,
> i welcome you to the hell of arbitrary and impractical file formats.
>
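
To make the record-marker and byte-order points above concrete, here is a
minimal Python sketch that walks the first few Fortran unformatted records of
a file and reports their payload lengths. The function name
`fortran_record_lengths` and its parameters are made up for illustration and
are not part of NAMD or VMD. It assumes every record is framed by a leading
and a trailing length marker of the same width; a wrong guess for byte order
or marker width should show up as a mismatch.

```python
import io
import struct


def fortran_record_lengths(stream, order="<", marker_bytes=4, limit=3):
    """Return the payload lengths of the first `limit` Fortran records.

    Hypothetical helper: `order` is "<" (little endian) or ">" (big endian),
    `marker_bytes` is 4 or 8, depending on how the writing code was compiled.
    """
    fmt = order + ("i" if marker_bytes == 4 else "q")
    lengths = []
    for _ in range(limit):
        head = stream.read(marker_bytes)
        if len(head) < marker_bytes:
            break  # ran out of data
        (n,) = struct.unpack(fmt, head)
        if n < 0:
            raise ValueError("negative record length: wrong byte order or width?")
        stream.seek(n, io.SEEK_CUR)  # skip the payload
        tail = stream.read(marker_bytes)
        if len(tail) < marker_bytes or struct.unpack(fmt, tail)[0] != n:
            raise ValueError("trailing marker mismatch: wrong byte order or width?")
        lengths.append(n)
    return lengths
```

Trying the four combinations of `order` and `marker_bytes` on a file opened in
binary mode and keeping the one that does not raise is a cheap way to find out
which variant a given machine writes. For a CHARMM-style DCD with 32-bit
markers I would expect the first record to be 84 bytes long, but treat that as
an assumption to check against your own files.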

Thanks, that was informative!
For now I would be happy to figure out the proper header layout for NAMD 2.7
64-bit compiled with ICC 11.1.046 (ELF 64-bit LSB executable, AMD x86-64,
version 1 (SYSV), for GNU/Linux 2.6.9, dynamically linked (uses shared libs),
for GNU/Linux 2.6.9) which only runs on one cluster.

> FWIW, most example codes that you can find will only read a subset
> of the available files. the VMD dcd plugin reads more variants than any
> other code that i know, but there is a lot of hard work and testing (mainly
> by john) that went into it and every once in a while some issue still comes
> up (although recently more to prove that some dcd format producing code
> is not writing proper files).
>
That's really odd - and I was complaining about the PDB file format ;-)

Do you have any hints on how to figure this out? I tried looking at the
header in a hex editor, but apart from the values I had already figured out,
I couldn't find any other known values (such as PBC, integration timestep,
number of atoms), which raises the question of what information NAMD stores
in the header.

> the main problem is, there is no documented standard. the "standard"
> is whatever CHARMM outputs, and even that has been changing.
>
Hmm, is there at least one for NAMD (2.7)? Of course one would have to pay
attention to the endianness, bit width and precision. But if the bit width,
endianness and precision are known, the sequence of the values should be the
same for all DCDs written by NAMD, shouldn't it?

Thanks again for this information. It explains why my interpretation of some
Fortran code was not working, and now that I know I am not entirely stupid,
it is less frustrating ;-)

Cheers,
Bjoern

-- 
Bjoern Olausson
Martin-Luther-Universität Halle-Wittenberg 
Fachbereich Biochemie/Biotechnologie
Kurt-Mothes-Str. 3
06120 Halle/Saale
Phone: +49-345-55-24942
