Re: Get number of timesteps from DCD via script

From: Axel Kohlmeyer (akohlmey_at_gmail.com)
Date: Wed Feb 09 2011 - 07:43:17 CST

>
> Mhhh, as I sayed, I have noe clue abotu Fortran and it is a hard time reading
> the code and editing it. I understand the above code the following way:

bjoern,

the situation of reading dcd is fairly complicated, as you could see
from reading the dcd plugin code from VMD. there are essentially
three issues to worry about:

- there are different variants of the .dcd file format: coordinates,
  velocities, accelerations, and the coordinates can be with or
  without cell dimensions. also different codes output different
  properties in different locations, or just set them to some
  random or useless values.

- the "width" of some of the entries, particularly the bytes that you call
  HDR, i.e. the record length marker, can change depending on what
  compiler and platform you were on. most fortran compilers use 32-bit
  record length markers, even on 64-bit platforms. some 64-bit, some
  have a flag to set this. also the size of all the integers can be doubled,
  if the fortran code outputting the .dcd file was compiled with 64-bit
  integer support.

- finally it makes a difference whether you write binary data on a
  big endian or little endian machine (or rather if you are reading the
  data on a machine with the same or different endian byteordering)

now, if you only care about files from one specific machine for you
personally, you can just go along and forget the rest, but if you want
to write something that is portable across multiple MD packages,
i welcome you to the hell of arbitrary and unpractical file formats.

FWIW, most example codes that you can find will only read a subset
of the available files. the VMD dcd plugin reads more variants than any
other code that i know, but there is a lot of hard work and testing (mainly
by john) that went into it and every once in a while some issue still comes
up (although recently more to prove that some dcd format producing code
is not writing proper files).

the main problem is, there is no documented standard. the "standard"
is whatever CHARMM outputs, and even that has been changing.

cheers,
    axel.

>
> read(243) HDR,(ICNTRL(j),j=1,9),delta,(ICNTRL(j),j=11,20)
>
> Read 4 Chars --> COOR
> Read 1 int(4) 1--> # of configurations
> Read 1 int(4) 2--> time step of first configuration
> Read 1 int(4) 3--> frequency of configurations
> Read 1 int(4) 4--> last time step in dcd file
> Read 1 int(4) 5--> 0
> Read 1 int(4) 6--> 0
> Read 1 int(4) 7--> 0
> Read 1 int(4) 8--> 0
> Read 1 int(4) 9--> 0
> Read 1 real(4)1--> timestep of simulation
> Read 1 int(4) 11--> 0
> Read 1 int(4) 12--> 0
> Read 1 int(4) 13--> 0
> Read 1 int(4) 14--> 0
> Read 1 int(4) 15--> 0
> Read 1 int(4) 16--> 0
> Read 1 int(4) 17--> 0
> Read 1 int(4) 18--> 0
> Read 1 int(4) 19--> 0
> Read 1 int(4) 20--> 0
> READ 1 int(4) 1--> Length of title
> READ 1 char(80) --> title
> READ 1 int(4) --> Number of Atoms
>
>
> Where is the PBC here?
>
> This is, at least in my opinion, exactly what I am doing in my python script.
> Except that I am not sure how to treat the "real" type i python. They types
> available are listed here:
> http://docs.python.org/library/struct.html#format-characters
>
> Sorry to bug you with this, but it is frustrating without a proper
> documentation and not being able to easily read Fortran.
>
> Could anyone complement/correct the header structure:
> HDR NSET ISTART NSAVC ILAST ZEROS5 DELTA ZEROS9
> i4c i i i i 5i real 9i
>
> After the DELTA my script breaks and I am not sure if it id because of a wrong
> type or some other missing value.
>
> http://paste.pocoo.org/show/334948/
>
> #!/bin/env python
> # -*- coding: utf-8 -*-
> #
> #
>
> import sys, struct
>
> #Info: SIMULATION PARAMETERS:
> #Info: TIMESTEP               2
> #Info: NUMBER OF STEPS        1500000
> #Info: STEPS PER CYCLE        10
> #Info: PERIODIC CELL BASIS 1  93.1932 0 0
> #Info: PERIODIC CELL BASIS 2  0 93.1932 0
> #Info: PERIODIC CELL BASIS 3  0 0 143.704
> #Info: PERIODIC CELL CENTER   0 0 0
>
> f = open(sys.argv[1], 'rb')
> HDR = struct.unpack('i4c',f.read(struct.calcsize('i4c')))
> NSET = struct.unpack('i', f.read(struct.calcsize('i')))[0]
> ISTART = struct.unpack('i', f.read(struct.calcsize('i')))[0]
> NSAVC = struct.unpack('i', f.read(struct.calcsize('i')))[0]
> ILAST = struct.unpack('i', f.read(struct.calcsize('i')))
> ZEROS5 = struct.unpack('5i', f.read(struct.calcsize('5i')))
> DELTA = struct.unpack('f',f.read(struct.calcsize('f')))
> ZEROS9 = struct.unpack('9i', f.read(struct.calcsize('9i')))
> NTITL = struct.unpack('i',f.read(struct.calcsize('i')))
> NTITL = struct.unpack('d',f.read(struct.calcsize('d')))
> TITLE = struct.unpack('i200c',f.read(struct.calcsize('i200c')))
> NATOM = struct.unpack('i',f.read(struct.calcsize('i')))
>
> ISTOP = ISTART + (NSET * NSAVC)
> NSTP = ISTOP - ISTART
>
> print "%(ID)-13s %(DATA)s" %{"ID": "HDR:", "DATA": HDR}
> print "%(ID)-13s %(DATA)s" %{"ID": "Coor Sets:", "DATA": NSET}
> print "%(ID)-13s %(DATA)s" %{"ID": "First Step:", "DATA": ISTART}
> print "%(ID)-13s %(DATA)s" %{"ID": "Save Int.:", "DATA": NSAVC}
> print "%(ID)-13s %(DATA)s" %{"ID": "Last Step:", "DATA": ILAST}
> print "%(ID)-13s %(DATA)s" %{"ID": "5-ZEROS:", "DATA": ZEROS5}
> print "%(ID)-13s %(DATA)s" %{"ID": "Int. Time:", "DATA": DELTA}
> print "%(ID)-13s %(DATA)s" %{"ID": "9-ZEROS:", "DATA": ZEROS9}
> print "%(ID)-13s %(DATA)s" %{"ID": "Title chars:", "DATA": NTITL}
> print "%(ID)-13s %(DATA)s" %{"ID": "Title:", "DATA": TITLE}
> print "%(ID)-13s %(DATA)s" %{"ID": "Atoms:", "DATA": NATOM}
>
> print "\n%(ID)-13s %(DATA)i" %{"ID": "Last Step:", "DATA": ISTOP}
> print "%(ID)-13s %(DATA)i" %{"ID": "Num Steps:", "DATA": NSTP}
>
>
> And the output:
> blub_at_repulsion $ python dcd-header.py ../bdvm/namd/membrane30_400-bdvm-rna-
> smd/run002/memb30_run400-bdvm-rna_npgt.dcd
>
> HDR:          (84, 'C', 'O', 'R', 'D')
> Coor Sets:    1000
> First Step:   1500500
> Save Int.:    500
> Last Step:    (2000000,)
> 5-ZEROS:      (0, 0, 0, 0, 0)
> Int. Time:    (0.040909659117460251,)
> 9-ZEROS:      (1, 0, 0, 0, 0, 0, 0, 0, 0)
> Title chars:  (3.4800730975980617e-312,)
> Title:        (2, 'R', 'E', 'M', 'A', 'R', 'K', 'S', ' ', 'F', 'I', 'L', 'E',
> 'N', 'A', 'M', 'E', '=', 'm', 'e', 'm', 'b', '3', '0', '_', 'r', 'u', 'n',
> '4', '0', '0', '-', 'b', 'd', 'v', 'm', '-', 'r', 'n', 'a', '_', 'n', 'p',
> 'g', 't', '.', 'd', 'c', 'd', ' ', 'C', 'R', 'E', 'A', 'T', 'E', 'D', ' ',
> 'B', 'Y', ' ', 'N', 'A', 'M', 'D', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', '
> ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', 'R', 'E', 'M', 'A', 'R', 'K', 'S', ' ',
> 'D', 'A', 'T', 'E', ':', ' ', '0', '7', '/', '0', '5', '/', '1', '0', ' ',
> 'C', 'R', 'E', 'A', 'T', 'E', 'D', ' ', 'B', 'Y', ' ', 'U', 'S', 'E', 'R',
> ':', ' ', 'b', 'l', 'u', 'b', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', '
> ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ',
> ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', '\xa4', '\x00', '\x00',
> '\x00', '\x04', '\x00', '\x00', '\x00', '\xa9', '\xfb', '\x01', '\x00',
> '\x04', '\x00', '\x00', '\x00', '0', '\x00', '\x00', '\x00', '\xfe', '{', '<',
> '\xa1', 'E', 'N', 'W', '@', '\x00', '\x00', '\x00', '\x00', '\x00', '\x00',
> '\x00', '\x00', '\xfe', '{', '<', '\xa1')
> Atoms:        (1079463493,)
>
> Last Step:    2000500
> Num Steps:    500000
>
>
>
> --
> Bjoern Olausson
> Martin-Luther-Universität Halle-Wittenberg
> Fachbereich Biochemie/Biotechnologie
> Kurt-Mothes-Str. 3
> 06120 Halle/Saale
>
> Phone: +49-345-55-24942
>

-- 
Dr. Axel Kohlmeyer
akohlmey_at_gmail.com  http://goo.gl/1wk0
Institute for Computational Molecular Science
Temple University, Philadelphia PA, USA.

This archive was generated by hypermail 2.1.6 : Mon Dec 31 2012 - 23:19:47 CST