From: John Stone (johns_at_ks.uiuc.edu)
Date: Wed Jun 17 2009 - 11:33:20 CDT

Rob,
  With Fortran codes you have three problems to worry about with
these kinds of files:
  1) endianness (determined by host machine and/or OS in the
                 rare case of bi-endian chips)
  2) integer word size (determined by compiler, compiler flags, source code)
  3) record marker word size (determined by compiler, compiler flags)

Endianness (1) and record marker word size (3) can usually be determined by
doing a variety of tests on the first record marker you find. An example
of this is the VMD DCD plugin. Because we know the expected size of parts
of the DCD header, we can predict what the record markers should contain,
and by doing a series of tests on them, we can figure out whether the
DCD file was written in big- or little-endian mode, and with further work
we can ascertain the other word sizes (well, at least to the point that
we bother to support some of the very unusual combinations).
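
As a rough illustration of that kind of test (a minimal sketch, not the
actual VMD plugin code), the following Python tries each combination of
byte order and marker width against the start of a DCD file. The
function name detect_record_marker and its parameters are made up for
this example; the only facts taken from the format are that the first
record is 84 bytes long and begins with 'CORD'.

```python
import struct

def detect_record_marker(header: bytes, expected_len: int = 84,
                         magic: bytes = b"CORD"):
    """Guess (byte_order, marker_size) from the first bytes of a file.

    Hypothetical helper: tries 8-byte and 4-byte markers in both byte
    orders and accepts the first one whose value equals expected_len
    and which is immediately followed by the magic string.
    """
    for size, code in ((8, "q"), (4, "i")):
        if len(header) < size + len(magic):
            continue  # not enough bytes to test this marker width
        for order in ("<", ">"):
            (value,) = struct.unpack_from(order + code, header, 0)
            if value == expected_len and header[size:size + len(magic)] == magic:
                return ("little" if order == "<" else "big", size)
    return None  # no supported combination matched
```

Reading the first 12 bytes of the file and passing them in is enough
for all four combinations; a None result means the file is either not
a DCD file or uses a layout this sketch does not cover.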

So long as there's a portion of the file format that's predictable
and fixed, you can likely use a similar strategy, and I would expect it
to work fine on a wide variety of machines. If you want to see the
kind of hoop-jumping that's involved, you can delve into the VMD
DCD plugin to see how it is being done there.
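
For a format without a magic string, one possible variation (again only
a sketch under stated assumptions) is to check that the marker after
the first record repeats the same length. This assumes the writer used
Fortran sequential unformatted I/O with a length marker both before and
after each record, as common compilers do, and that you know the length
of the first record in advance. The name probe_first_record and the
CANDIDATES table are hypothetical.

```python
import struct

# (struct byte-order prefix, integer code, marker size in bytes)
CANDIDATES = [("<", "q", 8), (">", "q", 8), ("<", "i", 4), (">", "i", 4)]

def probe_first_record(data: bytes, expected_len: int):
    """Return (byte_order, marker_size, payload) or None.

    Accepts the first layout whose leading and trailing record markers
    both equal expected_len. Assumes length markers surround the record.
    """
    for order, code, size in CANDIDATES:
        end = size + expected_len          # offset of the trailing marker
        if len(data) < end + size:
            continue                       # buffer too short for this layout
        (lead,) = struct.unpack_from(order + code, data, 0)
        (trail,) = struct.unpack_from(order + code, data, end)
        if lead == expected_len and trail == expected_len:
            name = "little" if order == "<" else "big"
            return (name, size, data[size:end])
    return None
```

Checking both markers makes a false match much less likely than
testing the leading marker alone, at the cost of having to read the
whole first record before deciding.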

Cheers,
  John Stone
  vmd_at_ks.uiuc.edu

On Wed, Jun 17, 2009 at 08:59:23AM -0700, Rob wrote:
>
> Axel Kohlmeyer wrote:
> >
> > in .dcd files the first record is exactly 84 bytes and the first entry
> > is a string 'CORD', so the first 4 or 8 bytes should be the integer
> > number 84 followed by 'C' 'O' 'R' 'D'. if you look at a binary file
> > in a hex viewer, you can check it out
>
> I compile ABINIT on my Fedora 10 and 11 Linux system with
> GCC 4.3.2.
>
> The ABINIT binary data files appear with an undocumented tag at
> the start of the file; this tag depends on the endianness and the
> record length.
> With a record length of 4 (or 8), it has a 4- (or 8-) byte tag, as follows:
>
> Little-endian / 4 : 0e 00 00 00
> Little-endian / 8 : 0e 00 00 00 00 00 00 00
>
> Big-endian / 4 : 00 00 00 0e
> Big-endian / 8 : 00 00 00 00 00 00 00 0e
>
> I can use these tags to set the endianness and record length of
> the binary data file, and then read the binary data accordingly.
>
> However, so far I have only generated binary data files on my
> Linux system. I have no idea how generic these tags are.
> How machine-dependent would my code be if I rely on these
> tags? What would you guess?
> I myself don't yet have other architectures to test it on.
>
> Rob.
>

-- 
NIH Resource for Macromolecular Modeling and Bioinformatics
Beckman Institute for Advanced Science and Technology
University of Illinois, 405 N. Mathews Ave, Urbana, IL 61801
Email: johns_at_ks.uiuc.edu                 Phone: 217-244-3349
  WWW: http://www.ks.uiuc.edu/~johns/      Fax: 217-244-6078