Re: Unable to read corrupted binary DCD trajectory file

From: Viswanath Pasumarthi (v.pasumarthi_at_iitg.ernet.in)
Date: Mon Sep 08 2014 - 15:27:12 CDT

> On 09/08/2014 11:38 AM, Viswanath Pasumarthi wrote:
>> I am using 64-bit Matlab R2013b version, and the file system is NTFS,
>> which can handle files as large as 16 TB. I agree it is safe to follow
>> your technique, but I am curious to know the reason for the size limit
> Another thing to check is which *NAMD* version you were running, and
> whether it's a 32-bit build. Since the final file size is exactly (?) 4GB,
> it is likely the damage was done when running the simulation.
I used a 64-bit Windows 7 operating system, the NAMD_2.9_Win32-multicore
build, and an NTFS file system to store the DCD file. The file size is
exactly 4 GB.
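To illustrate why a 32-bit build is a suspect here, the short Python sketch
below shows how a byte offset kept in an unsigned 32-bit integer wraps back
to the start of the file once it passes 2^32 bytes (4 GiB). Whether the
Win32 NAMD binary actually behaves this way is only an assumption, and the
function name wrap_offset_32 is made up for this example.

```python
# Hypothetical illustration: a file offset held in an unsigned 32-bit integer.
LIMIT_32BIT = 2 ** 32  # 4 GiB = 4294967296 bytes


def wrap_offset_32(offset: int) -> int:
    """Return the offset as an unsigned 32-bit variable would store it."""
    return offset % LIMIT_32BIT


if __name__ == "__main__":
    # Offsets just below, at, and just above the 4 GiB boundary.
    for offset in (LIMIT_32BIT - 1, LIMIT_32BIT, LIMIT_32BIT + 100):
        print(offset, "->", wrap_offset_32(offset))
```

If the offset wraps like this, a write intended for position 4 GiB + 100
would land 100 bytes into the file, i.e. inside the header region.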
>> and
>> more importantly, extract the data from this DCD file for now.
> Assuming NAMD simply stopped writing to the disk, catdcd might be worth a
> try, with an optional command-line limit on the number of frames to
> extract. However, there are indications this assumption may not be valid.
> Specifically, it is entirely possible that the beginning of the trajectory
> got overwritten with the data that was meant to go beyond 4GB, which would
> make it a bit difficult to extract useful information from the file.
Your fears have come true. From what I see in the readdcd run, the DCD
header is corrupted, as NAMD began to overwrite the DCD file once the
file size reached the apparent limit of 4 GB (the reason is still a mystery
to me). The important header information (endoffile, N (no. of atoms),
timestep, DELTA, etc.) has been overwritten. I understand it will not be
possible to read the file anymore.
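For anyone wanting to check a DCD header independently of readdcd, here is a
minimal Python sketch. It assumes the commonly described little-endian,
CHARMM-style layout: an 84-byte Fortran record holding the tag "CORD" and 20
control words (with DELTA as a 32-bit float in the tenth word), then a title
record, then a 4-byte record holding N. The function name check_dcd_header
and the returned keys are my own choices, and I have not verified this
against every NAMD build.

```python
import struct


def check_dcd_header(path):
    """Return basic header fields if the DCD header looks intact, else None.

    Assumes a little-endian, CHARMM-style header (an assumption, see above).
    """
    with open(path, "rb") as f:
        try:
            # First record: size marker (84), b"CORD", 20 control words, marker.
            head, tag = struct.unpack("<i4s", f.read(8))
            ctrl_raw = f.read(80)
            ctrl = struct.unpack("<20i", ctrl_raw)
            (tail,) = struct.unpack("<i", f.read(4))
            if head != 84 or tail != 84 or tag != b"CORD":
                return None
            # Title record: size marker, NTITLE, NTITLE lines of 80 chars, marker.
            title_size, ntitle = struct.unpack("<ii", f.read(8))
            if ntitle < 0 or title_size != 4 + 80 * ntitle:
                return None
            f.seek(80 * ntitle + 4, 1)
            # Atom-count record: marker (4), N, marker (4).
            four_a, natoms, four_b = struct.unpack("<iii", f.read(12))
        except struct.error:
            # File is too short to contain a complete header.
            return None
    if four_a != 4 or four_b != 4 or natoms <= 0:
        return None
    # Tenth control word reinterpreted as a 32-bit float (DELTA).
    (delta,) = struct.unpack("<f", ctrl_raw[36:40])
    return {"nset": ctrl[0], "istart": ctrl[1], "nsavc": ctrl[2],
            "delta": delta, "natoms": natoms}
```

A return value of None means only that the start of the file does not match
the assumed layout; it does not say what overwrote it.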
> Lessons learned (purely in my opinion):
> - There are reasons why good practices dictate splitting up long MD runs
> into shorter legs (incidentally, a big part of the reason for the
> existence of catdcd and the "velocities" keyword in NAMD is to make this
> easier to do). There are many ways trajectory files can get lost or
> corrupted during the course of a simulation...
Thank you, I will follow this practice henceforth.
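To plan the shorter legs, the rough Python sketch below estimates how many
frames fit under 4 GiB. It assumes each frame stores X, Y and Z as separate
Fortran records of N 32-bit floats (8 bytes of markers each), plus an
optional 56-byte unit-cell record, and a header of about 276 bytes. These
sizes are assumptions about the DCD layout, the function names are made up,
and the 100000-atom system is only an example.

```python
LIMIT_4GIB = 2 ** 32  # bytes


def dcd_frame_bytes(natoms: int, with_cell: bool = True) -> int:
    """Estimated size of one DCD frame in bytes (assumed layout, see above)."""
    coords = 3 * (4 * natoms + 8)      # X, Y, Z records with size markers
    cell = 56 if with_cell else 0      # 6 doubles plus two 4-byte markers
    return coords + cell


def max_frames_below(limit: int, natoms: int, with_cell: bool = True,
                     header_bytes: int = 276) -> int:
    """Largest whole number of frames that keeps the file under `limit` bytes."""
    return max(0, (limit - header_bytes) // dcd_frame_bytes(natoms, with_cell))


if __name__ == "__main__":
    natoms = 100000  # hypothetical system size
    print("bytes per frame:", dcd_frame_bytes(natoms))
    print("frames below 4 GiB:", max_frames_below(LIMIT_4GIB, natoms))
```

Dividing that frame count by the DCD output frequency gives an upper bound
on the number of steps per leg; staying well below it leaves a safety margin.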

> - Windows is not an operating system for serious computational science.
> For starters, a lot of important software in the field is not natively
> supported, and one needs to install compatibility layers like Cygwin to
> access it as well as to tap the powers of the UNIX shell, which are
> almost indispensable in this line of work. And as shown by this incident,
> even when you do come across an application that is natively supported
> (such as NAMD), it might not be as thoroughly tested under all possible
> usage scenarios as its Linux counterpart. Indeed, it is not impossible
> that you've hit an issue in the Windows binaries, and given the
> aforementioned "best practice", it seems entirely possible that nobody
> tried this before.
