From: John Stone (johns_at_ks.uiuc.edu)
Date: Sat Jan 03 2004 - 21:35:03 CST

Bogdan,
  Excellent points. A quick run of memtest86 on the affected machine
would help verify that the hardware is reliable. I just had to replace
a motherboard and CPU in a machine that recently developed problems with
the memory system. (I couldn't tell which of the two was at fault, since
I didn't have a spare of either, but it was not the memory itself in
this particular case...) The NFS/network problem is also a strong
possibility, though it would be harder to diagnose than a hardware fault.
There are a number of hardware diagnostic tools out there, and they are
definitely worth running on machines that one intends to use for
long-running simulations.
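
Bogdan's check quoted below (is the size of the bad region a power of
two, and is its content constant?) could be sketched in Python roughly
as follows. This is only a sketch: the helper names are made up for
illustration, and it assumes the garbage shows up as one long run of a
single repeated byte, which may well not be the case here.

```python
def is_power_of_two(n):
    """True if n is a positive power of two (1, 2, 4, 8, ...)."""
    return n > 0 and (n & (n - 1)) == 0


def longest_constant_run(data):
    """Return (start, length) of the longest run of one repeated byte.

    Assumption: the corrupted region is filled with a single byte value.
    Returns (0, 0) for empty input.
    """
    best_start, best_len = 0, 0
    run_start = 0
    for i in range(1, len(data) + 1):
        # A run ends at end-of-data or when the byte value changes.
        if i == len(data) or data[i] != data[run_start]:
            if i - run_start > best_len:
                best_start, best_len = run_start, i - run_start
            run_start = i
    return best_start, best_len


def describe_region(data, start, end):
    """Summarize data[start:end]: size, power-of-two size, constant content."""
    region = data[start:end]
    size = len(region)
    return {
        "size": size,
        "power_of_two": is_power_of_two(size),
        "constant": size > 0 and len(set(region)) == 1,
    }
```

To try it, read the output file in binary mode (open(path, "rb").read())
and pass the bytes to longest_constant_run(), or pass known offsets of
the bad part to describe_region(). A block size such as 4096 or 8192
with constant content would point toward a page- or buffer-sized write
going wrong rather than CHARMM itself.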

  John Stone
  vmd_at_ks.uiuc.edu

On Sat, Jan 03, 2004 at 05:07:49PM +0100, Bogdan Costescu wrote:
>
> On Fri, 2 Jan 2004, Mauricio C. Tripp wrote:
>
> > I'm sorry, something definitely went wrong with CHARMM: in the output
> > file it skipped 69 output steps and there's garbage in their place (as
> > if the file were binary), and this corresponds to where VMD reports
> > the format error.
>
> I would rather suspect a hardware problem (e.g. memory or disk failure)
> or a low-level network problem (if the files were written to an
> NFS-mounted directory).
>
> > After that everything seems fine, so it is as if CHARMM did something
> > wrong while writing to files for a while and then came back to normal...
>
> ... and this confirms what I wrote above. Could you check whether the
> size of the "bad" trajectory part is a power of two? Does it have
> constant content?
>
> > Have you heard of anything like that before?
>
> I did experience such corruption with some older (2.2) Linux kernels that
> had some NFS problems. People on the Linux NFS development list have
> reported silent data corruption when using NFS in combination with some
> Intel network cards; such reports have not appeared for a while, so it
> might have been an issue that could be fixed in software (i.e. the
> driver).
>
> --
> Bogdan Costescu
>
> IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
> Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
> Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
> E-mail: Bogdan.Costescu_at_IWR.Uni-Heidelberg.De

-- 
NIH Resource for Macromolecular Modeling and Bioinformatics
Beckman Institute for Advanced Science and Technology
University of Illinois, 405 N. Mathews Ave, Urbana, IL 61801
Email: johns_at_ks.uiuc.edu                 Phone: 217-244-3349              
  WWW: http://www.ks.uiuc.edu/~johns/      Fax: 217-244-6078