Re: Input/output error

From: Axel Kohlmeyer
Date: Tue May 18 2010 - 04:18:20 CDT

2010/5/18 王棽:
> Dear NAMD users:
> I am running NAMD on the Dawning5000A supercomputer. However, I found my
> NAMD processes vulnerable on such a platform. They usually die with an
> input/output error on the *.restart.coor, *.restart.vel or *.restart.xsc
> files. There is an example of standard output below:

> I contacted the engineers of the supercomputer center, and they found that
> a temporary Lustre terminal connection break and reconnect event occurred
> whenever such an input/output error happened. Such events are observed
> quite often in the communication between compute nodes and OSS nodes.
> Do you have any suggestions on this problem?

call your "super" engineers again and tell them to do their job!

this is definitely a problem of the machine and its configuration.
i find it pretty hilarious that the system managers tell you that
they see this error happen and imply that it is a failure of your
application. NAMD is being used on lustre file systems at
a very large scale (NCSA's abe and lincoln clusters, NICS'
cray xt5 and others) successfully.

and NAMD is not really putting a large strain on the I/O
subsystem. other programs should create much worse issues.


> Cheers.
> Shen.

Dr. Axel Kohlmeyer
Institute for Computational Molecular Science
Temple University, Philadelphia PA, USA.
