From: Axel Kohlmeyer (akohlmey_at_gmail.com)
Date: Fri Feb 05 2010 - 11:14:19 CST
On Fri, Feb 5, 2010 at 11:37 AM, Santanu Chatterjee
<santanu.chatter_at_gmail.com> wrote:
> Hi,
>    I am running a set of a few hundred simulations with NAMD. I am running
> each of these simulations in parallel on 4 processors. These simulations
> are supposed to run for a few weeks.
it is a _very_ bad idea to let simulations run for that long without
restarting (and to expect them to work well).
> I noticed that some of them died without finishing. At the end of the
> screen output file, I got the following error message:
>
> FATAL ERROR: Input/output error
> ------------- Processor 0 Exiting: Called CmiAbort ------------
> Reason: FATAL ERROR: Input/output error
>
> Stack Traceback:
>   [0] CmiAbort+0x4f  [0x7eb405]
>   [1] _Z8NAMD_diePKc+0x62  [0x4b31e2]
>   [2] _Z13write_dcdstepiiPfS_S_Pd+0x650  [0x4b41e0]
the traceback shows it dying in write_dcdstep, i.e. while writing a dcd frame.
if you are not running out of disk space, perhaps you
are running out of quota, trying to write too large
a file, or trying to write to a directory that does not exist.
axel.
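
The causes suggested in the reply above (disk space, file size, missing directory) can be checked before a job is launched. The following is only a minimal sketch, assuming a Unix system with Python 3 available; the function names `estimate_dcd_bytes` and `check_output_dir` and the example numbers are hypothetical and not part of NAMD. The size estimate counts three 4-byte floats per atom per frame and ignores the DCD header and record markers, so it slightly underestimates the real file size.

```python
import os
import resource
import shutil


def estimate_dcd_bytes(n_atoms, n_frames):
    """Rough DCD size: three float32 coordinates per atom per frame.

    Ignores the file header and per-frame record markers, so the real
    file will be slightly larger than this estimate.
    """
    return n_atoms * 3 * 4 * n_frames


def check_output_dir(path, needed_bytes):
    """Return a list of human-readable problems found for an output directory."""
    problems = []
    # A missing directory makes every later check meaningless, so stop here.
    if not os.path.isdir(path):
        problems.append(f"{path}: directory does not exist")
        return problems
    # Creating files needs both write and search permission on the directory.
    if not os.access(path, os.W_OK | os.X_OK):
        problems.append(f"{path}: directory is not writable")
    # Free space as reported by the filesystem holding the directory.
    free = shutil.disk_usage(path).free
    if needed_bytes > free:
        problems.append(f"{path}: need {needed_bytes} bytes, only {free} free")
    # Per-process file size limit (what `ulimit -f` controls in the shell).
    soft_limit, _ = resource.getrlimit(resource.RLIMIT_FSIZE)
    if soft_limit != resource.RLIM_INFINITY and needed_bytes > soft_limit:
        problems.append(
            f"file size limit {soft_limit} bytes is below {needed_bytes}"
        )
    return problems


if __name__ == "__main__":
    # Hypothetical numbers: 50,000 atoms and 10,000 saved frames.
    needed = estimate_dcd_bytes(50_000, 10_000)
    for problem in check_output_dir(".", needed):
        print("WARNING:", problem)
```

This sketch does not check quotas, which are filesystem specific. On network filesystems such as the AFS path visible in the traceback, the reported free space may not reflect a per-user or per-volume quota, so the quota tools of that filesystem would still have to be consulted.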
>   [3] _ZN6Output14output_dcdfileEiiP11FloatVectorPK7Lattice+0x46c
> [0x6b8368]
>   [4] _ZN6Output10coordinateEiiP6VectorP11FloatVectorR7Lattice+0x9b
> [0x6b64e1]
>   [5]
> _ZN24CkIndex_CollectionMaster39_call_receivePositions_CollectVectorMsgEPvP16CollectionMaster+0x18f
> [0x4c44cf]
>   [6] CkDeliverMessageFree+0x21  [0x786a6b]
>   [7] _Z15_processHandlerPvP11CkCoreState+0x4a9  [0x7860c9]
>   [8] CsdScheduleForever+0xa2  [0x7f18a2]
>   [9] CsdScheduler+0x1c  [0x7f14a0]
>   [10] _ZN7BackEnd7suspendEv+0xb  [0x4bab01]
>   [11] _ZN9ScriptTcl7Tcl_runEPvP10Tcl_InterpiPPc+0x122  [0x6fc260]
>   [12] TclInvokeStringCommand+0x91  [0x80cc78]
>   [13] /afs/crc.nd.edu/x86_64_linux/namd/NAMD_2.6_Linux-amd64/namd2
> [0x842ac8]
>   [14] Tcl_EvalEx+0x176  [0x84310b]
>   [15] Tcl_EvalFile+0x134  [0x83ab14]
>   [16] _ZN9ScriptTcl3runEPc+0x14  [0x6fb99e]
>   [17] main+0x21b  [0x4b69c3]
>   [18] __libc_start_main+0xf4  [0x307f01d994]
>   [19] _ZNSt8ios_base4InitD1Ev+0x3a  [0x4b2b5a]
> Fatal error on PE 0> FATAL ERROR: Input/output error
>
> I am trying to understand the source of this error. I am sure that this is
> not a disk space issue. Can someone please help?
>
> Thanks,
> Santanu
>
--
Dr. Axel Kohlmeyer akohlmey_at_gmail.com
Institute for Computational Molecular Science
College of Science and Technology
Temple University, Philadelphia PA, USA.