Re: long running sim dies

From: ransun (rsu007_at_latech.edu)
Date: Mon Oct 13 2014 - 09:12:33 CDT

I got that error report in the output file of my first simulation, which ran for about 30 minutes and then died. However, I started a new one a couple of minutes later, so it overwrote the output file. I can only remember a few words of the message. Sorry.
Anyway, other simulations I ran afterward died at random steps without any error report.
Ran Sun

Sent from my iPhone
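
[Editor's sketch] The lost error message above came from a second run overwriting the first run's output file. One way to avoid that is to launch each run through a small wrapper that always writes to a fresh log file and records the exit status. This is a minimal sketch, not something from the thread: the function names, the log directory, and the example namd2 command line in the comment are all assumptions.

```python
import subprocess
import sys
import time
from pathlib import Path


def unique_log_path(log_dir, stem="run"):
    """Return a log path that does not exist yet, so earlier logs are never overwritten."""
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    candidate = log_dir / f"{stem}-{stamp}.log"
    counter = 1
    # If a log with this timestamp already exists, add a numeric suffix.
    while candidate.exists():
        candidate = log_dir / f"{stem}-{stamp}-{counter}.log"
        counter += 1
    return candidate


def run_logged(command, log_dir, stem="run"):
    """Run command, send stdout and stderr to a fresh log file, and append the exit status."""
    log_path = unique_log_path(log_dir, stem)
    with open(log_path, "w") as log:
        result = subprocess.run(command, stdout=log, stderr=subprocess.STDOUT)
    with open(log_path, "a") as log:
        log.write(f"\n[wrapper] exit status: {result.returncode}\n")
    return log_path, result.returncode


if __name__ == "__main__":
    # Hypothetical usage (command line is an assumption, adjust to your setup):
    #   python run_logged.py namd2 +p32 sim.conf
    path, status = run_logged(sys.argv[1:], "logs", stem="namd")
    print(f"log written to {path}, exit status {status}")
```

On POSIX systems, `subprocess` reports a negative exit status when the child was terminated by a signal, so the appended status line can help distinguish a run that was killed from one that exited on its own, even when the program printed no error.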

> On 20141013, at 8:29, "Thomas C. Bishop" <bishop_at_latech.edu> wrote:
>
> Dear NAMD,
> we have been doing some benchmarks on my local computers lately, and it seems they are dying
> after running successfully for some time.
> ( NAMD 2.9 for Linux-x86_64-multicore, 32-way SMP opteron, 1 node, 1 physical node
> Uname 3.11.10-21-desktop #1 SMP PREEMPT ... the machine has LOTS of ram and typically one user )
>
> Nothing seems pathological w/ the system (all energies, config, etc. are OK).
> There is plenty of disk space for output, etc.
>
> The only error I have is a note from one student:
> "I checked the output file and found that a fatal error occurred; it said that it cannot find balancer (not sure, need to check it again)."
>
>
> Other runs died w/ no error message.
> Could it possibly be that /usr/local/namd2, which is automounted, is disconnecting?
> system logs do not indicate a hardware problem.
>
> I presume that if the namd2 executable is in the same directory as the output, the I/O will keep the automount active.
> Is this true, or does NAMD open/close files between writes?
>
>
> Just trying to figure this one out... guess I need to dust off the sys-admin manual.
>
> Any ideas/comments appreciated.
> Tom
>
>
>
> --
> *******************************
> Thomas C. Bishop
> Tel: 318-257-5209
> Fax: 318-257-3823
> www.latech.edu/~bishop
> ********************************
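
[Editor's sketch] The automount question in the quoted message could be checked empirically by logging, at regular intervals, whether the relevant paths are reachable while a simulation runs. The sketch below is only an illustration: the function names, the log file name, and the polling interval are assumptions, and /usr/local/namd2 is simply the path mentioned in the message. Note that on typical automounter setups, accessing a path may itself trigger a mount, so a "MISSING" entry would suggest the mount could not be re-established at that moment rather than that it had merely expired.

```python
import os
import time


def probe_path(path):
    """Report whether path is currently reachable, without raising on failure."""
    try:
        os.stat(path)
        reachable = True
        error = ""
    except OSError as exc:
        reachable = False
        error = exc.strerror or str(exc)
    return {"path": path, "reachable": reachable, "error": error}


def watch_paths(paths, log_file, interval_s=60.0, checks=1):
    """Append one timestamped line per path per check to log_file."""
    for i in range(checks):
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        with open(log_file, "a") as log:
            for path in paths:
                result = probe_path(path)
                status = "ok" if result["reachable"] else "MISSING " + result["error"]
                log.write(f"{stamp} {path} {status}\n")
        # Sleep between checks, but not after the last one.
        if i + 1 < checks:
            time.sleep(interval_s)


if __name__ == "__main__":
    # Poll once a minute for a day; compare timestamps with the time the run died.
    watch_paths(["/usr/local/namd2"], "mount_watch.log", interval_s=60.0, checks=1440)
```

Comparing the timestamps in this log with the last step written by a failed run would show whether the failures coincide with the path becoming unreachable.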

This archive was generated by hypermail 2.1.6 : Wed Dec 31 2014 - 23:22:55 CST