Re: file writing stopped

From: francesco oteri (francesco.oteri_at_gmail.com)
Date: Mon Apr 19 2010 - 10:47:30 CDT

Using strace -p namd_process_id, I obtained this result:

".......................................................
poll([{fd=9, events=POLLIN}, {fd=12, events=POLLIN}, {fd=7, events=POLLIN},
{fd=6, events=POLLIN}], 4, 0) = 0 (Timeout)
poll([{fd=9, events=POLLIN}, {fd=12, events=POLLIN}, {fd=7, events=POLLIN},
{fd=6, events=POLLIN}], 4, 0) = 0 (Timeout)
poll([{fd=9, events=POLLIN}, {fd=12, events=POLLIN}, {fd=7, events=POLLIN},
{fd=6, events=POLLIN}, {fd=6, events=POLLOUT}], 5, 0) = 1 ([{fd=6,
revents=POLLOUT}])
poll([{fd=9, events=POLLIN}, {fd=12, events=POLLIN}, {fd=7, events=POLLIN},
{fd=6, events=POLLIN}], 4, 0) = 0 (Timeout)
poll([{fd=9, events=POLLIN}, {fd=12, events=POLLIN}, {fd=7, events=POLLIN},
{fd=6, events=POLLIN}], 4, 0) = 0 (Timeout)
poll([{fd=9, events=POLLIN}, {fd=12, events=POLLIN}, {fd=7, events=POLLIN},
{fd=6, events=POLLIN}], 4, 0) = 0 (Timeout)
poll([{fd=9, events=POLLIN}, {fd=12, events=POLLIN}, {fd=7, events=POLLIN},
{fd=6, events=POLLIN}], 4, 0) = 0 (Timeout)
poll([{fd=9, events=POLLIN}, {fd=12, events=POLLIN}, {fd=7, events=POLLIN},
{fd=6, events=POLLIN}], 4, 0) = 0 (Timeout)
poll([{fd=9, events=POLLIN}, {fd=12, events=POLLIN}, {fd=7, events=POLLIN},
{fd=6, events=POLLIN}], 4, 0) = 0 (Timeout)
poll([{fd=9, events=POLLIN}, {fd=12, events=POLLIN}, {fd=7, events=POLLIN},
{fd=6, events=POLLIN}], 4, 0) = 0 (Timeout)
poll([{fd=9, events=POLLIN}, {fd=12, events=POLLIN}, {fd=7, events=POLLIN},
{fd=6, events=POLLIN}], 4, 0) = 0 (Timeout)
poll([{fd=9, events=POLLIN}, {fd=12, events=POLLIN}, {fd=7, events=POLLIN},
{fd=6, events=POLLIN}], 4, 0) = 0 (Timeout)
poll([{fd=9, events=POLLIN}, {fd=12, events=POLLIN}, {fd=7, events=POLLIN},
{fd=6, events=POLLIN}], 4, 0) = 0 (Timeout)
poll([{fd=9, events=POLLIN}, {fd=12, events=POLLIN}, {fd=7, events=POLLIN},
{fd=6, events=POLLIN}], 4, 0) = 0 (Timeout)
poll([{fd=9, events=POLLIN}, {fd=12, events=POLLIN}, {fd=7, events=POLLIN},
{fd=6, events=POLLIN}], 4, 0) = 0 (Timeout)
poll([{fd=9, events=POLLIN}, {fd=12, events=POLLIN}, {fd=7, events=POLLIN},
{fd=6, events=POLLIN}], 4, 0) = 0 (Timeout)
poll([{fd=9, events=POLLIN}, {fd=12, events=POLLIN}, {fd=7, events=POLLIN},
{fd=6, events=POLLIN}], 4, 0) = 0 (Timeout)
select(10, [9], NULL, NULL, {0, 0}) = 0 (Timeout)
select(13, [12], NULL, NULL, {0, 0}) = 0 (Timeout)
poll([{fd=9, events=POLLIN}, {fd=12, events=POLLIN}, {fd=7, events=POLLIN},
{fd=6, events=POLLIN}], 4, 0) = 0 (Timeout)
poll([{fd=9, events=POLLIN}, {fd=12, events=POLLIN}, {fd=7, events=POLLIN},
{fd=6, events=POLLIN}], 4, 0) = 0 (Timeout)
poll([{fd=9, events=POLLIN}, {fd=12, events=POLLIN}, {fd=7, events=POLLIN},
{fd=6, events=POLLIN}], 4, 0) = 0 (Timeout)
poll([{fd=9, events=POLLIN}, {fd=12, events=POLLIN}, {fd=7, events=POLLIN},
{fd=6, events=POLLIN}], 4, 0) = 0 (Timeout)
poll([{fd=9, events=POLLIN}, {fd=12, events=POLLIN}, {fd=7, events=POLLIN},
{fd=6, events=POLLIN}], 4, 0) = 0 (Timeout)
poll([{fd=9, events=POLLIN}, {fd=12, events=POLLIN}, {fd=7, events=POLLIN},
{fd=6, events=POLLIN}], 4, 0) = 0 (Timeout)
poll([{fd=9, events=POLLIN}, {fd=12, events=POLLIN}, {fd=7, events=POLLIN},
{fd=6, events=POLLIN}], 4, 0) = 0 (Timeout)
poll([{fd=9, events=POLLIN}, {fd=12, events=POLLIN}, {fd=7, events=POLLIN},
{fd=6, events=POLLIN}], 4, 0) = 0 (Timeout)
poll([{fd=9, events=POLLIN}, {fd=12, events=POLLIN}, {fd=7, events=POLLIN},
{fd=6, events=POLLIN}], 4, 0) = 0 (Timeout)
poll([{fd=9, events=POLLIN}, {fd=12, events=POLLIN}, {fd=7, events=POLLIN},
{fd=6, events=POLLIN}], 4, 0) = 0 (Timeout)
........................................................................"

Is it possible that, for some reason, NAMD remains trapped in the polling
loop? Is there any way to find the point in the source code where this
happens?
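
As a first step, it may help to know what the descriptors in the poll() calls
above (6, 7, 9 and 12) actually refer to. Below is a minimal sketch in Python,
assuming a Linux /proc filesystem; describe_fds is a hypothetical helper name
used only for illustration, not part of NAMD or strace.

```python
import os


def describe_fds(pid, fds, proc_root="/proc"):
    """Map each descriptor number to the target of /proc/<pid>/fd/<n>.

    Assumes a Linux-style /proc layout. proc_root is a parameter only so
    the function can be pointed at another directory for testing.
    """
    result = {}
    for fd in fds:
        link = os.path.join(proc_root, str(pid), "fd", str(fd))
        try:
            # The symlink target is e.g. a file path, "socket:[inode]"
            # or "pipe:[inode]".
            result[fd] = os.readlink(link)
        except OSError as exc:
            # Descriptor closed, process gone, or permission denied.
            result[fd] = "unavailable ({})".format(exc.strerror)
    return result
```

Called as describe_fds(pid, [6, 7, 9, 12]) with the NAMD process ID, this
should indicate whether each descriptor is a socket, a pipe, a device or a
regular file. It does not locate the loop in the source code; for that,
attaching a debugger to the running process (for example gdb -p with the
process ID, then bt) would presumably be needed.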

2010/4/14 francesco oteri <francesco.oteri_at_gmail.com>

> Dear NAMD users,
> I'm running 2 simulations of 10 ns on 2 different Intel quad-core PCs, each
> with Ubuntu 9.10 and 4 GB RAM, using the binary CUDA version (CVS 2010-03-26)
> with 2 CPUs on a "GeForce GTS 250". The folder containing NAMD is hosted on
> a server, and the 2 PCs access it through NFS.
>
> I'm experiencing a strange behaviour: although NAMD seems to be running (as
> shown by the top command), after a random number of steps no data are
> written to disk.
>
> I started the simulations 10 days ago and both are still running
> (though they should finish tonight), but the first showed the problem
> after 1 day (around 900000 steps), while the second showed the problem this
> morning after 4858000 steps.
>
> I observed that Xorg uses 100% of a CPU, but killing it doesn't solve the
> problem; the first simulation shows the problem more often than the
> second.
>
>

-- 
Kind regards, Dr. Oteri Francesco
