From: Jérôme Hénin (jerome.henin_at_ibpc.fr)
Date: Wed Apr 08 2015 - 03:16:22 CDT
What happens if you start the apt-get process with "nice -19"?
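A minimal sketch of that suggestion, for illustration only. The helper name run_low_priority is hypothetical, and "nice -n 19" is the POSIX spelling of the older "nice -19" form. Whether lowering the priority of the cron job actually prevents the stall is untested here.

```shell
# Run any command at the lowest scheduling priority (niceness 19).
# run_low_priority is a hypothetical helper name, not part of NAMD or cron.
run_low_priority() {
    nice -n 19 "$@"
}

# Intended use for the job discussed in this thread (not executed here):
#   run_low_priority apt-get check -f

# Harmless demonstration: with no arguments, nice prints the niceness
# it is running at, so this shows the value the child process received.
run_low_priority nice
```

Child processes inherit the niceness of their parent, so one option (an assumption, not something tried in this thread) would be to put nice in front of the run-parts call in /etc/crontab so that all daily jobs run at low priority.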
On 8 April 2015 at 09:41, Vlastimil Zíma <zima_at_karlov.mff.cuni.cz> wrote:
> The problem has been found:
> Apparently cron caused a short overload, but one long enough to make NAMD get
> stuck. I usually run NAMD on all of the processors, and cron runs, among
> other things, the command "apt-get check -f", which requires so much processor
> capacity that NAMD doesn't get enough and cannot cope with this state.
> Interestingly, it affects the CUDA multicore version; I was unable to
> reproduce the error with the non-CUDA version.
> I worked around the problem by leaving one of the processors unoccupied. This
> is not optimal, since that processor sits unused all day only to leave
> room for the cron run. I suspect there is another solution that would make
> NAMD resilient to a short processor overload.
> 2015-04-01 14:08 GMT+02:00 Vlastimil Zíma <zima_at_karlov.mff.cuni.cz>:
>> In progress:
>> I am very unsure about what causes the problem. Removing the "apt" and
>> "mlocate" jobs seemed to resolve the issue. On the other hand, I tested
>> cron itself by introducing a "sleep 1" cron job, which caused the
>> issue when it was accompanied by the rest of the jobs, but the "sleep" job
>> on its own (with all other cron jobs removed) didn't trigger the stalling.
>> Nor did multiple (8) "sleep 0.2" jobs trigger the problem.
>> I used the following script to trigger the problem (a for loop around what I
>> found in /etc/crontab):
>> for I in $(seq 100); do echo -n "."; test -x /usr/sbin/anacron ||
>> ( cd / && run-parts --report /etc/cron.daily ); done; echo
>> I should also mention that I run NAMD from screen using the following command:
>> runnamd_cuda 20c_run.namd > 20c_run.output 2> 20c_run.error
>> where runnamd_cuda is the following script:
>> ulimit -c unlimited
>> namd2-mc-cuda +setcpuaffinity +idlepoll +p6 +devices 0 "$@"
>> namd2-mc-cuda is a binary of NAMD compiled with multicore and CUDA support.
>> 2015-04-01 10:04 GMT+02:00 Vlastimil Zíma <zima_at_karlov.mff.cuni.cz>:
>>> Hi everyone,
>>> I'm using NAMD 2.9 on Debian machines equipped with GPUs and I repeatedly
>>> encounter weird behaviour: namd is still running, but no new output is
>>> generated (restart files, dcd, not even the output redirected to a file). I
>>> noticed it usually happens at 6.24 in the morning, which led me to discover
>>> that the daily cron jobs are run at that time.
>>> Here is the list of my system daily crons
>>> I'm not yet sure which one of those causes the NAMD output to stall,
>>> nor what exactly causes the stalling. I was able to reproduce the
>>> problem on two of my machines so far.
>>> Has anybody had a similar issue?