RE: ATTENTION: 0031-408 4 tasks allocated by LoadLeveler, continuing

From: Bennion, Brian (Bennion1_at_llnl.gov)
Date: Tue Jan 10 2012 - 14:18:06 CST

Hello,
More email copies of the same error are only food for the trolls.

You should check the compatibility of the following commands in your script:
#@ total_tasks = 4

And poe arguments:
-nodes 16 -tasks_per_node 8

To me it looks like you are requesting more tasks in the poe command (16 nodes x 8 tasks per node = 128) than you are asking for in the resource manager directives (total_tasks = 4).
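A quick way to see the mismatch is to multiply out the poe arguments and compare the result with total_tasks. The snippet below is only a sketch: the three values are copied by hand from your submission file, and the variable names (total_tasks, nodes, tasks_per_node, poe_tasks, result) are my own, not anything LoadLeveler or poe defines.

```shell
#!/bin/sh
# Values copied by hand from the submission file in the quoted message.
total_tasks=4        # from "# @ total_tasks = 4"
nodes=16             # from "-nodes 16"
tasks_per_node=8     # from "-tasks_per_node 8"

# Number of tasks implied by the poe arguments.
poe_tasks=$((nodes * tasks_per_node))

# Compare against what the LoadLeveler directive asks for.
if [ "$poe_tasks" -ne "$total_tasks" ]; then
    result="mismatch: poe implies $poe_tasks tasks, total_tasks is $total_tasks"
else
    result="consistent: $total_tasks tasks"
fi
echo "$result"
```

With the numbers from your script this reports 128 tasks on the poe side against 4 in the directive, so one of the two needs to change.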

Brian

-----Original Message-----
From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] On Behalf Of Gurunath Katagi
Sent: Tuesday, January 10, 2012 12:03 PM
To: namd-l_at_ks.uiuc.edu
Subject: namd-l: ATTENTION: 0031-408 4 tasks allocated by LoadLeveler, continuing

Dear all,
I am trying to run a simulation of a solvated protein using NAMD 2.8 on an IBM cluster.
The job starts and then terminates immediately. I have pasted the last part of the .log file, up to the point it reached:

Info: ABSOLUTE IMPRECISION IN VDWB TABLE FORCE: 3.10193e-25 AT 9.94673
Info: RELATIVE IMPRECISION IN VDWB TABLE FORCE: 1.07087e-15 AT 9.94673
Info: Startup phase 8 took 0.610009 s, 183.422 MB of memory in use
Info: Startup phase 9 took 0.000552893 s, 187.547 MB of memory in use
Info: Finished startup at 9.32463 s, 187.547 MB of memory in use

and in the error file I am getting this error:

ATTENTION: 0031-408 4 tasks allocated by LoadLeveler, continuing...
------------- Processor 2 Exiting: Caught Signal ------------
------------- Processor 3 Exiting: Caught Signal ------------
Signal: 4
Signal: 4
ERROR: 0031-250 task 0: Terminated
ERROR: 0031-250 task 2: Terminated
ERROR: 0031-250 task 3: Terminated
ERROR: 0031-250 task 1: Terminated

The machine configuration is as follows:
$uname -a
Linux cnode39 2.6.5-7.244-pseries64 #1 SMP Mon Dec 12 18:32:25 UTC 2005 ppc64 ppc64 ppc64 GNU/Linux

and the submission file is as follows:
#!/bin/sh
# @ error = job1.$(Host).$(Cluster).$(Process).err
# @ output = job1.$(Host).$(Cluster).$(Process).out
# @ class = ptask64
# @ job_type = parallel
# @ total_tasks = 4
# @ blocking = unlimited
# @ wall_clock_limit=01:00:00
# @ queue
/usr/bin/poe /home/staff/sec/secdpal/gurunath/NAMD_2.8_Source/Linux-POWER-xlC/namd2 'md1.conf' -nodes 16 -tasks_per_node 8

I do not understand why this error occurs (a numerical error, the installation, or something else) or how to go about fixing it. Can anybody please look into this and let me know?

Thank you
