Balancer and singularities ...

From: Nicholas M Glykos (glykos_at_mbg.duth.gr)
Date: Thu Jun 10 2010 - 10:52:05 CDT

Dear All,

We have a situation that looks and feels like a hardware problem, but I
thought I would ask just in case anyone has seen it before. The problem is
the following:

At step (8 x LDBperiod) (when the tracing stops?), the balancer bombs out
with something like this (portions of the log shown):

..

ENERGY: 60320 54.5252 157.6152 83.2358 ...

LDB: ============= START OF LOAD BALANCING ============== 2259.16
LB: Singular Matrix
LB: Singular Matrix
...
LB: Model for object 10239 found
LB: New model completely constructed
LDB: TIME 2259.75 LOAD: AVG 4.87852 MAX 5.20117 PROXIES: TOTAL 828 MAXPE 53 MAXPATCH 4 None 1.25924
LDB: TIME 2259.79 LOAD: AVG 4.87852 MAX 5.11777 PROXIES: TOTAL 828 MAXPE 53 MAXPATCH 4 RefineTorusLB 1.25924
LDB: ============== END OF LOAD BALANCING =============== 2259.79
ENERGY: 64320 60.0195 155.8684 75.8565 ...

LDB: ============= START OF LOAD BALANCING ============== 2504.88
Error in estimation:
object 0: real time=0.000000, model error=0.000689, default error=0.000000
object 1: real time=0.000000, model error=-0.020384, default error=0.000000

..

and the simulation comes to a halt (but doesn't die). Switching off the
balancer (with 'ldBalancer none') bypasses the problem.
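
To pin down where in a long log this happens, a minimal Python sketch along
the following lines could help. It assumes the log lines look like the
excerpt above (ENERGY lines carrying the step number, and LDB START/END
banners around each balancing pass). The function name scan_log and the
record fields are hypothetical, chosen only for illustration.

```python
import re

# Patterns are based only on the log excerpt quoted in this message.
ENERGY_RE = re.compile(r"^ENERGY:\s+(\d+)")
LDB_START_RE = re.compile(r"^LDB: =+ START OF LOAD BALANCING =+")
LDB_END_RE = re.compile(r"^LDB: =+ END OF LOAD BALANCING =+")


def scan_log(lines):
    """Return one record per load-balancing pass found in the log lines."""
    events = []
    last_step = None   # step number of the most recent ENERGY line
    current = None     # record for the pass currently being read, if any
    for raw in lines:
        line = raw.strip()
        match = ENERGY_RE.match(line)
        if match:
            last_step = int(match.group(1))
            continue
        if LDB_START_RE.match(line):
            current = {
                "last_energy_step": last_step,
                "singular": 0,              # count of 'LB: Singular Matrix'
                "estimation_error": False,  # saw 'Error in estimation'
                "completed": False,         # saw the END banner
            }
            events.append(current)
        elif current is not None:
            if line.startswith("LB: Singular Matrix"):
                current["singular"] += 1
            elif line.startswith("Error in estimation"):
                current["estimation_error"] = True
            elif LDB_END_RE.match(line):
                current["completed"] = True
                current = None
    return events


# Lines copied from the excerpt above (elided parts left out).
SAMPLE = [
    "ENERGY: 60320 54.5252 157.6152 83.2358",
    "LDB: ============= START OF LOAD BALANCING ============== 2259.16",
    "LB: Singular Matrix",
    "LB: Singular Matrix",
    "LB: Model for object 10239 found",
    "LDB: ============== END OF LOAD BALANCING =============== 2259.79",
    "ENERGY: 64320 60.0195 155.8684 75.8565",
    "LDB: ============= START OF LOAD BALANCING ============== 2504.88",
    "Error in estimation:",
    "object 0: real time=0.000000, model error=0.000689, default error=0.000000",
]

EVENTS = scan_log(SAMPLE)
```

A pass with "completed" still False and "estimation_error" True marks the
point where the run stalls. For a real log, pass an open file object to
scan_log in place of SAMPLE.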

Any previous experience with this type of error message would be
gratefully received.

Regards,
Nicholas

ps. The reason I think that it is hardware (or, god forbid, filesystem
corruption) is that it is not reproducible. The same scripts and
executables show different behaviour on the 9th of June (nothing works)
and on the 10th (everything works). The obvious solution, i.e. to work only
on the 10th, is not viable ;-) On the other hand, if it is hardware, why
does it only show up at (8 x LDBperiod) and only through the balancer?
Confusing .. (it must be hardware).

ps[2]. Please CC any answers directly to me; the list's majordomo
sometimes delivers messages with significant delay.

-- 
          Dr Nicholas M. Glykos, Department of Molecular Biology
     and Genetics, Democritus University of Thrace, University Campus,
  Dragana, 68100 Alexandroupolis, Greece, Tel/Fax (office) +302551030620,
    Ext.77620, Tel (lab) +302551030615, http://utopia.duth.gr/~glykos/
