Re: Line minimizer failure because of IMPR?

From: Norman Geist (norman.geist_at_uni-greifswald.de)
Date: Thu Apr 05 2012 - 01:08:31 CDT

Hi,

It's possible that there is a bug in the new GPU implementation of minimization. However, I have seen this error before on a broken GPU. Just to be sure: does the error occur on different GPUs, and only with 2.9b2? I don't think a wrong IMPR setting is the cause, because IMHO that would make the simulation unstable, and it would simply crash with a message like "Simulation has become unstable". This message instead reports a missed answer from a GPU, which could indicate a hardware issue, or it may just have been happenstance. Is the error reproducible?
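One way to check the IMPR suspicion against the log itself is to pull the IMPRP and VDW columns out of the ENERGY records and to look at which FATAL ERROR was actually printed. The following is only a minimal sketch in Python, not part of NAMD: the helper names parse_namd_log and gpu_timeout are made up for illustration, and it assumes each ETITLE/ENERGY record sits on a single line as in an unwrapped log file (in the quoted message the mail client has wrapped them). The sample values are the two ENERGY records and the error line from the quoted log.

```python
# Sketch only: parse_namd_log and gpu_timeout are hypothetical helper names.
# Assumption: ETITLE/ENERGY records are on single lines (unwrapped log file).

SAMPLE_LOG = [
    "ETITLE: TS BOND ANGLE DIHED IMPRP ELECT VDW BOUNDARY MISC KINETIC "
    "TOTAL TEMP POTENTIAL TOTAL3 TEMPAVG PRESSURE GPRESSURE VOLUME "
    "PRESSAVG GPRESSAVG",
    "ENERGY: 0 131516.3834 15951.4182 1089.1031 80.7094 -208823.8452 "
    "4021786.0851 0.0000 0.0000 0.0000 3961599.8541 0.0000 3961599.8541 "
    "3961599.8541 0.0000 1373635.0459 1388330.3068 672033.8185 "
    "1373635.0459 1388330.3068",
    "ENERGY: 608 121296.7924 15556.9336 1102.0259 130.0208 -214468.6419 "
    "18560.8579 0.0000 0.0000 0.0000 -57822.0113 0.0000 -57822.0113 "
    "-57822.0113 0.0000 -14967.9037 -1012.7549 672033.8185 -14967.9037 "
    "-1012.7549",
    "FATAL ERROR: cuda_check_remote_progress polled 1000000 times over "
    "101.723663 s on step 609",
]


def parse_namd_log(lines):
    """Return (energy records, fatal error lines) found in the log lines."""
    titles = None
    records = []
    fatal = []
    for line in lines:
        if line.startswith("ETITLE:"):
            # Column names for the ENERGY records that follow.
            titles = line.split()[1:]
        elif line.startswith("ENERGY:") and titles:
            values = [float(v) for v in line.split()[1:]]
            records.append(dict(zip(titles, values)))
        elif line.startswith("FATAL ERROR:"):
            fatal.append(line.strip())
    return records, fatal


def gpu_timeout(fatal):
    """True if any fatal error is the missed-GPU-answer message from the post."""
    return any("cuda_check_remote_progress" in msg for msg in fatal)


if __name__ == "__main__":
    records, fatal = parse_namd_log(SAMPLE_LOG)
    for rec in records:
        print(int(rec["TS"]), "IMPRP", rec["IMPRP"], "VDW", rec["VDW"])
    print("GPU timeout reported:", gpu_timeout(fatal))
```

For a real run, the lines of the .log file would be passed in instead of SAMPLE_LOG. In the quoted log the IMPRP term stays small next to VDW at both steps, and the fatal error is the GPU polling message rather than an instability message.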

Best wishes
Norman Geist.

> -----Original Message-----
> From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] On
> Behalf Of Francesco Pietra
> Sent: Wednesday, April 4, 2012 17:02
> To: NAMD
> Subject: namd-l: Line minimizer failure because of IMPR?
>
> Hi:
> with cuda v 2.9b2 and 27 FF, I was trying to minimize a new system
> comprising a new transition metal cluster. Minimization failed as
> indicated by the start/end of .log file and gradient trend:
>
> .log file:
>
> ETITLE: TS BOND ANGLE DIHED
> IMPRP ELECT VDW BOUNDARY MISC
> KINETIC TOTAL TEMP POTENTIAL
> TOTAL3 TEMPAVG PRESSURE GPRESSURE
> VOLUME PRESSAVG GPRESSAVG
>
> ENERGY: 0 131516.3834 15951.4182 1089.1031
> 80.7094 -208823.8452 4021786.0851 0.0000
> 0.0000 0.0000 3961599.8541 0.0000
> 3961599.8541 3961599.8541 0.0000 1373635.0459
> 1388330.3068 672033.8185 1373635.0459 1388330.3068
> ..................................................................
>
> LINE MINIMIZER BRACKET: DX 1.62748e-48 1.80832e-43 DU -2.26438e-05
> 8.74043e-06 DUDX 521930 521930 521930
> ENERGY: 608 121296.7924 15556.9336 1102.0259
> 130.0208 -214468.6419 18560.8579 0.0000
> 0.0000 0.0000 -57822.0113 0.0000
> -57822.0113 -57822.0113 0.0000 -14967.9037
> -1012.7549 672033.8185 -14967.9037 -1012.7549
>
> LINE MINIMIZER BRACKET: DX 1.62748e-49 1.80832e-43 DU -1.45298e-05
> 8.74043e-06 DUDX 521930 521930 521930
> LINE MINIMIZER REDUCING GRADIENT FROM 4.44643e+08 TO 444643
> FATAL ERROR: cuda_check_remote_progress polled 1000000 times over
> 101.723663 s on step 609
>
>
>
>
> Gradient:
>
> MINIMIZER STARTING CONJUGATE GRADIENT ALGORITHM
> LINE MINIMIZER REDUCING GRADIENT FROM 9.95266e+08 TO 995266
> ............................
> LINE MINIMIZER REDUCING GRADIENT FROM 4.44643e+08 TO 444643
>
>
> The structure at the end of the crashed minimization does not show
> any major distortion. From the above files, my impression is of a wrong
> setting of IMPR. I would be very grateful if someone could confirm my
> impression or suggest otherwise.
>
> francesco pietra
