Re: floating point reproduceability

From: Thomas Brian (thomasbrianxlii_at_gmail.com)
Date: Thu Apr 18 2013 - 23:28:31 CDT

Thanks Norman,
I was thinking along similar lines regarding floating point order of
operations. It would be nice if you could force each patch to be added to
the force totals in the same order. I wonder how much slower it would run.
I don't know the details of how it is programmed, but I would think that
keeping the order of additions the same would produce only a modest increase
in run times. I am curious whether it is just a matter of changing a
parallel-style for loop over the patch grid to a serial one, or something
similarly simple.
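
To illustrate what I mean by the order of additions, here is a minimal
Python sketch. It has nothing to do with NAMD's actual code: the function
name accumulate and the three values are made up for illustration. It only
shows that adding the same double precision numbers in a different order can
change the last digit, while repeating the same order always gives the same
total.

```python
import math

def accumulate(contributions, order):
    """Add contributions to a running total in the given index order."""
    total = 0.0
    for i in order:
        total += contributions[i]
    return total

# Hypothetical per-patch contributions (illustrative values, not NAMD data).
contributions = [0.1, 0.2, 0.3]

forward = accumulate(contributions, [0, 1, 2])   # (0.1 + 0.2) + 0.3
backward = accumulate(contributions, [2, 1, 0])  # (0.3 + 0.2) + 0.1
repeat = accumulate(contributions, [0, 1, 2])    # same order as 'forward'

print(forward, backward)   # 0.6000000000000001 0.6
print(forward == repeat)   # True: a fixed order is reproducible

# math.fsum tracks the exact partial sums and rounds once at the end, so its
# result does not depend on the order of the inputs (at some extra cost).
print(math.fsum(contributions) == math.fsum(reversed(contributions)))  # True
```

In a parallel run the "order" would be whatever order the workers happen to
finish in, which is why the totals can differ from run to run.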

If you wish to know what I am doing, it relates to the effect of rare events
on the ordering of a line of water molecules. Instead of waiting for these
rare events, I manually apply an extremely large random force to one
molecule and compare the result to a case with a normally sized random force.
Thanks for your help.

On Thu, Apr 18, 2013 at 2:15 AM, Norman Geist <
norman.geist_at_uni-greifswald.de> wrote:

> Hi Thomas,
>
> It’s not unusual for the results of parallel codes to differ in the last
> few digits. This happens when the distributed work is brought back
> together. At this point, the order in which the child processes finish
> changes the rounding errors, because precision is limited to the machine's
> 64-bit maximum or to whatever precision the programmer has chosen for
> particular variables.
>
> Simplified example, assuming I can hold only 2 digits after the decimal
> point, including during multiplication (notice the order of incoming
> results):
>
> Case 1: Child1=1.29 Child2=1.01 Child3=0.03 -> product=0.039, rounds to 0.04
>
> Case 2: Child3=0.03 Child2=1.01 Child1=1.29 -> product=0.0387, rounds to 0.04
>
> IMHO, this is the way a computer works, and the compiler can’t do anything
> here. The question is whether you can call this bad precision.
>
> Additionally, NAMD uses, as far as I know, only 32-bit (single) precision
> for most of the work to save time, as the developers think it doesn’t make
> a difference (even the DCD is single precision). Maybe we can help you if
> you explain what your actual doubt or problem is.
>
> Regards
>
> Norman Geist.
>
> *From:* owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] *On
> Behalf Of* Thomas Brian
> *Sent:* Thursday, 18 April 2013 00:09
> *To:* namd-l_at_ks.uiuc.edu
> *Subject:* namd-l: floating point reproduceability
>
> Hi,
>
> A question on the reproducibility of results. Does anyone know if the
> reproducibility of results on different processors can be improved, for
> instance, by changing gcc compilation options, or perhaps by some NAMD
> options?
>
> I have compiled the MPI version according to the readme file with default
> options for linux-x86_32-g++.
>
> I run on a system with some Intel Nehalem E5520 CPUs and some Intel
> Westmere X5650 CPUs. Results are identical across machines if NAMD is run
> on one thread. Results are different if using more than one. Does this
> suggest floating point differences from the order of operations? Is there
> any way to get around this, aside from only running one thread?
>
> Thanks,
>
> Thomas
>
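
Norman's two-decimal example quoted above can be replayed with Python's
decimal module. This is only a sketch of the toy model in his mail (round the
running product to two decimals before each further multiplication); the
helper name chained_product and the choice of half-up rounding are my
assumptions, and it is not how NAMD or real binary floating point hardware
works.

```python
from decimal import Decimal, ROUND_HALF_UP

TWO_PLACES = Decimal("0.01")

def chained_product(values):
    """Multiply left to right, rounding the running product to two decimal
    places before each multiplication. Returns the last, unrounded product."""
    running = Decimal(values[0])
    for v in values[1:]:
        rounded = running.quantize(TWO_PLACES, rounding=ROUND_HALF_UP)
        running = rounded * Decimal(v)
    return running

case1 = chained_product(["1.29", "1.01", "0.03"])  # Child1, Child2, Child3
case2 = chained_product(["0.03", "1.01", "1.29"])  # Child3, Child2, Child1

print(case1)  # 0.0390
print(case2)  # 0.0387

# With these particular inputs both products round to the same two-decimal
# value, so the order dependence only shows in the digits before that final
# rounding step.
print(case1.quantize(TWO_PLACES, rounding=ROUND_HALF_UP))  # 0.04
print(case2.quantize(TWO_PLACES, rounding=ROUND_HALF_UP))  # 0.04
```

The intermediate products differ (0.039 against 0.0387) purely because of
the order in which the factors arrive, which is the effect being described.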

This archive was generated by hypermail 2.1.6 : Wed Dec 31 2014 - 23:21:08 CST