Re: Is there solution to numerical inaccuracy

From: Ilya Chorny (ichorny_at_gmail.com)
Date: Thu Nov 29 2007 - 13:53:47 CST

I did a few more quick tests on my system. I compared a restart on one
processor vs. two processors vs. eight processors and got the same values at
time 0 post-crash in all three cases, and those values differed from the
values prior to the crash.

On Nov 29, 2007 11:24 AM, Ilya Chorny <ichorny_at_gmail.com> wrote:

> Now that you mention it, I see similar behavior, but I am running my jobs
> in parallel.
>
> ETITLE       (pre-crash)   (post-crash)
> TS                891500              0
> BOND           5841.2240      5841.2240
> ANGLE         26412.4629     26412.4629
> DIHED         12465.9673     12465.9673
> IMPRP           746.1914       746.1914
> ELECT       -349566.4658   -349566.3622
> VDW           -1290.1452     -1290.2427
> BOUNDARY          0.0000         0.0000
> MISC              0.0000         0.0000
> KINETIC       63097.4897     63088.4531
> TOTAL       -242293.2757   -242302.3063
> TEMP            297.3976       297.3550
> TOTAL2      -241835.7998   -241848.0768
> TOTAL3      -241850.8284   -241848.0768
> TEMPAVG         297.6611       297.3550
> PRESSURE        127.6311       132.1837
> GPRESSURE       159.4130       158.4817
> VOLUME       923993.5682    923993.8056
> PRESSAVG        121.9322       132.1837
> GPRESSAVG       122.0840       158.4817
>
>
> It's interesting how the bonded energies are perfectly conserved but the
> non-bonded are not. My jobs crash all the time due to equipment problems and
> thus I am concerned about how this will affect my results.
>
>
> Thanks,
>
> Ilya
>
>
>
> > non-determinism of the Langevin thermostat in parallel has been talked
> > about.
> >
> > So coming back to square one, after reading all the comments in this
> > discussion, I believe there exists NO solution to this problem, which
> > occurs because of either numerical inaccuracy or non-determinism.
> >
> > Could the B1 and B2 MD runs be considered as good as a single A MD run?
> >
> > -Alok
> >
> > Peter Freddolino wrote:
> >
> > >Hi Alok,
> > >just to verify, since you're running NVT, did you specify a seed value
> > >in your config file for the A-B1-B2 simulations? And were your
> > >production runs serial or parallel? If your production runs are done in
> > >parallel then the differences you observe in the first part of your
> > >email are really unremarkable, and have nothing to do with precision and
> > >everything to do with the nondeterminism of the Langevin thermostat in
> > >parallel that has been mentioned earlier.
> > >Best,
> > >Peter
> > >
> > >Alok Juneja wrote:
> > >
> > >
> > >>Dear Peter, Dave, Himanshu & other list members,
> > >>
> > >>Sorry for not answering earlier, though I was regularly following the
> > >>discussion on this issue. As requested by Peter, I am providing my
> > >>findings about this issue.
> > >>
> > >>I am running 50 ns of constant-temperature dynamics: 25000000 steps in
> > >>total, with a time step of 0.002 ps, a dcdfreq of 100, and a restartfreq
> > >>of 100000. Somehow my MD crashed at step 5459300, but my last restart
> > >>was at step 5400000, so I restarted from that. I am doing this MD to see
> > >>the protein behaviour and am calculating the N- and C-terminal distance
> > >>(Ang.). Following is the N-C terminal distance before and after the
> > >>crash. I am running this simulation in parallel.
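
The numbers in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below is in Python and works in integer femtoseconds (0.002 ps = 2 fs) to avoid rounding noise. It assumes restart files are written at exact multiples of restartfreq, which is consistent with the last restart being at step 5400000. The helper names are illustrative only.

```python
TIMESTEP_FS = 2            # 0.002 ps per step, expressed in femtoseconds
TOTAL_STEPS = 25_000_000   # total steps in the planned run
DCD_FREQ = 100             # steps between trajectory frames
RESTART_FREQ = 100_000     # steps between restart files
CRASH_STEP = 5_459_300     # step at which the run crashed


def step_to_ps(step):
    """Convert a step count to simulated time in picoseconds."""
    return step * TIMESTEP_FS / 1000


def last_restart_step(step, restart_freq=RESTART_FREQ):
    """Assumption: restart files land on exact multiples of restart_freq."""
    return (step // restart_freq) * restart_freq


restart_step = last_restart_step(CRASH_STEP)
lost_steps = CRASH_STEP - restart_step

print("run length (ns):       ", step_to_ps(TOTAL_STEPS) / 1000)
print("frame spacing (ps):    ", step_to_ps(DCD_FREQ))
print("restart step:          ", restart_step)
print("restart time (ps):     ", step_to_ps(restart_step))
print("steps to recompute:    ", lost_steps)
print("time to recompute (ps):", step_to_ps(lost_steps))
```

This gives a 50 ns run, one frame every 0.2 ps, a restart at 10800 ps, and 59300 steps (118.6 ps) to recompute after the crash, which matches the time column in the table below.
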
> > >>
> > >># TIME(PS) Before-Crash After-Crash
> > >>10800 10.833
> > >>10800.2 11.3259 11.0924
> > >>10800.4 11.2417 11.1039
> > >>10800.6 10.985 10.9962
> > >>10800.8 10.7715 11.1593
> > >>10801 11.3783 11.4828
> > >>10801.2 11.1862 10.9861
> > >>10801.4 11.3925 10.9671
> > >>10801.6 10.8473 10.9287
> > >>(*) 10801.8 10.5789 11.013
> > >>10802 10.8792 10.4324
> > >>10802.2 10.6182 10.4422
> > >>10802.4 10.8918 10.6541
> > >>10802.6 10.9267 10.7829
> > >>10802.8 10.6352 10.8386
> > >>10803 10.8069 10.4295
> > >>(*) 10803.2 11.3242 10.5952
> > >>(*) 10803.4 11.3397 10.4784
> > >>(*) 10803.6 11.5822 10.4696
> > >>(*) 10803.8 11.023 10.8231
> > >>10804 10.9887 10.4586
> > >>10804.2 10.5118 10.3266
> > >>(*) 10804.4 10.4329 9.95989
> > >>10804.6 10.6863 10.2366
> > >>(*) 10804.8 11.3551 10.2149
> > >>(*) 10805 11.3445 9.88589
> > >>10805.2 10.7702 10.1757
> > >>10805.4 10.4436 10.3636
> > >>10805.6 10.3206 10.2086
> > >>10805.8 10.8214 10.5937
> > >>10806 11.2742 10.3849
> > >>10806.2 11.44 10.2721
> > >>(*) 10806.4 11.2566 10.1909
> > >>10806.6 10.9381 10.7606
> > >>10806.8 11.5617 10.8286
> > >>10807 11.7283 11.246
> > >>10807.2 11.4038 11.2901
> > >>10807.4 10.5862 10.708
> > >>10807.6 10.61 10.6308
> > >>10807.8 11.1818 10.2391
> > >>10808 11.3433 10.5278
> > >>10808.2 11.1947 11.0142
> > >>10808.4 10.9988 11.2578
> > >>(*) 10808.6 10.447 11.334
> > >>10808.8 10.3205 10.9368
> > >>10809 10.7634 10.9165
> > >>10809.2 10.7874 11.1041
> > >>10809.4 11.011 11.15
> > >>10809.6 10.8222 10.9214
> > >>10809.8 10.8731 10.2806
> > >>10810 11.0003 10.908
> > >>
> > >>You will find many time steps where the difference is remarkable
> > >>(indicated by *). I believe that these differences are too large for me.
> > >>I checked this and found that this is not the case with CHARMM, where
> > >>you get identical results even after a restart.
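
One way to put numbers on the differences marked with (*) is to compute per-frame and summary deviations between the two columns. The sketch below is in Python and uses only the first ten complete rows of the table above as sample input. The function names are illustrative.

```python
import math

# First ten complete rows of the table above: time (ps), before, after.
ROWS = """\
10800.2 11.3259 11.0924
10800.4 11.2417 11.1039
10800.6 10.985 10.9962
10800.8 10.7715 11.1593
10801 11.3783 11.4828
10801.2 11.1862 10.9861
10801.4 11.3925 10.9671
10801.6 10.8473 10.9287
(*) 10801.8 10.5789 11.013
10802 10.8792 10.4324
"""


def parse_rows(text):
    """Return (time_ps, before, after) tuples, ignoring the (*) markers."""
    rows = []
    for line in text.splitlines():
        tokens = [token for token in line.split() if token != "(*)"]
        if len(tokens) == 3:  # skip rows that do not have both distances
            rows.append(tuple(float(token) for token in tokens))
    return rows


def summarize(rows):
    """Summarize the before-minus-after distance differences (in Angstrom)."""
    diffs = [before - after for _, before, after in rows]
    abs_diffs = [abs(d) for d in diffs]
    worst = max(range(len(rows)), key=lambda i: abs_diffs[i])
    return {
        "frames": len(rows),
        "mean_abs": sum(abs_diffs) / len(rows),
        "rms": math.sqrt(sum(d * d for d in diffs) / len(rows)),
        "worst_time_ps": rows[worst][0],
        "worst_abs": abs_diffs[worst],
    }


if __name__ == "__main__":
    for key, value in summarize(parse_rows(ROWS)).items():
        print(key, value)
```

Pasting the full table into `ROWS` extends the summary to all frames. Rows that do not have both distances, such as the first one at 10800 ps, are skipped.
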
> > >>
> > >>For your ready reference, I am attaching the total energy graph for
> > >>comparison (comparision.pdf
> > >>[http://www.geocities.com/junejaalok/comparision.pdf]).
> > >>As requested by Dave, I am also attaching the file A-B1-B2.pdf
> > >>[http://www.geocities.com/junejaalok/A-B1-B2.pdf], for the jobs run on
> > >>the same single processor.
> > >>
> > >>Test A energy profile: [http://www.geocities.com/junejaalok/testA.txt]
> > >>Test B1 energy profile: [http://www.geocities.com/junejaalok/testB1.txt]
> > >>Test B2 energy profile: [http://www.geocities.com/junejaalok/testB2.txt]
> > >>
> > >>Since I am restricted in the number of characters that one can write
> > >>to the NAMD forum and in the size of attachments, I am providing extra
> > >>links for you to see the files and results. I hope you understand.
> > >>
> > >>I appreciate your efforts to get into the depth of this. But I believe
> > >>the NAMD developers should really think over this issue. However, any
> > >>solutions and suggestions in this regard would be of great help for
> > >>others as well.
> > >>
> > >>
> > >>Best Wishes,
> > >>Alok
> > >>
> > >>

-- 
Ilya Chorny Ph.D.

This archive was generated by hypermail 2.1.6 : Wed Feb 29 2012 - 15:45:37 CST