Re: Is there solution to numerical inaccuracy

From: Ilya Chorny (ichorny_at_gmail.com)
Date: Thu Nov 29 2007 - 13:24:50 CST

Now that you mention it, I see similar behavior, but I am running my jobs in
parallel.

ETITLE     ENERGY (pre-crash)  ENERGY (post-crash)
TS                     891500                    0
BOND                5841.2240            5841.2240
ANGLE              26412.4629           26412.4629
DIHED              12465.9673           12465.9673
IMPRP                746.1914             746.1914
ELECT            -349566.4658         -349566.3622
VDW                -1290.1452           -1290.2427
BOUNDARY               0.0000               0.0000
MISC                   0.0000               0.0000
KINETIC            63097.4897           63088.4531
TOTAL            -242293.2757         -242302.3063
TEMP                 297.3976             297.3550
TOTAL2           -241835.7998         -241848.0768
TOTAL3           -241850.8284         -241848.0768
TEMPAVG              297.6611             297.3550
PRESSURE             127.6311             132.1837
GPRESSURE            159.4130             158.4817
VOLUME            923993.5682          923993.8056
PRESSAVG             121.9322             132.1837
GPRESSAVG            122.0840             158.4817

It's interesting that the bonded energies (BOND, ANGLE, DIHED, IMPRP) are
identical across the restart but the non-bonded ones (ELECT, VDW) are not.
My jobs crash all the time due to equipment problems, and thus I am
concerned about how this will affect my results.
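
To make the comparison less error-prone than reading the wrapped log lines
by eye, here is a minimal Python sketch that pairs the ETITLE field names
with the two ENERGY records and lists the terms that changed. The helper
names (parse_energy, changed_terms) are made up for illustration and are not
part of NAMD; the numbers are copied from the output above, with the wrapped
lines rejoined.

```python
# Field names copied from the ETITLE record in this message.
ETITLE = ("TS BOND ANGLE DIHED IMPRP ELECT VDW BOUNDARY MISC KINETIC "
          "TOTAL TEMP TOTAL2 TOTAL3 TEMPAVG PRESSURE GPRESSURE VOLUME "
          "PRESSAVG GPRESSAVG").split()

# ENERGY records copied from this message (wrapped lines rejoined).
PRE = """ENERGY: 891500 5841.2240 26412.4629 12465.9673 746.1914
-349566.4658 -1290.1452 0.0000 0.0000 63097.4897
-242293.2757 297.3976 -241835.7998 -241850.8284 297.6611
127.6311 159.4130 923993.5682 121.9322 122.0840"""

POST = """ENERGY: 0 5841.2240 26412.4629 12465.9673 746.1914
-349566.3622 -1290.2427 0.0000 0.0000 63088.4531
-242302.3063 297.3550 -241848.0768 -241848.0768 297.3550
132.1837 158.4817 923993.8056 132.1837 158.4817"""


def parse_energy(record):
    """Hypothetical helper: map ETITLE names to the values of one ENERGY record."""
    tokens = record.split()
    if tokens[0] != "ENERGY:":
        raise ValueError("not an ENERGY record")
    values = [float(t) for t in tokens[1:]]
    if len(values) != len(ETITLE):
        raise ValueError("field count does not match ETITLE")
    return dict(zip(ETITLE, values))


def changed_terms(before, after, skip=("TS",)):
    """Hypothetical helper: return {name: after - before} for terms that differ."""
    return {name: after[name] - before[name]
            for name in ETITLE
            if name not in skip and before[name] != after[name]}


pre = parse_energy(PRE)
post = parse_energy(POST)
diff = changed_terms(pre, post)

for name, delta in diff.items():
    print(f"{name:10s} {delta:+.4f}")
```

With these two records, BOND, ANGLE, DIHED, and IMPRP do not appear in the
output, while ELECT, VDW, KINETIC, the totals, temperatures, pressures, and
volume all do.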

Thanks,

Ilya

> non-determinism of the Langevin thermostat in parallel has been talked
> about.
>
> So, coming back to square one: after reading all the comments in this
> discussion, I believe there is NO solution to this problem, which
> occurs because of either numerical inaccuracy or non-determinism.
>
> Could the B1 and B2 MD runs be considered as good as a single A MD run?
>
> -Alok
>
> Peter Freddolino wrote:
>
> >Hi Alok,
> >just to verify, since you're running NVT, did you specify a seed value
> >in your config file for the A-B1-B2 simulations? And were your
> >production runs serial or parallel? If your production runs are done in
> >parallel then the differences you observe in the first part of your
> >email are really unremarkable, and have nothing to do with precision and
> >everything to do with the nondeterminism of the Langevin thermostat in
> >parallel that has been mentioned earlier.
> >Best,
> >Peter
> >
> >Alok Juneja wrote:
> >
> >
> >>Dear Peter, Dave, Himanshu & other list members,
> >>
> >>Sorry for not answering earlier, though I was regularly following the
> >>discussion on this issue. As requested by Peter, I am providing my
> >>findings about this issue.
> >>
> >>I am running 50 ns of constant-temperature dynamics: a total of 25000000
> >>steps with a time step of 0.002 ps, a dcdfreq of 100, and a restartfreq
> >>of 100000. Somehow my MD crashed at step 5459300, but my last restart
> >>was at step 5400000, so I restarted from that. I am doing this MD to see
> >>the protein behaviour and am calculating the N- to C-terminal distance
> >>(Ang.). Following is the N-C terminal distance before the crash and
> >>after the crash. I am running this simulation in parallel.
> >>
> >># TIME(PS) Before-Crash After-Crash
> >>10800 10.833
> >>10800.2 11.3259 11.0924
> >>10800.4 11.2417 11.1039
> >>10800.6 10.985 10.9962
> >>10800.8 10.7715 11.1593
> >>10801 11.3783 11.4828
> >>10801.2 11.1862 10.9861
> >>10801.4 11.3925 10.9671
> >>10801.6 10.8473 10.9287
> >>(*) 10801.8 10.5789 11.013
> >>10802 10.8792 10.4324
> >>10802.2 10.6182 10.4422
> >>10802.4 10.8918 10.6541
> >>10802.6 10.9267 10.7829
> >>10802.8 10.6352 10.8386
> >>10803 10.8069 10.4295
> >>(*) 10803.2 11.3242 10.5952
> >>(*) 10803.4 11.3397 10.4784
> >>(*) 10803.6 11.5822 10.4696
> >>(*) 10803.8 11.023 10.8231
> >>10804 10.9887 10.4586
> >>10804.2 10.5118 10.3266
> >>(*) 10804.4 10.4329 9.95989
> >>10804.6 10.6863 10.2366
> >>(*) 10804.8 11.3551 10.2149
> >>(*) 10805 11.3445 9.88589
> >>10805.2 10.7702 10.1757
> >>10805.4 10.4436 10.3636
> >>10805.6 10.3206 10.2086
> >>10805.8 10.8214 10.5937
> >>10806 11.2742 10.3849
> >>10806.2 11.44 10.2721
> >>(*) 10806.4 11.2566 10.1909
> >>10806.6 10.9381 10.7606
> >>10806.8 11.5617 10.8286
> >>10807 11.7283 11.246
> >>10807.2 11.4038 11.2901
> >>10807.4 10.5862 10.708
> >>10807.6 10.61 10.6308
> >>10807.8 11.1818 10.2391
> >>10808 11.3433 10.5278
> >>10808.2 11.1947 11.0142
> >>10808.4 10.9988 11.2578
> >>(*) 10808.6 10.447 11.334
> >>10808.8 10.3205 10.9368
> >>10809 10.7634 10.9165
> >>10809.2 10.7874 11.1041
> >>10809.4 11.011 11.15
> >>10809.6 10.8222 10.9214
> >>10809.8 10.8731 10.2806
> >>10810 11.0003 10.908
> >>
> >>You will find many time steps where the difference is remarkable
> >>(indicated by *). I believe that these differences are too large for my
> >>purposes. I checked this and found that this is not the case with
> >>CHARMM, where you get identical results even after a restart.
> >>
> >>For your ready reference, I am attaching the total energy graph for
> >>comparison (comparision.pdf
> >>[http://www.geocities.com/junejaalok/comparision.pdf]).
> >>As requested by Dave, I am also attaching the file A-B1-B2.pdf
> >>[http://www.geocities.com/junejaalok/A-B1-B2.pdf], for the jobs run on
> >>the same single processor.
> >>
> >>TestA energy profile: [http://www.geocities.com/junejaalok/testA.txt]
> >>TestB1 energy profile: [http://www.geocities.com/junejaalok/testB1.txt]
> >>TestB2 energy profile: [http://www.geocities.com/junejaalok/testB2.txt]
> >>
> >>Since I am restricted in the number of characters that one can write
> >>in the NAMD forum and in the size of attachments, I am providing extra
> >>links for you to see the files and results. I hope you understand.
> >>
> >>I appreciate your efforts to get into the depth of this. But I believe
> >>the NAMD developers should really think over this issue. In any case,
> >>any solutions and suggestions in this regard would be of great help to
> >>others as well.
> >>
> >>
> >>Best Wishes,
> >>Alok
> >>
> >>
> >
> >
> >
> >
>
>
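
For reference, the step counts in the quoted message can be turned into
simulation time with a few lines of Python. This is only a sketch of the
arithmetic: the constants are the values Alok quotes (time step, dcdfreq,
restartfreq, total steps, crash step), and the function and variable names
are hypothetical, not anything NAMD provides.

```python
# Values quoted in Alok's message.
TIMESTEP_PS = 0.002       # time step in ps
DCD_FREQ = 100            # steps between trajectory frames (dcdfreq)
RESTART_FREQ = 100_000    # steps between restart files (restartfreq)
TOTAL_STEPS = 25_000_000  # planned length of the run
CRASH_STEP = 5_459_300    # step at which the job died


def step_to_ps(step, dt_ps=TIMESTEP_PS):
    """Hypothetical helper: convert a step count to simulation time in ps."""
    return step * dt_ps


# The last restart file written before the crash is at the largest
# multiple of RESTART_FREQ that does not exceed CRASH_STEP.
last_restart = (CRASH_STEP // RESTART_FREQ) * RESTART_FREQ
lost_steps = CRASH_STEP - last_restart

print(f"total run length : {step_to_ps(TOTAL_STEPS) / 1000:.1f} ns")
print(f"frame spacing    : {step_to_ps(DCD_FREQ):.1f} ps")
print(f"last restart     : step {last_restart} = {step_to_ps(last_restart):.1f} ps")
print(f"recomputed       : {lost_steps} steps = {step_to_ps(lost_steps):.1f} ps")
```

The restart step of 5400000 corresponds to the 10800 ps at which the quoted
distance table starts, and the 0.2 ps frame spacing matches its time column.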

-- 
Ilya Chorny Ph.D.
