From: Brian Bennion (brian_at_youkai.llnl.gov)
Date: Tue Nov 02 2004 - 13:09:32 CST
Hello Charles,
The real problem is in the implementation of the new load-balancing
structure, and I am clueless in this area.
However, negative signs are good when dealing with vdw and electrostatic
potential energies.
The fact that you have them now is a good thing.
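For the record, the exclusion change that produced these energies can be written as the fragment below. This is a minimal sketch, assuming the rest of your script is unchanged; 1.0 is the default value of 1-4scaling and is spelled out only for clarity.

```
# force field params (replaces "exclude 1-2" in the original script)
exclude         scaled1-4   ;# drop 1-2 and 1-3 pairs; 1-4 pairs use modified terms
1-4scaling      1.0         ;# electrostatic scaling for 1-4 pairs (the default)
```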
I forgot to ask what type of machine/cluster you are using and the OS.
I have never seen the Rtask error before.
What is the size of your system? What ensemble are you using? I noticed
that no box dimensions are given in your input file. The bond and angle
energies appear too high for most large simulations.
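If the system is meant to be periodic, the box would be declared roughly as follows. This is only a sketch: the numbers are placeholders, not values derived from your system, and should be replaced with the actual extents and center of your coordinates.

```
# periodic cell (placeholder values in Angstroms; measure your own system)
cellBasisVector1   100.0    0.0    0.0
cellBasisVector2     0.0  100.0    0.0
cellBasisVector3     0.0    0.0  100.0
cellOrigin           0.0    0.0    0.0
```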
Have you tried to minimize small segments of your system first, before
combining them all together for general minimization?
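One way to approximate this without splitting the input files is to hold most of the system fixed and relax one part at a time. A minimal sketch, assuming a hypothetical file fix_all_but_segA.pdb that is a copy of your PDB with the B (beta) column set to 1 for atoms to hold fixed and 0 for atoms that are free to move:

```
# staged minimization: only atoms flagged 0 in the B column may move
fixedAtoms       on
fixedAtomsFile   fix_all_but_segA.pdb   ;# hypothetical file name
fixedAtomsCol    B                      ;# column holding the fixed-atom flags
minimize         500                    ;# placeholder step count
```

The same run can be repeated with a different flag file for each segment, followed by a final minimization with no atoms fixed.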
Finally, did you compile this version of NAMD/Charm++ or get a binary? The
core file might also be useful as you get a handle on the problem.
Regards
Brian
On Tue, 2 Nov 2004, Charles Danko wrote:
> Hi,
>
> Thanks to Brian for the help.  I tried changing the interactions term
> to scaled1-4 with the default scaling factor of 1.0 but the
> minimization still died after 199 steps.  The last couple steps are
> pasted below.  The energy is substantially different; I am not sure
> that I understand the negative sign.
>
> If there are any more suggestions I would be grateful.
>
> Thanks again,
> Charles
>
> BRACKET: 2.50603e-07 0.090796 -114345 -267.772 826360
> NEW SEARCH DIRECTION
> INITIAL STEP: 6.4e-05
> GRADIENT TOLERANCE: 14979.9
> ENERGY:     195     14682.3820     22632.4249      7136.2497       282.3447
>   -96578.5816      9615.8599         0.0000         0.0000         0.0000
>   -42229.3205         0.0000    -42229.3205    -42229.3205         0.0000
>
> ENERGY:     196     14352.5204     22506.1721      7126.8372       277.9901
>   -96822.7235      9098.0523         0.0000         0.0000         0.0000
>   -43461.1514         0.0000    -43461.1514    -43461.1514         0.0000
>
> ENERGY:     197     14345.7336     22383.6335      7109.8393       271.1399
>   -97309.3368      8429.8700         0.0000         0.0000         0.0000
>   -44769.1205         0.0000    -44769.1205    -44769.1205         0.0000
>
> ENERGY:     198     16752.9576     22839.6644      7095.3013       264.8575
>   -98275.7787      9208.5645         0.0000         0.0000         0.0000
>   -42114.4333         0.0000    -42114.4333    -42114.4333         0.0000
>
> BRACKET: 0.000123859 2654.69 -4.84743e+07 -1.49018e+07 9.67884e+07
> ENERGY:     199     14522.5983     22380.1473      7106.8166       269.4865
>   -97462.9771      8297.1243         0.0000         0.0000         0.0000
>   -44886.8042         0.0000    -44886.8042    -44886.8042         0.0000
>
> LDB:  LOAD: AVG 121.372 MAX 150.69  MSGS: TOTAL 171 MAXC 36 MAXP 4  None
> LDB:  LOAD: AVG 121.372 MAX 145.646  MSGS: TOTAL 310 MAXC 88 MAXP 4  Alg7
> LDB:  LOAD: AVG 121.372 MAX 123.798  MSGS: TOTAL 310 MAXC 88 MAXP 4  Alg7
> Rtasks fail:
> Rtask(s) 1 : exited with signal <11>
> Rtask(s) 4 2 3 7 5 6 8 10 9 : exited with signal <15>
> Rtask(s) 1  : coredump
>
> On Mon, 1 Nov 2004 15:56:32 -0800 (PST), Brian Bennion
> <brian_at_youkai.llnl.gov> wrote:
> > Hi Charles,
> >
> > Is there a particular reason you use a 1-2 exclusion instead of a 1-4
> > exclusion?
> >
> > Your vdw energies are not at all good, and I think it is because you're
> > including too many interactions with the 1-2 based exclusion.
> > Just for kicks, try using the 1-4 term and see how things go.
> >
> > Regards
> > Brian
> >
> > On Mon, 1 Nov 2004, Charles Danko wrote:
> >
> > > Hi,
> > >
> > > I am having a problem minimizing my system.   The end of the
> > > minimization output file is as follows:
> > >
> > > ENERGY:     196     72896.2480     36970.3755      7529.1929       527.0135
> > >    302178.7322    230719.2954         0.0000         0.0000         0.0000
> > >    650820.8576         0.0000    650820.8576    650820.8576         0.0000
> > >
> > > ENERGY:     197     74621.1928     37566.9383      7549.9188       527.7283
> > >    301273.8956    231103.4061         0.0000         0.0000         0.0000
> > >    652643.0800         0.0000    652643.0800    652643.0800         0.0000
> > >
> > > BRACKET: 0.000110482 1822.22 -4.75282e+07 -1.29993e+07 6.80802e+07
> > > ENERGY:     198     73119.1726     37051.1602      7532.4822       526.9988
> > >    302022.8627    230476.5677         0.0000         0.0000         0.0000
> > >    650729.2442         0.0000    650729.2442    650729.2442         0.0000
> > >
> > > BRACKET: 7.36549e-05 1913.84 -1.29993e+07 -1.34062e+06 6.80802e+07
> > > ENERGY:     199     73237.2018     37093.2432      7534.1100       527.0121
> > >    301947.9529    230399.3378         0.0000         0.0000         0.0000
> > >    650738.8579         0.0000    650738.8579    650738.8579         0.0000
> > >
> > > LDB:  LOAD: AVG 163.034 MAX 257.747  MSGS: TOTAL 240 MAXC 44 MAXP 4  None
> > > LDB:  LOAD: AVG 163.034 MAX 195.64  MSGS: TOTAL 526 MAXC 145 MAXP 6  Alg7
> > > LDB:  LOAD: AVG 163.034 MAX 166.29  MSGS: TOTAL 530 MAXC 145 MAXP 6  Alg7
> > > Rtasks fail:
> > > Rtask(s) 1 : exited with signal <11>
> > > Rtask(s) 2 5 4 3 9 6 8 7 10 : exited with signal <15>
> > > Rtask(s) 1  : coredump
> > >
> > >
> > > I am using the script which is pasted below.  So far, I have tried:
> > > decreasing the parameters minBabyStep and minTinyStep by 1, 2, and 3
> > > orders of magnitude; running minimization with the velocity quenching
> > > algorithm, and starting from a restart file.  In all cases the system
> > > craps out after exactly 199 steps, including when the system is
> > > started from the restart file.
> > >
> > > My input files are rather large (~5MB pdb, ~10MB psf) but I can post
> > > if someone requests.
> > >
> > > I think that I am most likely missing something obvious.
> > >
> > > Thanks in advance for any help,
> > > Charles
> > >
> > > # initial config
> > > coordinates ../final.pdb
> > > structure   ../final.psf
> > > temperature     0
> > >
> > > # output params
> > > outputtiming    1000
> > > outputname      ./output/minimize
> > > binaryoutput    no
> > >
> > > # for restart
> > > restartname ./minimize/system_minimize_restart
> > > restartfreq 100
> > > restartsave yes
> > > binaryrestart yes;  # preserves more accuracy.
> > >
> > > # integrator params
> > > timestep        1.0
> > >
> > > # force field params
> > > paratypecharmm on
> > > parameters par_all27_prot_lipid.prm
> > > exclude         1-2
> > > switching       on
> > > switchdist      8.0
> > > cutoff          12.0
> > > pairlistdist    13.5
> > > stepspercycle   20
> > >
> > > #velocityQuenching on
> > >
> > >
> > > # minimize the system
> > > minimization on;        #this turns on fast minimization
> > > minTinyStep 1.0e-7;
> > > #minbabystep 1.0e-3;    #If it doesn't work, this option is on for the next run.
> > > minimize 2000
> > >
> >
> > *****************************************************************
> > **Brian Bennion, Ph.D.                                         **
> > **Computational and Systems Biology Division                   **
> > **Biology and Biotechnology Research Program                   **
> > **Lawrence Livermore National Laboratory                       **
> > **P.O. Box 808, L-448    bennion1_at_llnl.gov                     **
> > **7000 East Avenue       phone: (925) 422-5722                 **
> > **Livermore, CA  94550   fax:   (925) 424-6605                 **
> > *****************************************************************
> >
> >
>