From: John Stone (johns_at_ks.uiuc.edu)
Date: Thu Apr 21 2005 - 16:28:08 CDT
Luis,
  It sounds to me like NAMD is encountering a floating point exception
during your run.  Alpha processors don't handle such exceptions gracefully
unless the code is compiled specially (which costs performance).  It's possible
that the version of NAMD you're running has some math that's problematic,
or that it's operating on FP values sent to it by VMD that give the Alpha CPUs
problems.  What happens if you run the NAMD side of things on a Linux box?
It should be relatively easy to fix the problem if we can reproduce it
on a machine we can run NAMD on.  These FP exceptions are a familiar issue
on the Alpha platform.
  John Stone
  vmd_at_ks.uiuc.edu
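
To illustrate the kind of FP values that can cause trouble, here is a minimal
sketch in Python of screening force components for NaN, infinite, or subnormal
values before they are used in arithmetic. This is not NAMD or VMD code: the
function names and the example force vectors are hypothetical, and whether such
values actually occur in the IMD data is an assumption that nothing in this
thread establishes.

```python
import math
import sys


def classify(x):
    """Label a float as 'nan', 'inf', 'zero', 'subnormal', or 'normal'."""
    if math.isnan(x):
        return "nan"
    if math.isinf(x):
        return "inf"
    if x == 0.0:
        return "zero"
    # sys.float_info.min is the smallest positive *normal* double,
    # so anything nonzero below it in magnitude is subnormal.
    if abs(x) < sys.float_info.min:
        return "subnormal"
    return "normal"


def suspicious_components(force):
    """Return (index, label) for every component that is not zero or normal."""
    flagged = []
    for i, x in enumerate(force):
        label = classify(x)
        if label not in ("zero", "normal"):
            flagged.append((i, label))
    return flagged


# Hypothetical force vectors, standing in for values received from a client.
print(suspicious_components([0.5, -1.25, 3.0]))
# -> []
print(suspicious_components([float("nan"), 5e-324, float("inf")]))
# -> [(0, 'nan'), (1, 'subnormal'), (2, 'inf')]
```

A check like this only narrows down where bad values come from; it does not
replace reproducing the crash on another platform.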
On Wed, Apr 20, 2005 at 10:22:10PM -0500, Luis Rosales wrote:
> 
> Hi all!
> John,
> 
> Here is more concrete info on my problem,
> 
> I am using:
> 
> VMD for LINUX, version 1.8.3 (February 15, 2005) and NAMD 2.5 for
> Tru64-Alpha-Elan (both precompiled binaries).
> 
> I run VMD on a local machine with 2 processors, Red Hat Linux kernel
> 2.4.20-8smp, and 4 GB of RAM.
> NAMD, on the other hand, is running on a remote AlphaServer SC45, Tru64 UNIX
> 5.1B, using 2 nodes (4 processors total) with 2 GB of RAM.
> 
> I ran this test on a system with 2 protein fragments and 0 waters, for a total
> of 110 residues (1651 atoms). One protein is 85 residues long and the 2nd is
> 25 residues long.
> In this case the simulation runs smoothly until I apply a force on the small
> fragment:
> 
> ------------------------------------
> ENERGY:     115       142.9833       560.0360       549.5308        22.2029  
>        -2084.3060      -333.2643         0.0000         0.0000       257.1398
>           -885.6774        52.2506      -886.1811      -885.7428        52.2506
> 
> ENERGY:     116       150.6783       545.7365       553.3336        22.7285  
>        -2088.9056      -333.2312         0.0000         0.0000       271.5344
>           -878.1255        55.1756      -878.5663      -878.3913        55.1756
> 
> ENERGY:     117       152.8667       531.2275       557.0184        23.1223  
>        -2093.8173      -333.8267         0.0000         0.0000       290.1404
>           -873.2688        58.9563      -873.2699      -874.7542        58.9563
> 
> ENERGY:     118       148.2011       516.6806       560.5185        23.1648  
>        -2099.0759      -335.1546         0.0000         0.0000       316.7026
>           -868.9628        64.3537      -869.0747      -870.0510        64.3537
> 
> ENERGY:     119       141.9092       502.4293       563.7315        22.9768  
>        -2104.5040      -337.0257         0.0000         0.0000       348.3684
>           -862.1145        70.7882      -862.8035      -861.5549        70.7882
> 
> ETITLE:      TS           BOND          ANGLE          DIHED          IMPRP  
>             ELECT            VDW       BOUNDARY           MISC        KINETIC
>               TOTAL           TEMP         TOTAL2         TOTAL3        TEMPAVG
> 
> ENERGY:     120       140.7723       488.7814       566.6368        22.7672  
>        -2110.2922      -339.3565         0.0000         0.0000       371.7358
>           -858.9554        75.5364      -859.0029      -860.2378        75.5364
> 
> DEBUG: Detaching simulation from remote connection
> ENERGY:     121       145.9771       475.7318       569.2369        22.4984  
>        -2116.4710      -341.5928         0.0000         0.0000       388.6565
>           -855.9630        78.9747      -855.7663      -857.8394        78.9747
> 
> prun: ../namd/NAMD_2.5_Tru64-Alpha-Elan/namd2 (host bakliz8 process 1 pid
> 4857226) killed by signal 8 (FPE)
> prun: no core file for job 30792 in /local/core/rms/30792
> ----------------------------------------
> 
> On the VMD side, all I see is:
> 
> Info) Connected to same-endian machine
> Info) Using multithreaded IMD implementation.
> Info) picked atom: 
> Info) ------------
> Info) molecule id: 0
> Info) name: HD1
> Info) type: HD1
> Info) index: 1499
> Info) resname: TRP
> Info) resid: 72
> Info) chain: X
> Info) segname: C
> Info) x: 7.481662
> Info) y: 2.785722
> Info) z: 1.172789
> Info) IMD connection ended unexpectedly; connection terminated.
> 
> **************************************************
> 
> In some cases, especially for peptides approaching the 15-residue limit,
> the message from NAMD is a little different, as if NAMD tried to pause the
> simulation before quitting (this is the output for a system with an 85-residue
> protein and a 16-residue peptide):
> 
> --------------------------------------
> ENERGY:     146       174.5181       464.8836       526.1484        23.4186  
>        -2026.8350      -269.8708         0.0000         0.0000       381.4379
>           -726.2993        84.9705      -725.8214      -727.5702        84.9705
> 
> ENERGY:     147       167.6455       468.3555       525.3417        23.7755  
>        -2026.7811      -270.8533         0.0000         0.0000       390.5575
>           -721.9588        87.0020      -721.9038      -721.9578        87.0020
> 
> NAMD ABORTING DUE TO HARD NAMD_quit().
> Info: Pausing IMD
> prun: ../namd/NAMD_2.5_Tru64-Alpha-Elan/namd2 (host bakliz8 process 0 pid
> 4857107) killed by signal 11 (SEGV)
> prun: no core file for job 30793 in /local/core/rms/30793
> ----------------------------------------
> 
> As far as I can tell, the problem is not related to the duration of the
> minimization. Even with longer minimization runs, the process is killed when
> the system enters MD if I try to apply a force on a fragment longer than
> 15 residues, and usually the simulation is killed as soon as I click on the
> fragment with my mouse....
> [So, I don't have a chance to exceed the speed of light...  :( ]
> 
> So, in theory, is there a limit to the size of the fragment that I can
> move/affect by applying a reasonable force?
> 
> Again, thanks for your time and help.
> 
> Kind regards,
> 
> Luis
> 
> 
> ---------- Original Message -----------
> From: John Stone <johns_at_ks.uiuc.edu>
> To: Luis Rosales <ludwig_at_correo.biomedicas.unam.mx>
> Cc: namd-l_at_ks.uiuc.edu, vmd-l_at_ks.uiuc.edu
> Sent: Wed, 20 Apr 2005 16:12:20 -0500
> Subject: namd-l: Re: vmd-l: Interactive Simulations, ligand size limits....
> 
> > Luis,
> >   How large is the molecule you're simulating?  When you say that
> > the simulation "crashed", does that mean that NAMD aborted because
> > you pulled too hard (margin violation, exceeding the speed of light,
> > etc.), or did NAMD actually have a seg fault or floating point exception
> > of some kind?  If you got a "margin violation", it means that you pulled
> > on the structure so hard that you exceeded NAMD's internal limit on
> > atom velocity imposed by the spatial decomposition it uses.  If you
> > pull too hard and impart a tremendous velocity per simulated timestep,
> > you either need to pull more gently, or you need to
> > decrease the gap in timescale between your real-time pulling and the
> > rate of simulated time in the simulation, either by using a smaller
> > molecule or by using fixed atoms, or some combination of both.
> > 
> >   John Stone
> >   vmd_at_ks.uiuc.edu
> > 
> > -- 
> > NIH Resource for Macromolecular Modeling and Bioinformatics
> > Beckman Institute for Advanced Science and Technology
> > University of Illinois, 405 N. Mathews Ave, Urbana, IL 61801
> > Email: johns_at_ks.uiuc.edu                 Phone: 217-244-3349         
> >        WWW: http://www.ks.uiuc.edu/~johns/      Fax: 217-244-6078
> ------- End of Original Message -------