NAMD memory problems on ASC's SGI Altix machine

From: Sterling Paramore (paramore_at_hec.utah.edu)
Date: Thu Jan 19 2006 - 10:46:33 CST

Hi, I'm having some trouble running NAMD on an SGI Altix machine. I'm
using the precompiled binary from the website to run a 172,000-atom
simulation on 128 processors (I also tried compiling it myself, but that
build had the same problem and was 2x slower). When NAMD starts up, it
reports that it's using 14720 kB of memory. However, after about 130,000
steps, the job crashes and I get the following error from LSF:

TERM_MEMLIMIT: job killed after reaching LSF memory usage limit.
Exited with exit code 143.

Resource usage summary:

    CPU time :1205194.00 sec.
    Max Memory : 115208 MB
    Max Swap : -2097151 MB

    Max Processes : 129
    Max Threads : 129

So the job actually ended up using about 115 GB of memory! Also, when I
use fewer processors, the job crashes earlier than 130,000 steps with a
similar error (e.g., with 70 processors, the job crashes after about
6000 steps). Any ideas?

Thanks,
Sterling
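
For scale, the figures quoted above can be turned into a rough
per-process estimate. This is only a back-of-the-envelope sketch, and it
rests on assumptions the post does not confirm: that the 14720 kB
startup figure is per process, that LSF's "Max Memory" is the sum over
all processes, that 1 MB = 1024 kB, and that memory grew roughly
linearly with step count. The function names are made up for
illustration.

```python
# Figures quoted in the post.
STARTUP_KB = 14720         # memory NAMD reported at startup (assumed per process)
MAX_MEMORY_MB = 115208     # "Max Memory" from the LSF summary (assumed job total)
PROCESSORS = 128
STEPS_AT_CRASH = 130_000   # approximate step count when the job was killed


def per_process_mb(total_mb, nproc):
    """Average memory per process, assuming the total is split evenly."""
    return total_mb / nproc


def growth_kb_per_step(total_mb, nproc, startup_kb, steps):
    """Average per-process growth in kB per step, assuming linear growth."""
    final_kb = per_process_mb(total_mb, nproc) * 1024  # MB -> kB
    return (final_kb - startup_kb) / steps


if __name__ == "__main__":
    avg_mb = per_process_mb(MAX_MEMORY_MB, PROCESSORS)
    rate = growth_kb_per_step(MAX_MEMORY_MB, PROCESSORS, STARTUP_KB, STEPS_AT_CRASH)
    print(f"average per process at crash: {avg_mb:.1f} MB")
    print(f"average growth per process:   {rate:.2f} kB/step")
```

Under those assumptions each process ends up at roughly 900 MB, far
above the startup figure, which is consistent with memory growing
steadily during the run rather than with a large one-time allocation.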
