Re: NamdMemoryReduction - genCompressedPsf

From: JC Gumbart (gumbart_at_ks.uiuc.edu)
Date: Wed Apr 03 2013 - 22:50:22 CDT

I believe this is the same as a known problem that was fixed by Chao Mei a few months back. Try downloading one of the nightly builds and using it to compress the PSF.

On Apr 3, 2013, at 2:00 AM, Sridhar Kumar Kannam wrote:

> Hi Norman,
>
> I just found that compression works fine for a protein+water system with millions of atoms.
> I run into the problem when SiN (generated using the Inorganic Builder) or graphene (generated using the Nanotube Builder) is part of the system.
>
> Thank you very much.
>
>
> On Wed, Apr 3, 2013 at 4:05 PM, Sridhar Kumar Kannam <srisriphy_at_gmail.com> wrote:
> Hi Norman,
>
> Thanks for the suggestions. I am trying all possible ways.
> Running on 1 core didn't help either.
> Yes, I am using the queuing system. We don't have any restrictions on memory; each job can use the full 16GB.
>
> I don't think it's a bug in the psf/pdb file. I created an entirely different system and tried to compress its psf file, but still got the same errors.
> The error could be due to the configuration of the machine, although I tried on two different machines and got the same error.
>
>
>
> On Fri, Mar 22, 2013 at 6:28 PM, Norman Geist <norman.geist_at_uni-greifswald.de> wrote:
> Hi again,
>
>
>
> First of all, to create the compressed psf, you could also try running it on only 1 core. I think there’s not much to parallelize in that case anyway, and the RAM of one node should be sufficient.
>
>
>
> If that doesn’t help, it may be worth finding the part of the code that prints the message and adding a print of the size it tries to allocate. Maybe we can see something there. Are you using a queuing system? Could there be a restriction on the maximum memory a job can use? Also check the output of “ulimit -a” on the nodes to see whether you are allowed to use enough of the RAM. I think if this were likely to be a bug, the developers would already have chimed in. So it is either something odd in the psf or pdb (maybe try recreating them) or the configuration of your machine, IMHO.
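
The “ulimit -a” check can also be made from inside the batch job itself, which is where any limits imposed by the queuing system would actually apply. The following is a minimal sketch using Python's standard `resource` module (Unix only). The function name `memory_limits` is made up for illustration and is not part of NAMD.

```python
import resource

def memory_limits():
    """Return the soft and hard memory-related limits of the current process."""
    # Note: getrlimit reports bytes, while the shell's ulimit -v/-d/-s report kilobytes.
    resources = {
        "address space (ulimit -v)": resource.RLIMIT_AS,
        "data segment (ulimit -d)": resource.RLIMIT_DATA,
        "stack size (ulimit -s)": resource.RLIMIT_STACK,
    }
    report = {}
    for label, res in resources.items():
        soft, hard = resource.getrlimit(res)
        # RLIM_INFINITY means no limit is set for this resource.
        report[label] = tuple(
            "unlimited" if value == resource.RLIM_INFINITY else value
            for value in (soft, hard)
        )
    return report

if __name__ == "__main__":
    for label, (soft, hard) in memory_limits().items():
        print(f"{label}: soft={soft} hard={hard}")
```

Running this as the first step of the job script would show whether the job sees a smaller address-space limit than the node's physical RAM.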
>
>
>
> Norman Geist.
>
>
>
> From: Sridhar Kumar Kannam [mailto:srisriphy_at_gmail.com]
> Sent: Friday, 22 March 2013 05:31
> To: Norman Geist
> Cc: Namd Mailing List
> Subject: Re: namd-l: NamdMemoryReduction - genCompressedPsf
>
>
>
> Hi Norman,
>
>
>
> I actually ran the compression on 1 node too, but I sent you the output from the 32-node run.
>
> I have attached the output from the 1-node run here.
>
>
>
> Each node has 16GB of RAM and can run 64 tasks at a time. If we request 16 tasks per node in the script, each task gets 1GB of memory.
>
>
>
> I have run the script with just one node and one task per node, which means the task has the full 16GB of memory. That is more than enough for a system with a few million atoms.
>
>
>
> But I got the same memory error.
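
The per-task memory figures quoted above follow from dividing the node's RAM evenly among its tasks. A small sketch of that arithmetic, assuming an even split (a scheduler may not enforce one); the helper name `memory_per_task_gb` is hypothetical:

```python
def memory_per_task_gb(node_ram_gb: float, tasks_per_node: int) -> float:
    """Memory available to each task if node RAM is shared evenly."""
    if tasks_per_node < 1:
        raise ValueError("tasks_per_node must be at least 1")
    return node_ram_gb / tasks_per_node

print(memory_per_task_gb(16, 16))  # 1.0  (16 tasks per node)
print(memory_per_task_gb(16, 1))   # 16.0 (one task per node)
print(memory_per_task_gb(16, 64))  # 0.25 (node fully loaded with 64 tasks)
```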
>
>
>
>
>
> On Tue, Mar 19, 2013 at 6:50 PM, Norman Geist <norman.geist_at_uni-greifswald.de> wrote:
>
> Hi Sridhar,
>
>
>
> I think I know what the problem is. It seems you are treating a cluster as “one machine”, which it is not. The output you posted says you are using 32 physical nodes, which points to a cluster, maybe an IBM supercomputer. And here’s the problem: some parts of NAMD’s initialization phase run on only one processor, usually processor 0, and this processor needs more RAM to read your huge files. As currently implemented in NAMD, this single processor cannot use the distributed memory of the cluster.
>
>
>
> You should try to find a node that has enough local memory available.
>
>
>
> Another, but unlikely, possibility is the malloc() function. If it has the same weakness in C++ as it had in Fortran, it can only allocate 2GB at once, because the size argument is a signed int. But that’s really unlikely; IMHO it’s the local memory.
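
To illustrate the 2GB concern: standard C `malloc` takes a `size_t`, which is unsigned, so a limit of this kind would arise only where calling code first stores the requested size in a signed 32-bit integer. The sketch below uses Python's `ctypes` to show what such a variable would hold for large requests. It is not taken from the NAMD source, and the helper `as_int32` is hypothetical.

```python
import ctypes

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit int can hold

def as_int32(n_bytes: int) -> int:
    """Return n_bytes as it would appear after being stored in a signed 32-bit int."""
    # ctypes integer types do no overflow checking, so the value wraps silently.
    return ctypes.c_int32(n_bytes).value

print(as_int32(INT32_MAX))     # 2147483647  (just under 2 GiB, still representable)
print(as_int32(2 * 1024**3))   # -2147483648 (2 GiB wraps to a negative size)
print(as_int32(3 * 1024**3))   # -1073741824 (3 GiB also wraps)
```

A negative size passed on to an allocator would typically be interpreted as an enormous unsigned request and fail, no matter how much RAM the node has.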
>
>
>
> How big is this psf file?
>
>
>
> Best wishes
>
>
>
> Norman Geist.
>
>
>
> From: Sridhar Kumar Kannam [mailto:srisriphy_at_gmail.com]
> Sent: Tuesday, 19 March 2013 01:04
> To: Norman Geist
> Subject: Re: namd-l: NamdMemoryReduction - genCompressedPsf
>
>
>
> Hi Norman,
>
>
>
> Sorry for the late response, I was on vacation.
>
>
>
> I have attached the config and log files.
>
>
>
> Best,
>
>
>
>
>
> On Wed, Mar 13, 2013 at 5:54 PM, Norman Geist <norman.geist_at_uni-greifswald.de> wrote:
>
> OK, sorry, I just meant the NAMD config file, not the pdb and parameter files, and also the output file, which can’t be big since the run crashes.
>
>
>
> Norman Geist.
>
>
>
> From: Sridhar Kumar Kannam [mailto:srisriphy_at_gmail.com]
> Sent: Wednesday, 13 March 2013 06:09
> To: Norman Geist
> Subject: Re: namd-l: NamdMemoryReduction - genCompressedPsf
>
>
>
> Hi Norman,
>
>
>
> Thank you very much for your help ....
>
> The files are too big to send over an email...
>
> I will send the Tcl script to build the system, along with the other necessary files, to your email address only, in a day or two.
>
> If we find the solution, then we can post back to NAMD.
>
>
>
> Best,
>
>
>
>
>
> On Tue, Mar 12, 2013 at 5:58 PM, Norman Geist <norman.geist_at_uni-greifswald.de> wrote:
>
> Can we see the input/output for/of the compression attempt?
>
>
>
> Norman Geist.
>
>
>
> From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] On Behalf Of Sridhar Kumar Kannam
> Sent: Monday, 11 March 2013 23:18
> To: Norman Geist
> Cc: Namd Mailing List
> Subject: Re: namd-l: NamdMemoryReduction - genCompressedPsf
>
>
>
> Hi Norman and NAMD community
>
>
>
> The node has 16 cores with 1024 GB memory.
>
>
>
> For compression I am using the normal version, not the memory-optimized version.
>
>
>
> I still have the same problem....
>
>
>
> Any suggestions please ..
>
>
>
> On Mon, Mar 11, 2013 at 5:56 PM, Norman Geist <norman.geist_at_uni-greifswald.de> wrote:
>
> Hi Sridhar,
>
>
>
> I didn’t know, shame on me, that 1TB nodes are already in use out there. By the way, how many cores does your node have?
>
> Please make sure that you did not use the memory-optimized build of NAMD to compress the PSF. This feature is already implemented in the normal NAMD versions. I guess the memopt NAMD only reads compressed psf files, so maybe it fails because of that.
>
>
>
> Norman Geist.
>
>
>
> From: Sridhar Kumar Kannam [mailto:srisriphy_at_gmail.com]
> Sent: Saturday, 9 March 2013 06:39
> To: Norman Geist
> Subject: Re: namd-l: NamdMemoryReduction - genCompressedPsf
>
>
>
> Hi Norman,
>
>
>
> Yes, I tried on a node with 1TB of RAM.
>
>
>
> Can anyone please suggest a workaround?
>
>
>
> Thanks.
>
>
>
> On Fri, Mar 8, 2013 at 6:04 PM, Norman Geist <norman.geist_at_uni-greifswald.de> wrote:
>
> 1TB memory? Could it be that you confuse memory (RAM) with disk space (HD)?
>
>
>
> Norman Geist.
>
>
>
> From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] On Behalf Of Sridhar Kumar Kannam
> Sent: Friday, 8 March 2013 04:39
> To: namd-l_at_ks.uiuc.edu
> Subject: namd-l: NamdMemoryReduction - genCompressedPsf
>
>
>
> Hi All,
>
>
>
> I have a system with 3 million atoms. To use the memory-optimised version of NAMD, I first need to compress the psf file.
> I followed the instructions on this web page: http://www.ks.uiuc.edu/Research/namd/wiki/?NamdMemoryReduction
> I am running the first step, genCompressedPsf, on a node with 1 TB of memory, but it still gives the error: {snd:8,rcv:0} Reason: FATAL ERROR: Memory allocation failed on processor 0.
>
> Any suggestions, please?
>
>
>
> --
> Cheers !!!
> Sridhar Kumar Kannam :)

This archive was generated by hypermail 2.1.6 : Wed Dec 31 2014 - 23:21:05 CST