Re: NamdMemoryReduction - genCompressedPsf

From: Norman Geist (norman.geist_at_uni-greifswald.de)
Date: Fri Mar 22 2013 - 02:28:53 CDT

Hi again,

 

First of all, to create the compressed psf, you could also try running
it on only 1 core. I think there’s not much to parallelize in that
case anyway, and the RAM of one node should be sufficient.

 

If that doesn’t help, it may be worth finding the part of the code that
prints the message and adding a print of the size it tries to allocate.
Maybe we can see something there. Are you using a queuing system? There
could be a restriction on the maximum memory a job can use. Also check the
output of “ulimit -a” on the nodes to see whether you are allowed to use
enough of the RAM. I think if this were likely to be a bug, the
developers would have already chimed in. So IMHO it is either something odd
in the psf or pdb (maybe try to recreate them) or the configuration of your
machine.
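
As a rough sketch of the limit check suggested above, assuming a Linux node with Python available: the script below reads the same per-process limits that “ulimit -a” reports, so it can be run from inside a job script. The helper names (memory_limits, format_limit) are made up for illustration and are not part of namd or of any queuing system.

```python
import resource

# Limits that commonly cap how much memory a single process may allocate.
LIMIT_NAMES = ("RLIMIT_AS", "RLIMIT_DATA", "RLIMIT_RSS", "RLIMIT_STACK")


def format_limit(value):
    """Render a limit given in bytes, or 'unlimited' for RLIM_INFINITY."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value / 2**30:.2f} GiB"


def memory_limits():
    """Return {name: (soft, hard)} for the limits this platform defines."""
    limits = {}
    for name in LIMIT_NAMES:
        if hasattr(resource, name):
            limits[name] = resource.getrlimit(getattr(resource, name))
    return limits


if __name__ == "__main__":
    for name, (soft, hard) in memory_limits().items():
        print(f"{name}: soft={format_limit(soft)} hard={format_limit(hard)}")
```

If the soft limit for RLIMIT_AS or RLIMIT_DATA is well below the physical RAM of the node, that alone could explain an allocation failure.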

 

Norman Geist.

 

From: Sridhar Kumar Kannam [mailto:srisriphy_at_gmail.com]
Sent: Friday, 22 March 2013 05:31
To: Norman Geist
Cc: Namd Mailing List
Subject: Re: namd-l: NamdMemoryReduction - genCompressedPsf

 

Hi Norman,

 

I actually ran the compression on 1 node too, but I sent you the output
from the 32-node run.

Here I have attached output with 1 node.

 

On each node we have 16 GB of RAM, and each node can run 64 tasks at a time.
If we request 16 tasks per node in the script, each task gets 1 GB of memory.

 

I have run the script with just one node and one task per node, which means
I have a total of 16G memory, which is more than enough for a system with a
few million atoms.
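
The arithmetic above can be written out as a small sketch. It assumes, as the message does, that the node's memory is split evenly among the tasks placed on it; the function name memory_per_task_gb is hypothetical.

```python
def memory_per_task_gb(node_memory_gb, tasks_per_node):
    """Memory share of one task, assuming an even split across tasks on a node."""
    if tasks_per_node < 1:
        raise ValueError("tasks_per_node must be at least 1")
    return node_memory_gb / tasks_per_node


if __name__ == "__main__":
    # 16 GB node, 16 tasks per node: 1 GB per task.
    print(memory_per_task_gb(16, 16))
    # 16 GB node, 1 task per node: the single task can use all 16 GB.
    print(memory_per_task_gb(16, 1))
```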

 

But I got the same memory error.

 

 

On Tue, Mar 19, 2013 at 6:50 PM, Norman Geist
<norman.geist_at_uni-greifswald.de> wrote:

Hi Sridhar,

 

I guess I know what the problem is. It seems that you are treating a cluster
as “one machine”, which it is not. The output you posted says you are using
32 physical nodes, which points to a cluster, maybe an IBM supercomputer.
And here’s the problem: some parts of the initialization phase of namd are
done on only one processor, usually processor 0, and this processor needs
more RAM to read your huge files. As namd is currently implemented, this
single processor cannot use the distributed memory of the cluster.

 

You should try to find a node that has enough local memory available.

 

Another, but unlikely, possibility would be the malloc() call. If it has the
same weakness in C++ as it had in Fortran, it can only allocate 2 GB at once
because the size argument is a signed int. But that’s really unlikely; IMHO
it’s the local memory.
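
To illustrate where the 2 GB figure comes from, here is a minimal sketch using ctypes to mimic a signed 32-bit size. Standard C malloc() takes a size_t, which is unsigned, so this would only matter if a size were stored in a 32-bit signed int somewhere along the way. Nothing here is taken from the namd source, and the helper names are hypothetical.

```python
import ctypes

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit int can hold


def fits_in_int32(num_bytes):
    """True if a byte count can be stored in a signed 32-bit int unchanged."""
    return 0 <= num_bytes <= INT32_MAX


def as_int32(num_bytes):
    """Value a byte count takes when truncated to a signed 32-bit int."""
    return ctypes.c_int32(num_bytes).value


if __name__ == "__main__":
    for gib in (1, 2, 3):
        size = gib * 2**30
        print(f"{gib} GiB: fits={fits_in_int32(size)} stored as {as_int32(size)}")
```

A request of 2 GiB or more wraps around to a negative number in this model, which an allocator would reject or misinterpret.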

 

How big is this psf file?

 

Best wishes

 

Norman Geist.

 

From: Sridhar Kumar Kannam [mailto:srisriphy_at_gmail.com]
Sent: Tuesday, 19 March 2013 01:04
To: Norman Geist
Subject: Re: namd-l: NamdMemoryReduction - genCompressedPsf

 

Hi Norman,

 

Sorry for the late response, I was on vacation.

 

I have attached the config and log files.

 

Best,

 

 

On Wed, Mar 13, 2013 at 5:54 PM, Norman Geist
<norman.geist_at_uni-greifswald.de> wrote:

OK, sorry, I just meant the namd config file, not the pdb and parameter
files, plus the output file, which can’t be big since the run crashes.

 

Norman Geist.

 

From: Sridhar Kumar Kannam [mailto:srisriphy_at_gmail.com]
Sent: Wednesday, 13 March 2013 06:09
To: Norman Geist
Subject: Re: namd-l: NamdMemoryReduction - genCompressedPsf

 

Hi Norman,

 

Thank you very much for your help ....

The files are too big to send over an email...

I will send the tcl script to build the system and the other necessary files
to your email address only, in a day or two.

If we find the solution, then we can post back to NAMD.

 

Best,

 

 

On Tue, Mar 12, 2013 at 5:58 PM, Norman Geist
<norman.geist_at_uni-greifswald.de> wrote:

Can we see the input/output for/of the compression attempt?

 

Norman Geist.

 

From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] On Behalf
Of Sridhar Kumar Kannam
Sent: Monday, 11 March 2013 23:18
To: Norman Geist
Cc: Namd Mailing List
Subject: Re: namd-l: NamdMemoryReduction - genCompressedPsf

 

Hi Norman and NAMD community

 

The node has 16 cores with 1024 GB memory.

 

For compression I am using only the normal version, not the memory-optimized
version.

 

I still have the same problem....

 

Any suggestions please ..

 

On Mon, Mar 11, 2013 at 5:56 PM, Norman Geist
<norman.geist_at_uni-greifswald.de> wrote:

Hi Sridhar,

 

I didn’t know, shame on me, that 1 TB nodes are already in use out there. By
the way, how many cores does your node have?

Please make sure that you did not use the memory-optimized build of namd to
compress the PSF. The compression feature is already implemented in the
normal namd versions. I guess the memopt namd only reads compressed psf
files, so maybe it fails because of that.

 

Norman Geist.

 

From: Sridhar Kumar Kannam [mailto:srisriphy_at_gmail.com]
Sent: Saturday, 9 March 2013 06:39
To: Norman Geist
Subject: Re: namd-l: NamdMemoryReduction - genCompressedPsf

 

Hi Norman,

 

Yes, I tried on a node with 1 TB of RAM.

 

Can anyone please suggest a workaround?

 

Thanks.

 

On Fri, Mar 8, 2013 at 6:04 PM, Norman Geist
<norman.geist_at_uni-greifswald.de> wrote:

1TB memory? Could it be that you confuse memory (RAM) with disk space (HD)?

 

Norman Geist.

 

From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] On Behalf
Of Sridhar Kumar Kannam
Sent: Friday, 8 March 2013 04:39
To: namd-l_at_ks.uiuc.edu
Subject: namd-l: NamdMemoryReduction - genCompressedPsf

 

Hi All,

I have a system with 3 million atoms. To use the memory-optimised version
of NAMD, I first need to compress the psf file.
I followed the instructions on this web page:
http://www.ks.uiuc.edu/Research/namd/wiki/?NamdMemoryReduction
I am running the first step, genCompressedPsf, on a node with 1 TB of
memory, but it still gives the error: {snd:8,rcv:0} Reason: FATAL ERROR:
Memory allocation failed on processor 0.

Any suggestions please

-- 
Cheers !!!
Sridhar  Kumar Kannam :)
 

This archive was generated by hypermail 2.1.6 : Tue Dec 31 2013 - 23:23:04 CST