Re: NamdMemoryReduction - genCompressedPsf

From: Sridhar Kumar Kannam (srisriphy_at_gmail.com)
Date: Wed Apr 03 2013 - 01:00:06 CDT

Hi Norman,

I just found that compression works fine for a protein+water system with
millions of atoms.
The problem occurs when the system contains SiN (generated using the
Inorganic Builder) or graphene (generated using the Nanotube Builder).
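
One way to compare a PSF that compresses with one that does not is to list
the counts declared in each section header. The following is only a sketch:
it assumes the usual PSF layout in which each section starts with a line of
the form "<count> !<NAME>" (for example "3000000 !NATOM"), and the helper
names are hypothetical. It does not identify the cause of the error; it only
makes unusual counts easier to spot.

```python
import re

# Matches PSF section header lines such as "    1234 !NATOM"
# or "    5678 !NBOND: bonds" (assumed layout: count, then !NAME).
SECTION_RE = re.compile(r"^\s*(\d+)\s+!(\w+)")


def psf_section_counts(lines):
    """Return {section_name: declared_count} for every header line found."""
    counts = {}
    for line in lines:
        match = SECTION_RE.match(line)
        if match:
            counts[match.group(2)] = int(match.group(1))
    return counts


def read_psf_counts(path):
    """Read a PSF file line by line, so even a large file is not loaded whole."""
    with open(path) as handle:
        return psf_section_counts(handle)
```

Calling read_psf_counts() on both the protein+water PSF and the SiN or
graphene PSF and comparing the two dictionaries shows whether any section
count differs in an unexpected way.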

Thank you very much.

On Wed, Apr 3, 2013 at 4:05 PM, Sridhar Kumar Kannam <srisriphy_at_gmail.com> wrote:

> Hi Norman,
>
> Thanks for the suggestions. I am trying every possible approach.
> Running on 1 core also didn't help.
> Yes, I am using the queuing system, and we don't have any restrictions on
> memory; we can use the full 16 GB for each job.
>
> I don't think it's a bug in the psf/pdb files. I created an entirely
> different system and tried to compress its psf file, but still got the
> same errors.
> The error could be due to the configuration of the machine, although I
> tried on two different machines and got the same error.
>
>
>
> On Fri, Mar 22, 2013 at 6:28 PM, Norman Geist <
> norman.geist_at_uni-greifswald.de> wrote:
>
>> Hi again,
>>
>> First of all, to create the compressed psf, you could also try running it
>> on only 1 core. I think there's not much to be parallelized in that case
>> anyway, and the RAM of one node should be sufficient.
>>
>> If that doesn't help, it may be worth finding the part of the code that
>> prints the message and adding a print of the size it tries to allocate.
>> Maybe we can see something there. Are you using a queuing system? Could
>> there be restrictions on the maximum memory a job can use? Or check the
>> output of "ulimit -a" on the nodes to see whether you are allowed to use
>> enough of the RAM. I think if this were possibly a bug, the developers
>> would have already stepped in. So it is either some weird thing in the psf
>> or pdb (maybe try to recreate them) or the configuration of your machine,
>> IMHO.
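
For reference, the limits reported by the ulimit check can also be read from
inside a job with Python's standard resource module (Unix only). This is a
minimal sketch, not part of NAMD; the function names are hypothetical. Note
that getrlimit() reports bytes, whereas bash's ulimit prints several of these
limits in kilobytes.

```python
import resource


def describe_limit(value):
    """Render one rlimit value, mapping RLIM_INFINITY to 'unlimited'."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value} bytes"


def memory_limits():
    """Return {label: (soft, hard)} for the memory-related limits of this process."""
    names = {
        "address space (ulimit -v)": resource.RLIMIT_AS,
        "data segment (ulimit -d)": resource.RLIMIT_DATA,
        "stack (ulimit -s)": resource.RLIMIT_STACK,
    }
    return {
        label: tuple(describe_limit(v) for v in resource.getrlimit(res))
        for label, res in names.items()
    }


if __name__ == "__main__":
    for label, (soft, hard) in memory_limits().items():
        print(f"{label}: soft={soft}, hard={hard}")
```

Running this through the queuing system, rather than on a login node, shows
the limits that actually apply to a submitted job.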
>>
>> Norman Geist.
>>
>> *From:* Sridhar Kumar Kannam [mailto:srisriphy_at_gmail.com]
>> *Sent:* Friday, 22 March 2013 05:31
>> *To:* Norman Geist
>> *Cc:* Namd Mailing List
>> *Subject:* Re: namd-l: NamdMemoryReduction - genCompressedPsf
>>
>> Hi Norman,
>>
>> I actually ran the compression on 1 node too, but I sent you the output
>> from the run with 32 nodes.
>> Here I have attached the output with 1 node.
>>
>> Each node has 16 GB of RAM and can run 64 tasks at a time. If we request
>> 16 tasks per node in the script, each task gets 1 GB of memory.
>>
>> I have run the script with just one node and one task per node, which
>> means I have a total of 16 GB of memory, which is more than enough for a
>> system with a few million atoms.
>>
>> But I got the same memory error ....
>>
>> On Tue, Mar 19, 2013 at 6:50 PM, Norman Geist <
>> norman.geist_at_uni-greifswald.de> wrote:
>>
>> Hi Sridhar,
>>
>> I guess I know what the problem is. It seems that you mistook a cluster
>> for "one machine", which it is not. The output you posted says you are
>> using 32 physical nodes, which points to a cluster, maybe an IBM
>> supercomputer. But it's not one machine, it's a cluster, and here's the
>> problem: some parts of the initialization phase of namd are done on only
>> one processor, usually processor zero, and this processor needs more RAM
>> to read your huge files. So this single processor cannot use the
>> distributed memory of a cluster in the way it is currently implemented in
>> namd.
>>
>> You should try to find a node that has enough local memory available.
>>
>> Another, but unlikely, option would be the malloc() call. If it has
>> the same weakness in C++ as it had in Fortran, it can only allocate 2 GB at
>> once, as the input parameter is a signed int, but that's really unlikely.
>> IMHO it's the local memory.
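
The 2 GB figure mentioned above corresponds to the largest value a signed
32-bit integer can hold. In standard C and C++, malloc() takes a size_t,
which is unsigned, so such a limit would only apply if a program stored the
requested size in a signed int first; whether NAMD does so anywhere is not
established in this thread. A small sketch of the arithmetic, with
hypothetical names:

```python
import ctypes

# Largest value representable in a signed 32-bit integer (2 GiB minus 1 byte).
INT32_MAX = 2**31 - 1


def fits_in_signed_32bit(nbytes):
    """True if a byte count could be stored in a signed 32-bit int without overflow."""
    return 0 <= nbytes <= INT32_MAX


# Width of size_t on this machine, i.e. the type malloc() actually accepts.
SIZE_T_BITS = ctypes.sizeof(ctypes.c_size_t) * 8

if __name__ == "__main__":
    print("INT32_MAX bytes:", INT32_MAX)
    print("size_t width in bits:", SIZE_T_BITS)
    print("3 GiB fits in signed 32-bit int:", fits_in_signed_32bit(3 * 2**30))
```

On a 64-bit build, size_t is 64 bits wide, so a single malloc() request is
not capped at 2 GB by the argument type itself.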
>>
>> How big is this psf file?
>>
>> Best wishes
>>
>> Norman Geist.
>>
>> *From:* Sridhar Kumar Kannam [mailto:srisriphy_at_gmail.com]
>> *Sent:* Tuesday, 19 March 2013 01:04
>> *To:* Norman Geist
>> *Subject:* Re: namd-l: NamdMemoryReduction - genCompressedPsf
>>
>> Hi Norman,
>>
>> Sorry for the late response, I was on vacation.
>>
>> I have attached the config and log files.
>>
>> Best,
>>
>> On Wed, Mar 13, 2013 at 5:54 PM, Norman Geist <
>> norman.geist_at_uni-greifswald.de> wrote:
>>
>> OK, sorry, I just meant the namd config file, not the pdb and parameter
>> files, and also the output file, which can't be big since the run crashes.
>>
>> Norman Geist.
>>
>> *From:* Sridhar Kumar Kannam [mailto:srisriphy_at_gmail.com]
>> *Sent:* Wednesday, 13 March 2013 06:09
>> *To:* Norman Geist
>> *Subject:* Re: namd-l: NamdMemoryReduction - genCompressedPsf
>>
>> Hi Norman,
>>
>> Thank you very much for your help ....
>> The files are too big to send by email...
>> I will send the tcl script to build the system and the other necessary
>> files to your email address only, in a day or two.
>> If we find the solution, then we can post back to the NAMD list.
>>
>> Best,
>>
>> On Tue, Mar 12, 2013 at 5:58 PM, Norman Geist <
>> norman.geist_at_uni-greifswald.de> wrote:
>>
>> Can we see the input and output of the compression attempt?
>>
>> Norman Geist.
>>
>> *From:* owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] *On
>> Behalf Of* Sridhar Kumar Kannam
>> *Sent:* Monday, 11 March 2013 23:18
>> *To:* Norman Geist
>> *Cc:* Namd Mailing List
>> *Subject:* Re: namd-l: NamdMemoryReduction - genCompressedPsf
>>
>> Hi Norman and NAMD community,
>>
>> The node has 16 cores and 1024 GB of memory.
>>
>> For compression I am using only the normal version, not the
>> memory-optimized version.
>>
>> I still have the same problem....
>>
>> Any suggestions, please?
>>
>> On Mon, Mar 11, 2013 at 5:56 PM, Norman Geist <
>> norman.geist_at_uni-greifswald.de> wrote:
>>
>> Hi Sridhar,
>>
>> I didn't know, shame on me, that 1 TB nodes are already in use out there.
>> By the way, how many cores does your node have?
>> Please make sure that you did not use the memory-optimized build of namd
>> to compress the PSF. This feature is already implemented in the normal
>> namd versions. I guess the memopt namd only reads compressed psf files,
>> so maybe it fails because of that.
>>
>> Norman Geist.
>>
>> *From:* Sridhar Kumar Kannam [mailto:srisriphy_at_gmail.com]
>> *Sent:* Saturday, 9 March 2013 06:39
>> *To:* Norman Geist
>> *Subject:* Re: namd-l: NamdMemoryReduction - genCompressedPsf
>>
>> Hi Norman,
>>
>> Yes, I tried on a node with 1 TB of RAM.
>>
>> Can anyone please suggest a workaround?
>>
>> Thanks.
>>
>> On Fri, Mar 8, 2013 at 6:04 PM, Norman Geist <
>> norman.geist_at_uni-greifswald.de> wrote:
>>
>> 1 TB memory? Could it be that you are confusing memory (RAM) with disk
>> space (HD)?
>>
>> Norman Geist.
>>
>> *From:* owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] *On
>> Behalf Of* Sridhar Kumar Kannam
>> *Sent:* Friday, 8 March 2013 04:39
>> *To:* namd-l_at_ks.uiuc.edu
>> *Subject:* namd-l: NamdMemoryReduction - genCompressedPsf
>>
>> Hi All,
>>
>> I have a system with 3 million atoms. To use the memory-optimised
>> version of NAMD, I first need to compress the psf file.
>> I followed the instructions on this web page -
>> http://www.ks.uiuc.edu/Research/namd/wiki/?NamdMemoryReduction
>> I am running the first step - genCompressedPsf - on a node with 1 TB of
>> memory, but it still gives the error - {snd:8,rcv:0} Reason: FATAL ERROR:
>> Memory allocation failed on processor 0.
>>
>> Any suggestions, please?
>>
>> --
>> Cheers !!!
>> Sridhar Kumar Kannam :)
>>
>
>
>
> --
> Cheers !!!
> Sridhar Kumar Kannam :)
>
>
>
>

-- 
Cheers !!!
Sridhar Kumar Kannam :)

This archive was generated by hypermail 2.1.6 : Wed Dec 31 2014 - 23:21:05 CST