Re: namd scale-up

From: Revthi Sanker (revthi.sanker1990_at_gmail.com)
Date: Sat Sep 14 2013 - 00:26:51 CDT

Dear Sir,
These are the benchmark details that you requested:

 # of nodes   Real (wall-clock) time for 2 ns
 ---------------------------------------------
      4       15 hrs
      5       13 hrs
      6       11 hrs
      7        9 hrs 33 mins
      8        9 hrs  5 mins
      9        8 hrs 49 mins
     16        7 hrs 23 mins

At the maximum, I can get 6 ns/day if I use all the nodes and all
processors (our cluster's limit is 16 nodes x 16 processors = 256 cores). Is
that the maximum possible for a system of about 300,000 atoms, or can it be
improved?
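
For comparison, here is a minimal sketch (not part of the original mail) that turns
the wall-clock times from the table above into ns/day and parallel efficiency
relative to the 4-node run; only the node counts and times are taken from the table,
the rest is illustrative:

    # Convert the reported wall-clock times for 2 ns into ns/day and
    # parallel efficiency relative to the 4-node run.
    runs = {            # nodes -> wall-clock hours for 2 ns (from the table above)
        4: 15.0,
        5: 13.0,
        6: 11.0,
        7: 9 + 33 / 60,
        8: 9 + 5 / 60,
        9: 8 + 49 / 60,
        16: 7 + 23 / 60,
    }

    base_nodes, base_hours = 4, runs[4]

    for nodes, hours in sorted(runs.items()):
        ns_per_day = 2.0 / hours * 24.0            # each run simulates 2 ns
        speedup = base_hours / hours               # relative to the 4-node run
        efficiency = speedup * base_nodes / nodes  # 1.0 would be ideal scaling
        print(f"{nodes:2d} nodes: {ns_per_day:4.1f} ns/day, "
              f"speedup {speedup:4.2f}, efficiency {efficiency:4.2f}")

By this measure the 16-node run reaches only about half of the ideal speedup over
4 nodes, which is exactly what the question above is about.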

Thank you in advance for your time.

Revathi.S
M.S. Research Scholar
Indian Institute Of Technology, Madras
India
_________________________________

On Fri, Sep 6, 2013 at 12:39 PM, Norman Geist <
norman.geist_at_uni-greifswald.de> wrote:

> Hi again,
>
> From your output of "/proc/cpuinfo" I saw that all 16 cores on the
> machine are real physical cores, so there is no need to worry about scaling
> issues from virtual cores here. So far, so good. Now you need to run
> benchmarks from one node up to 8 or more nodes. This simply means running the
> same simulation on various numbers of nodes for only a few steps and noting
> down the reported "Benchmark Time". Afterwards post the results here and we
> can tell you whether your scaling is efficient, and therefore whether there
> is more to get out of it.
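
As an illustration of the procedure described above, a small sketch (a hypothetical
helper, not from this thread) that collects the "Benchmark time:" lines from a set
of NAMD log files; the exact wording of that line can differ between NAMD builds,
so the regular expression is an assumption:

    # Scan NAMD log files given on the command line and print the
    # benchmark figures NAMD reports early in each run.
    import re
    import sys

    PATTERN = re.compile(
        r"Benchmark time:\s+(\d+)\s+CPUs\s+([\d.eE+-]+)\s+s/step"
        r"\s+([\d.eE+-]+)\s+days/ns",
        re.IGNORECASE,
    )

    for logfile in sys.argv[1:]:        # e.g. python collect_bench.py run_*.log
        with open(logfile) as fh:
            for line in fh:
                m = PATTERN.search(line)
                if m:
                    cpus, s_per_step, days_per_ns = m.groups()
                    print(f"{logfile}: {cpus} CPUs, {s_per_step} s/step, "
                          f"{float(days_per_ns):.3f} days/ns")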
>
> Norman Geist.
>
>
> *From:* Revthi Sanker [mailto:revthi.sanker1990_at_gmail.com]
> *Sent:* Friday, September 6, 2013 08:26
> *To:* Norman Geist
> *Cc:* Namd Mailing List
> *Subject:* Re: namd-l: namd scale-up
>
> Dear Sir,
>
> I am herewith attaching the details which I obtained by logging into one
> of the nodes in my cluster.
>
> I would also like to bring to your notice that when the NAMD run has
> finished, the *test.err* file displays:
>
> --------------------------------------------------------------------------
> WARNING: It appears that your OpenFabrics subsystem is configured to only
> allow registering part of your physical memory. This can cause MPI jobs to
> run with erratic performance, hang, and/or crash.
>
> This may be caused by your OpenFabrics vendor limiting the amount of
> physical memory that can be registered. You should investigate the
> relevant Linux kernel module parameters that control how much physical
> memory can be registered, and increase them to allow registering all
> physical memory on your machine.
>
> See this Open MPI FAQ item for more information on these Linux kernel
> module parameters:
>
> http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages
>
> Local host: a3n83
> Registerable memory: 32768 MiB
> Total memory: 65511 MiB
>
> Your MPI job will continue, but may be behave poorly and/or hang.
> --------------------------------------------------------------------------
> [a3n83:20048] 127 more processes have sent help message
> help-mpi-btl-openib.txt / reg mem limit low
> [a3n83:20048] Set MCA parameter "orte_base_help_aggregate" to 0 to see all
> help / error messages
>
> I am a beginner to simulations and I am unable to interpret the error
> message. I thought this could be relevant.
>
> Thank you so much for your time.
>
> *PFA: /proc/cpuinfo*
>
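
Regarding the registered-memory warning quoted above: the linked Open MPI FAQ entry
describes raising the mlx4_core module parameters log_num_mtt and log_mtts_per_seg
so that the registerable memory covers at least twice the physical RAM. A rough
sketch of that sizing rule (my reading of the FAQ, not advice from this thread; the
default log_mtts_per_seg differs between driver versions):

    # Registerable memory, per the FAQ:
    #   max_reg_mem = (2**log_num_mtt) * (2**log_mtts_per_seg) * page_size
    # Pick the smallest log_num_mtt that covers at least twice the physical RAM.
    import math

    def required_log_num_mtt(total_mem_bytes, log_mtts_per_seg=3, page_size=4096):
        target = 2 * total_mem_bytes                  # FAQ suggests >= 2x physical RAM
        per_mtt = (2 ** log_mtts_per_seg) * page_size
        return math.ceil(math.log2(target / per_mtt))

    total_mem = 65511 * 1024 * 1024    # "Total memory: 65511 MiB" from the warning above
    print("suggested log_num_mtt =", required_log_num_mtt(total_mem))

Whether such a change is appropriate is for the cluster administrators to decide,
since module parameters like these are typically set system-wide and require
reloading the InfiniBand driver.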
