Re: NAMD2.7 on BluegeneL hang at "LDB: Central LB being created..."

From: Dong Luo (us917_at_yahoo.com)
Date: Fri Mar 04 2011 - 08:08:38 CST

I didn't say it clearly. I'm using the CVS version of Charm++, but I modified
the configure file to skip the MPI test; otherwise it refuses to compile.

But now I've run into another problem with the freshly compiled namd2. With
virtual node mode enabled, the simulation fails with "FATAL ERROR: Memory
allocation failed on processor 0." at about step 121000 (repeatable). The
simulation system contains only 50905 atoms.

Disabling virtual node mode solves the problem but slows the calculation from
"256 CPUs 0.0204817 s/step 0.237057 days/ns" to
"128 CPUs 0.0299266 s/step 0.346373 days/ns".

Each physical CPU has 2 nodes on it. That's why 128 CPUs can be counted as 256
when in virtual node mode.

Dong

________________________________
From: Chris Harrison <charris5_at_gmail.com>
To: Dong Luo <us917_at_yahoo.com>
Cc: akohlmey_at_gmail.com; namd-l_at_ks.uiuc.edu
Sent: Thu, March 3, 2011 9:08:20 PM
Subject: Re: namd-l: NAMD2.7 on BluegeneL hang at "LDB: Central LB being created..."

Are you really using Charm++ 2.2?! Is there a reason? This may work for you,
but you should really upgrade to Charm++ 6.2.1 or later when possible.
Otherwise you're missing improvements to performance from the more recent
Charm++ versions.

Best,
Chris

--
Chris Harrison, Ph.D.
Theoretical and Computational Biophysics Group
NIH Resource for Macromolecular Modeling and Bioinformatics
Beckman Institute for Advanced Science and Technology
University of Illinois, 405 N. Mathews Ave., Urbana, IL 61801

char_at_ks.uiuc.edu                          Voice: 217-244-1733
http://www.ks.uiuc.edu/~char              Fax:  217-244-6078

Dong Luo <us917_at_yahoo.com> writes:
> Date: Thu, 3 Mar 2011 17:59:10 -0800 (PST)
> From: Dong Luo <us917_at_yahoo.com>
> To: Chris Harrison <charris5_at_gmail.com>, akohlmey_at_gmail.com
> Cc: namd-l_at_ks.uiuc.edu
> Subject: Re: namd-l: NAMD2.7 on BluegeneL hang at "LDB: Central LB being
>  created..."
> X-Mailer: YahooMailRC/559 YahooMailWebService/0.8.109.292656
>
> Chris, the CVS version of namd/charm++ works. The only catch is that I have
> to comment out the MPI check in the configure file of charm++ because it
> fails on Bluegene/L. It is not checked in charm++ 2.2.
>
> Axel, namd/charm++ are cross-compiled on Bluegene/L because the login host
> uses a different OS from the cluster nodes. I did not figure out a way to
> test charm++.
>
> Dong
>
> ________________________________
> From: Chris Harrison <charris5_at_gmail.com>
> To: Dong Luo <us917_at_yahoo.com>
> Cc: namd-l_at_ks.uiuc.edu
> Sent: Thu, March 3, 2011 1:41:50 AM
> Subject: Re: namd-l: NAMD2.7 on BluegeneL hang at "LDB: Central LB being
> created..."
>
> We've made recent improvements to startup and load-balancing. Can you
> try the CVS version or one of the nightly builds of namd, with the most
> recent git archive or nightly build of charm++?
>
> Best,
> Chris
>
> --
> Chris Harrison, Ph.D.
> Theoretical and Computational Biophysics Group
> NIH Resource for Macromolecular Modeling and Bioinformatics
> Beckman Institute for Advanced Science and Technology
> University of Illinois, 405 N. Mathews Ave., Urbana, IL 61801
>
> char_at_ks.uiuc.edu                          Voice: 217-244-1733
> http://www.ks.uiuc.edu/~char              Fax:  217-244-6078
>
> Dong Luo <us917_at_yahoo.com> writes:
> > Date: Wed, 2 Mar 2011 19:42:57 -0800 (PST)
> > From: Dong Luo <us917_at_yahoo.com>
> > To: namd-l_at_ks.uiuc.edu
> > Subject: namd-l: NAMD2.7 on BluegeneL hang at "LDB: Central LB being
> >  created..."
> > X-Mailer: YahooMailRC/555 YahooMailWebService/0.8.109.292656
> >
> > Hi,
> >
> > I'm trying to use colvars in NAMD2.7 for distance restraints. There is no
> > precompiled version for BluegeneL in the download section.
> > I downloaded the source code and compiled it following the instructions
> > at this link:
> > http://bluegene.bnl.gov/comp/buildnamd.html
> >
> > However, simulations (with or without colvars) using this namd2 2.7
> > build always hang after Startup phase 5, as shown in the log:
> > "
> > Info: REMOVING COM VELOCITY 0.0209799 0.0192793 0.000362722
> > Info: LARGEST PATCH (156) HAS 345 ATOMS
> > Info: Startup phase 3 took 0.246489 s, 17.3047 MB of memory in use
> > Info: PME using 40 and 32 processors for FFT and reciprocal sum.
> > Info: PME GRID LOCATIONS: 7 15 23 27 31 39 47 55 59 63 ...
> > Info: PME TRANS LOCATIONS: 3 11 19 29 35 43 51 61 67 75 ...
> > Info: Startup phase 4 took 0.00254185 s, 17.3047 MB of memory in use
> > Info: Startup phase 5 took 0.0261579 s, 17.3047 MB of memory in use
> > LDB: Central LB being created...
> > "
> > namd2 2.6 runs normally, but lacks the colvars function, I assume.
> >
> > Any directions?
> >
> > Thank you.
> >
> > Dong
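
The two timing figures quoted in the most recent message can be cross-checked with a little arithmetic. The sketch below is only an illustration: the function and variable names are hypothetical, and the 1 fs timestep is an inference from the reported numbers, not something stated in the thread. It assumes days/ns is simply s/step multiplied by the number of steps per nanosecond and divided by 86400 seconds per day.

```python
SECONDS_PER_DAY = 86_400
FS_PER_NS = 1_000_000


def days_per_ns(s_per_step, timestep_fs):
    """Wall-clock days needed per simulated nanosecond."""
    steps_per_ns = FS_PER_NS / timestep_fs
    return s_per_step * steps_per_ns / SECONDS_PER_DAY


def implied_timestep_fs(s_per_step, reported_days_per_ns):
    """Timestep (fs) that makes s/step and days/ns agree with each other."""
    steps_per_ns = reported_days_per_ns * SECONDS_PER_DAY / s_per_step
    return FS_PER_NS / steps_per_ns


# Figures quoted in the message: (process count, s/step, days/ns).
virtual_node = (256, 0.0204817, 0.237057)
non_virtual = (128, 0.0299266, 0.346373)

# Assumption: both runs use the same timestep, inferred from the first run.
timestep_fs = implied_timestep_fs(virtual_node[1], virtual_node[2])

# Speedup gained by doubling the process count, and the resulting efficiency.
speedup = non_virtual[1] / virtual_node[1]
efficiency = speedup / (virtual_node[0] / non_virtual[0])

if __name__ == "__main__":
    print(f"implied timestep: {timestep_fs:.3f} fs")
    print(f"non-virtual days/ns: {days_per_ns(non_virtual[1], timestep_fs):.6f}")
    print(f"speedup: {speedup:.3f}, parallel efficiency: {efficiency:.1%}")
```

Under these assumptions the figures are consistent with a 1 fs timestep, and virtual node mode gives a speedup of about 1.46 for twice the process count, or roughly 73% parallel efficiency.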

This archive was generated by hypermail 2.1.6 : Mon Dec 31 2012 - 23:19:53 CST