Re: namd2.6b1 has bad scaling

From: Brian Bennion (brian_at_youkai.llnl.gov)
Date: Wed Jan 25 2006 - 14:03:28 CST

Hello,

Are you sure the charmrun command is needed here?

Most MPI systems have a native command for launching MPI jobs, such as
mpirun or srun.

Another curious detail is the nodelist creation. Doesn't the scheduler
running on the cluster do this for you?
For example, I issue the following command

srun -N100 -n 200 namd2 namd.conf > namd.out

and the scheduler finds 100 nodes with 2 CPUs each and places my job's tasks on
them.
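
On your cluster the native launcher is presumably the mpirun that comes with
/opt/mpich/gnu, and it can read the machinefile PBS already writes, so no
hand-built nodelist should be necessary. A rough, untested sketch using the
paths from your script below:

mpirun -np 2 -machinefile $PBS_NODEFILE \
    /ibrix/apps/biophysics/namd/Linux-amd64-MPI/namd2 min_sys.namd > min_sys.log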

How are the nodes connected? Ethernet, Giganet, Myrinet, Quadrics
switches...?

Regards
brian

On Wed, 25 Jan 2006, Morad Alawneh wrote:

> Dear everyone,
>
> I have recently compiled namd2.6b1 to run in parallel. I have a system of about 92,000 atoms, and when I submit a job using more than 2 processors
> I get bad scaling: the performance is lower than with 2 processors. Could you help me figure out what the problem is?
>
>
> Cluster system:
> It is a Dell 1855 Linux cluster consisting of 630 nodes. Each node is equipped with two Intel Xeon EM64T processors (3.6GHz) and 4 GB of memory.  Each node runs
> its own Linux kernel (Red Hat Enterprise Linux 3). The Intel version of MPICH provides the message-passing interface (MPI) across all the nodes.
>
> Installation instructions:
> charm++ installation:
>
> Edit namd/Make.charm
> (set CHARMBASE to the full path to charm-5.9)
>
> Edit the file charm/src/arch/mpi-linux-amd64/conv-mach.sh so that it uses the
> right paths to mpicc and mpiCC (replacing mpiCC with mpicxx) from the mpich/gnu installation.
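> In other words, the two edits amount to setting roughly the following (the
> mpich/gnu bin directory is shown as an example location):
>
>   # namd/Make.charm
>   CHARMBASE = /full/path/to/charm-5.9
>
>   # charm/src/arch/mpi-linux-amd64/conv-mach.sh
>   CMK_CC='/opt/mpich/gnu/bin/mpicc '
>   CMK_CXX='/opt/mpich/gnu/bin/mpicxx '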
>
> ./build charm++ mpi-linux-amd64 gcc --libdir=/opt/mpich/gnu/lib \
>     --incdir=/opt/mpich/gnu/include --no-build-shared \
>     -g -O -verbose -tracemode summary -memory default -DCMK_OPTIMIZE=1
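> The build ends up in the directory mpi-linux-amd64-gcc (the same name used
> for CHARMARCH below); a quick check that it finished is, from the top of the
> charm tree:
>
>   ls mpi-linux-amd64-gcc/bin/charmc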
>
>
> namd Installation:
>
> Edit various configuration files:
> namd/arch/Linux-amd64-MPI.arch (fix CHARMARCH to be mpi-linux-amd64-gcc)
> namd/arch/Linux-amd64.fftw  (fix the paths to the files, and delete every
>                              occurrence of -I$(HOME)/fftw/include -L$(HOME)/fftw/lib)
> namd/arch/Linux-amd64.tcl   (fix the paths to the files, and delete every
>                              occurrence of -I$(HOME)/tcl/include -L$(HOME)/tcl/lib)
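> For example, the .fftw file ends up looking roughly like this (the FFTW
> location is a placeholder):
>
>   FFTDIR=/path/to/fftw
>   FFTINCL=-I$(FFTDIR)/include
>   FFTLIB=-L$(FFTDIR)/lib -lsrfftw -lsfftw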
>
> Set up build directory and compile:
>   ./config tcl fftw Linux-amd64-MPI
>   cd Linux-amd64-MPI
>   make
>
>
> Job submission script:
>
> #!/bin/bash
>
> #PBS -l nodes=1:ppn=2,walltime=336:00:00
> #PBS -N Huge_gA
> #PBS -m abe
> #PBS -M alawneh_at_chem.byu.edu
>
> # The maximum memory allocation is 2.86 GB;
> # P4_GLOBMEMSIZE below is set in bytes (128*1024*1024 = 128 MB)
> let Memory=128*1024*1024
> export P4_GLOBMEMSIZE=$Memory
>
> #cd into the directory where I typed qsub
> cd $PBS_O_WORKDIR
>
> TMPDIR=/ibrix/scr/$PBS_JOBID
> PROG=/ibrix/apps/biophysics/namd/Linux-amd64-MPI/namd2
> ARGS="+strategy USE_GRID"
> IFILE="min_sys.namd"
> OFILE="min_sys.log"
>
> # NP should always be: nodes*ppn from #PBS -l above
> let NP=1*2
>
> # Full path to charmrun
> CHARMRUN=/ibrix/apps/biophysics/namd/Linux-amd64-MPI/charmrun
>
> # Preparing nodelist file for charmrun
> export NODES=`cat $PBS_NODEFILE`
> export NODELIST=nodelist
> echo group main > $NODELIST
> for node in $NODES ; do
>     echo host $node ++shell ssh >> $NODELIST
> done
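>
> # The resulting nodelist file has the form (hostnames are just examples):
> #   group main
> #   host node001 ++shell ssh
> #   host node002 ++shell ssh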
>
> # Execute the job using charmrun
> $CHARMRUN $PROG $ARGS $IFILE +p$NP ++nodelist $NODELIST > $OFILE
>
> exit 0
>
>
> Thanks
>
>
>
>
> --
>
>
>
> Morad Alawneh
>
> Department of Chemistry and Biochemistry
>
> C100 BNSN, BYU
>
> Provo, UT 84602
>
>
>

************************************************
  Brian Bennion, Ph.D.
  Biosciences Directorate
  Lawrence Livermore National Laboratory
  P.O. Box 808, L-448
  7000 East Avenue
  Livermore, CA 94550
  bennion1_at_llnl.gov
  phone: (925) 422-5722
  fax: (925) 424-6605
************************************************
