Using Your Warewulf Cluster
This exercise should be done while logged in as a normal user, not
as root. The CentOS installation should already have prompted you to
create one, but you can also create a normal user account with the
command "useradd username" and then set its password with
"passwd username".
Part 1: Run NAMD
NAMD is a parallel molecular dynamics application developed in our
group. It is the main application run on our clusters.
- Copy the files NAMD_2.6b1_Linux-i686.tar.gz (NAMD binary)
and apoa1.tar.gz (sample NAMD simulation)
from the workshop CD and untar them in your home directory with:
tar xzf apoa1.tar.gz
tar xzf NAMD_2.6b1_Linux-i686.tar.gz
- cd NAMD_2.6b1_Linux-i686
- Use a text editor to create the file nodelist containing:
group main
host node0000
host node0001
host node0002
The nodelist file tells NAMD what nodes to run on. When we run
under the queueing system below we'll use a script to create this
file.
- Start NAMD on all three machines with:
./charmrun ++remote-shell /usr/bin/rsh ++nodelist nodelist +p3 ./namd2 ~/apoa1/apoa1.namd
If you have problems, or want to see what's going on in the launch
process, add ++verbose to the charmrun command
line.
- When NAMD reaches the line that says "TIMING 20 ..." kill it with
Control-C and jot down the wallclock s/step number.
- Run NAMD again on two processors (change +p3 above to +p2), and
then on one processor (+p1), for 20 steps each and compare the
performance of the runs. Do three processors run three times as
fast as one? How close to three times? (A quick way to compute the
speedup is sketched at the end of this part.)
Note: rsh is disabled on the master node by default for
security reasons; otherwise we could use it as a fourth
processor. Tachyon (used next) works with all four simply
because it does not depend on the use of rsh for
communication.
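To compute the speedup, divide the wallclock s/step of the slower
run by that of the faster run. For example (the numbers below are
made up; substitute the values you jotted down):
# Speedup of the 3-processor run over the 1-processor run.
echo "scale=2; 1.50 / 0.55" | bc    # prints 2.72, a bit under 3x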
Part 2: Compile and Run Tachyon
Tachyon is a parallel ray tracer developed by John Stone for his
master's thesis. It is an example of a typical MPI application.
- Copy the file tachyon-0.97.tar.gz (Tachyon source and examples)
from the workshop CD and untar it in your home directory with:
tar xzf tachyon-0.97.tar.gz
- cd tachyon/unix
- Use a text editor to open the file Make-arch
- Search for the config options for "linux-lam:"
- Copy this set of options to a new entry.
- Change (in the new entry) linux-lam to linux-mpich
- Change "CC = hcc" to "CC = gcc"
- Change -I$(LAMHOME)/h to -I/opt/mpich/include
- Change -L$(LAMHOME)/lib to -L/opt/mpich/lib
- Change -lmpi to -lmpich
- Save, quit the editor and run "make linux-mpich"
to build tachyon. If this doesn't work you probably missed
one of the edits above, or applied them in the wrong place; a
sketch of what the finished entry should look like appears at the
end of this part. The tachyon binary will end up in
compile/linux-mpich/.
- cd (back to your home directory)
- Use a text editor to create the file machines containing the
following, where hostname is the master node's hostname:
hostname
node0000
node0001
node0002
- Run Tachyon on the master and the three slave machines with:
/opt/mpich/bin/mpirun -v -np 4 -machinefile machines \
tachyon/compile/linux-mpich/tachyon +V tachyon/scenes/balls.dat
- Look at the timing output, which is broken into different
stages of the calculation. Run on one, two, and three processors
(change -np 4 to the number of processors) and calculate
speedups for the different stages as well as the total time; the
loop sketched below may save some typing.
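If make fails, it may help to compare your new entry against this
rough sketch of its shape after the edits. The bracketed
placeholders stand for whatever flag values your copy of Make-arch
already had in the linux-lam entry; any lines not named in the edit
list above should simply be carried over unchanged:
linux-mpich:
        $(MAKE) all \
        "ARCH = linux-mpich" \
        "CC = gcc" \
        "CFLAGS = [flags carried over from linux-lam] -I/opt/mpich/include" \
        "LIBS = [libs carried over from linux-lam] -L/opt/mpich/lib -lmpich"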
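To collect the one-, two-, and three-processor timings without
retyping the command, a loop like this (run from your home
directory) may help. This is only a sketch, and the grep pattern is
a guess at what the timing lines look like, so adjust it to match
the output you actually see:
#!/bin/sh
# Run Tachyon on 1, 2, and 3 processors, saving the verbose output.
for n in 1 2 3; do
    /opt/mpich/bin/mpirun -v -np $n -machinefile machines \
        tachyon/compile/linux-mpich/tachyon +V tachyon/scenes/balls.dat \
        > tachyon.$n.out 2>&1
done
# Pull out the timing lines for a side-by-side comparison.
grep -i time tachyon.?.out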
Part 3: Run Under Grid Engine
Sun Grid Engine (SGE) is a free, open source, general purpose,
cross-platform queueing system. In the genealogy of queueing systems,
it is a descendant of the free DQS package, which was commercialized
by a German company that was recently bought by Sun.
- Run "qstat -f" to see the queues that were automatically
created. There should be one queue for each compute node.
The "states" column at far right is used for error flags.
- Use a text editor to create the file tachyon.job containing:
#$ -cwd
#$ -j y
/opt/mpich/bin/mpirun -v -np $NSLOTS -machinefile $TMPDIR/machines \
tachyon/compile/linux-mpich/tachyon +V tachyon/scenes/balls.dat
Notice the similarity to the command for running Tachyon
manually. SGE will create a temporary working directory
containing a machines file (list of nodes to run on) and set the
NSLOTS and TMPDIR environment variables automatically. The
options preceded by #$ are parsed by SGE as if they were
specified on the command line. -cwd causes the job to execute in
the current working directory. -j y merges standard error and
output into a single file.
- Submit the job to run on three processors under the mpich
parallel environment with the command "qsub -pe mpich 3
tachyon.job". Note that there is no queue for the master
node, so we can't use 4 nodes.
- Use "qstat -f" to check on the job until it is scheduled,
then look for output files named tachyon.job.oX and
tachyon.job.poX, where X is the job number output by qsub. View
these files to see the output.
- Submit several jobs requesting 1, 2, and 3 processors in random
order so that a backlog develops. You can use the same
tachyon.job file for all of them: just hit the up arrow, edit the
processor request if you like, and hit return to submit jobs
quickly (or use the submission loop sketched at the end of this
part). Use qstat to monitor how the jobs are executed. The
default scheduling policy is to take the earliest-submitted job
that can be run, i.e., for which enough processors are available;
the scheduler runs at regular intervals.
- Use a text editor to create the file namd.job containing:
#$ -cwd
#$ -j y
nodefile=$TMPDIR/namd2.nodelist
echo group main > $nodefile
awk '{ for (i=0;i<$2;++i) {print "host",$1} }' $PE_HOSTFILE >> $nodefile
dir=$HOME/NAMD_2.6b1_Linux-i686
$dir/charmrun ++remote-shell /usr/bin/rsh ++nodelist $nodefile +p$NSLOTS $dir/namd2 ~/apoa1/apoa1.namd
Since NAMD does not use MPICH, we need a small shell script
and awk program to translate the SGE hostfile to charmrun format.
The second column of the hostfile is the number of processors
available on each host, which is always one for these clusters, but
this script will handle more. (A quick demonstration of the
translation appears at the end of this part.)
- Submit the job with the command "qsub -pe make 3 namd.job".
Note that we are pretending to use the make parallel
environment, but we do not use any of the special files it sets
up.
- Use qstat to monitor the job until it starts running, then use
"tail -f namd.job.oX" (X is the job number) to watch the
job output.
- When you get tired of this, Control-C out of tail and use
"qdel X" (X is the job number) to kill the job. Use qstat
to monitor the job until it is killed.
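For the backlog experiment, a small loop is a quick way to queue
several jobs with mixed processor counts (the ordering below is
arbitrary; any mix of 1, 2, and 3 will do):
# Submit a handful of Tachyon jobs requesting 1-3 processors each.
for n in 3 1 2 1 3 2; do
    qsub -pe mpich $n tachyon.job
done
# Watch the scheduler work through the backlog.
qstat -f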
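To see what the awk program in namd.job does, you can feed it a
made-up hostfile by hand. The two input lines below are invented
for illustration; only the first two columns (host name and slot
count) matter to the script, and the real $PE_HOSTFILE contains
additional fields that are simply ignored here:
# Fake hostfile: one host with 1 slot, one host with 2 slots.
printf 'node0000 1 x y\nnode0001 2 x y\n' | \
    awk '{ for (i=0;i<$2;++i) {print "host",$1} }'
# Prints:
#   host node0000
#   host node0001
#   host node0001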
Part 4: There Is No Part 4
Compiling a program and running it under a queueing system is likely
all you will ever do on your cluster. We've done a typical
application (Tachyon) and a not-so-typical one (NAMD). At this
point you might want to rsh to a compute node to see what that
environment is like, or go see how the Clustermatic folks are doing.
If you're really ambitious, download your own code and see if it
compiles and runs.
See Also
Warewulf web site (http://www.warewulf.org/)
Grid Engine web site (http://gridengine.sunsource.net/)
NAMD web site (http://www.ks.uiuc.edu/Research/namd/)
Tachyon web site (http://jedi.ks.uiuc.edu/~johns/raytracer/)