You should have the following parts:
Type "linux text" at this point to do a text-based install.
The process is very similar to the graphical install. Enter 10.0.4.1
in the IP Address field and 255.255.255.0 in the Netmask field. Click
"OK" after you have made these changes, and select "Next" to continue.
Use the following network settings:

Gateway: 130.126.120.1
Primary DNS: 130.126.120.32
Secondary DNS: 130.126.120.33
Tertiary DNS: 130.126.116.194
Test the connection by running "ping -c 1 130.126.120.32". If you get a
response, move on to the next step. If not, swap the two network
cables on the back of the master node and repeat. If you still
get no response, ask one of the instructors for assistance.

Next, set up the HTTP proxy environment variables (different programs refer to HTTP_PROXY and http_proxy, so we set them both):

# echo 'export HTTP_PROXY=http://login.ks.uiuc.edu:3128' >> ~/.bashrc
# echo 'export http_proxy=$HTTP_PROXY' >> ~/.bashrc
# source ~/.bashrc
Create the file /etc/yum.repos.d/dag.repo, containing:

[dag]
name=Dag RPM Repository for Red Hat Enterprise Linux
baseurl=http://apt.sw.be/redhat/el4/en/i386/dag
gpgcheck=1
enabled=1
gpgkey=http://dag.wieers.com/packages/RPM-GPG-KEY.dag.txt
Install the packages we will need:

# yum -y install perl-Unix-Syslog dhcp tftp-server tftp createrepo rpm-build
# yum -y groupinstall "Development Tools"

Then bring the system up to date by running "yum -y update". Create a directory to hold downloaded sources:

# mkdir /downloads
# cd /downloads
Download the Warewulf source packages and rebuild them as binary RPMs. Note: these files can also be found on the Cluster Workshop CD if you're experiencing network problems.

# wget http://www.warewulf-cluster.org/downloads/dependancies/perl-Term-Screen-1.02-3.caos.src.rpm
# wget http://www.warewulf-cluster.org/downloads/releases/2.6.1/warewulf-2.6.1-2.src.rpm
# wget http://warewulf.lbl.gov/downloads/addons/pdsh/pdsh-2.3-10.caos.src.rpm
# rpmbuild --rebuild perl-Term-Screen-1.02-3.caos.src.rpm
# rpmbuild --rebuild warewulf-2.6.1-2.src.rpm
# rpmbuild --rebuild pdsh-2.3-10.caos.src.rpm
# createrepo /usr/src/redhat/RPMS
Create the file /etc/yum.repos.d/local.repo, containing:

[local]
name=Local repository
baseurl=file:///usr/src/redhat/RPMS
gpgcheck=0

Then install Warewulf and its tools from the new repositories:

# yum -y install perl-Term-Screen warewulf warewulf-tools pdsh
Add the local repository to the yum configuration in
/usr/share/warewulf/vnfs-scripts/centos4-vnfs.sh, around line 104.
This script helps generate the virtual filesystem for the nodes, so
that they'll know about this repository as well (a sketch of the
stanza to add is shown after this step). Then run the script:

# /usr/share/warewulf/vnfs-scripts/centos4-vnfs.sh

If this command results in many errors about files not existing, run the command
"mkdir -p /vnfs/centos-4/var/lock/rpm" and repeat this step. Then build the virtual node filesystem:

# wwvnfs --build --hybrid
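The exact insertion point depends on your copy of the script, but the stanza to add mirrors the local.repo file created earlier; it should look something along these lines (an assumption, not the script's literal contents):

[local]
name=Local repository
baseurl=file:///usr/src/redhat/RPMS
gpgcheck=0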
# yum -y remove iptables krb5-workstation
Now initialize Warewulf. Answer the prompts as follows: yes, Warewulf is set up correctly; 10.128.0.0 is fine for DHCP; yes, this configuration looks correct; yes, configure DHCP now; yes, configure NFS now; yes, configure TFTP now; and yes, configure syslog.

# wwinit --init
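If you want to confirm that the services wwinit just configured are actually enabled, a quick optional check is:

# /sbin/service dhcpd status
# /sbin/service nfs status
# /sbin/chkconfig --list tftp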
Edit /etc/hosts and change the third line to remove any mentions of
your hostname for 127.0.0.1 (keep the localhost entries there). When a
new node boots and registers with the master, Warewulf creates an
entry for it, such as /etc/warewulf/nodes/new/node0000. Move this file
to /etc/warewulf/nodes/nodegroup1/, restart Warewulf, and then reboot
the nodes:

# /etc/init.d/warewulf restart
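To illustrate the /etc/hosts change above: assuming the installer wrote a line like the first one below (the hostname "master" here is made up), it should end up looking like the second:

127.0.0.1    master.ks.uiuc.edu master localhost.localdomain localhost
127.0.0.1    localhost.localdomain localhost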
Next, copy the master's account files into the virtual node filesystem and rebuild it:

# cp /etc/passwd /vnfs/default/etc/passwd
# cp /etc/group /vnfs/default/etc/group
# wwvnfs --build --hybrid

Warewulf generates the nodes' passwd and group files without including
the user account that we created during the install. This step is only
necessary to mirror that user account to the nodes; any further users
created with "adduser" will be properly propagated.

Next, download, build, and install MPICH. Again, this file can be found on the Cluster Workshop CD if you're experiencing network problems.

# cd /downloads
# wget http://www-unix.mcs.anl.gov/mpi/mpich/downloads/mpich.tar.gz
# tar xvfz mpich.tar.gz
# cd mpich-1.2.7p1
# ./configure --prefix=/opt/mpich
# make
# make install
Next, set up the Sun Grid Engine. Extract the SGE archives from the CD:

# tar xzf /media/cdrom/sge-6.0u6-bin-lx24-x86.tar.gz
# tar xzf /media/cdrom/sge-6.0u6-common.tar.gz

Then edit /etc/services and add the following two lines in the appropriate place. Save the file.

sge_qmaster 536/tcp
sge_execd 537/tcp
# cd $SGE_ROOT
# ./install_qmaster
The SGE root directory should be /home/sgeadmin. Since we added
sge_qmaster and sge_execd to /etc/services in a previous step, we
should be able to hit Return through the next two prompts. Accept the
suggested range of 20000-20100 unless you have a reason to do
otherwise. When the installation finishes, run
". /home/sgeadmin/default/common/settings.sh"
to set up some environment variables. Note that you should add
this line to /etc/profile so it's run every time anyone logs on. Also
be sure to run "chmod a+rx /home/sgeadmin" so that all of your users
have access to it.
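Following the two notes above, one way to make both changes is:

# echo '. /home/sgeadmin/default/common/settings.sh' >> /etc/profile
# chmod a+rx /home/sgeadmin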
In /etc/warewulf/master.conf, change the default network to admin. Restart Warewulf with:

# /etc/init.d/warewulf restart
Then run "qconf -as hostname", replacing hostname with your master node's hostname, to add it to the submit host list.
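For example, with a hypothetical master named cluster.ks.uiuc.edu, this would be:

# qconf -as cluster.ks.uiuc.edu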
Next, prepare the virtual node filesystem so the nodes can run the SGE execution daemon:

# yum -y --installroot=/vnfs/default/ install binutils
# echo 'sge_qmaster 536/tcp' >> /vnfs/default/etc/services
# echo 'sge_execd 537/tcp' >> /vnfs/default/etc/services
# cp /home/sgeadmin/default/common/sgeexecd /vnfs/default/etc/init.d/
# echo '. /home/sgeadmin/default/common/settings.sh' >> /vnfs/default/etc/profile
# cp -d /usr/lib/libstdc++.so.5* /vnfs/default/usr/lib/
# chroot /vnfs/default/ /sbin/ldconfig
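The libstdc++.so.5 copy is there because the SGE binaries are linked against it (an assumption based on SGE 6.0's lx24-x86 build). If you want to confirm that the node image's linker can now find it, you can run:

# chroot /vnfs/default /sbin/ldconfig -p | grep libstdc++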
We need to make sure /opt is available on the slave nodes, so we set
it up as an NFS export on the master node and put an entry into the
slave nodes' fstab to automount it. First, edit /etc/exports and copy
the /vnfs line to a new entry. Change /vnfs to /opt, leaving the rest
the same, and save.
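As a sketch of the result (the network and export options shown here are placeholders; keep whatever your existing /vnfs line actually uses):

/vnfs   10.128.0.0/255.128.0.0(ro,no_root_squash)
/opt    10.128.0.0/255.128.0.0(ro,no_root_squash)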
Restart the NFS daemon by running:

# /etc/init.d/nfs restart

Then edit /vnfs/default/etc/fstab and add this line to the bottom (replacing hostname with your master's hostname):

hostname.ks.uiuc.edu-sharedfs:/opt /opt nfs nfsvers=2 0 0
Next, set up a parallel environment for MPICH. Run "qconf -ap mpich", and enter the following settings:

pe_name            mpich
slots              9999
user_lists         NONE
xuser_lists        NONE
start_proc_args    /home/sgeadmin/mpi/startmpi.sh -catch_rsh $pe_hostfile
stop_proc_args     /home/sgeadmin/mpi/stopmpi.sh
allocation_rule    $round_robin
control_slaves     TRUE
job_is_first_task  FALSE
urgency_slots      min

Type ":wq" to write the file and quit (for non-vi users). Then run
"qconf -mq all.q", and edit the hostlist to contain
"node0000 node0001 node0002", change the pe_list to contain
"make mpich" in order to work with our new parallel environment, and
also change the shell to "/bin/bash", since Warewulf doesn't have csh.
Write and exit as you did before.
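For reference, after these edits the relevant lines of the queue configuration should look roughly like this (the other attributes stay at their defaults):

hostlist              node0000 node0001 node0002
pe_list               make mpich
shell                 /bin/bash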
Now build a list of hosts for pdsh to use, so we can easily run commands on all nodes at once:

# wwlist -qr > /etc/hosts.pdsh

You can test that this works with:

# pdsh -a hostname

Then rebuild the virtual node filesystem, sync the nodes, and reboot them:

# wwvnfs --build --hybrid
# wwnodes --sync
# pdsh -a /sbin/reboot
Use "wwtop" to watch the nodes shut down and come back up. When
they're all back up, press "q" and continue. Enable and start the SGE
execution daemon on all of the nodes:

# pdsh -a /sbin/chkconfig sgeexecd on
# pdsh -a /etc/init.d/sgeexecd start

You can use "qstat -f" to make sure you have your queue (all.q) set up with your three nodes.

Some useful commands for keeping an eye on the cluster:

wwlist - list available nodes
wwstats - print statistics on your nodes
wwtop - analogous to regular "top", but for the entire cluster
wwsummary - a very brief summary of how your cluster is generally doing
pdsh - run a command on multiple nodes in parallel
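To try out the queue and the mpich parallel environment, you can submit a small test job as a regular (non-root) user. This is only a sketch: it assumes you copy the hello binary from the MPICH step into /opt (which is NFS-mounted on the nodes), that your home directory is visible on the nodes, and that $TMPDIR/machines is the machine file written by SGE's startmpi.sh.

# cp /tmp/hello /opt/hello

Then, as your regular user:

$ cat > hello.sh << 'EOF'
#!/bin/bash
#$ -cwd
#$ -pe mpich 3
/opt/mpich/bin/mpirun -np $NSLOTS -machinefile $TMPDIR/machines /opt/hello
EOF
$ qsub hello.sh
$ qstat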
For more information, see the Warewulf web site (www.warewulf-cluster.org).