Re: Job script for multi node job

From: Subbarao Kanchi (ksubbu85_at_gmail.com)
Date: Fri Feb 14 2014 - 05:22:55 CST

Hi Norman Geist,
Thank you for the quick reply. Here is the nodelist file. We have 32 procs
in each node, and the job went to node 3 and node 6, but it is not using
both with the script I posted.

Regards,
subbu.

group main
host compute-0-3
host compute-0-3
host compute-0-3
host compute-0-3
host compute-0-3
host compute-0-3
host compute-0-3
host compute-0-3
host compute-0-3
host compute-0-3
host compute-0-3
host compute-0-3
host compute-0-3
host compute-0-3
host compute-0-3
host compute-0-3
host compute-0-3
host compute-0-3
host compute-0-3
host compute-0-3
host compute-0-3
host compute-0-3
host compute-0-3
host compute-0-3
host compute-0-3
host compute-0-3
host compute-0-3
host compute-0-3
host compute-0-3
host compute-0-3
host compute-0-3
host compute-0-3
host compute-0-6
host compute-0-6
host compute-0-6
host compute-0-6
host compute-0-6
host compute-0-6
host compute-0-6
host compute-0-6
host compute-0-6
host compute-0-6
host compute-0-6
host compute-0-6
host compute-0-6
host compute-0-6
host compute-0-6
host compute-0-6
host compute-0-6
host compute-0-6
host compute-0-6
host compute-0-6
host compute-0-6
host compute-0-6
host compute-0-6
host compute-0-6
host compute-0-6
host compute-0-6
host compute-0-6
host compute-0-6
host compute-0-6
host compute-0-6
host compute-0-6
host compute-0-6
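
A quick way to confirm that a nodelist like the one above names both nodes, with the expected number of entries each, is to count the host lines per hostname. This is a minimal sketch in POSIX sh (not the csh of the job script), and the function name count_hosts is hypothetical.

```sh
#!/bin/sh
# count_hosts FILE: print "<hostname> <count>" for every host entry in a
# charmrun nodelist. Lines whose first field is not "host" (for example
# "group main") are ignored. The name count_hosts is illustrative only.
count_hosts() {
    awk '$1 == "host" { n[$2]++ } END { for (h in n) print h, n[h] }' "$1" | sort
}

# Example: count_hosts nodelist
```

For the file listed above this prints "compute-0-3 32" and "compute-0-6 32", so the file itself requests 32 processes on each of the two nodes.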

On Fri, Feb 14, 2014 at 4:42 PM, Norman Geist <
norman.geist_at_uni-greifswald.de> wrote:

> What is the content of "nodelist" after the script has run?
>
>
>
> *From:* owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] *On
> behalf of *Subbarao Kanchi
> *Sent:* Friday, 14 February 2014 11:20
> *To:* namd-l_at_ks.uiuc.edu
> *Subject:* namd-l: Job script for multi node job
>
>
>
> Dear All,
>
> I am using a compiled version of NAMD_2.9_Linux-x86_64-ibverbs.
> The following job script works on a single node, but if I submit it with
> two or more nodes, the job runs but uses only one node and leaves the
> others idle. The script is given below; I am not able to find the mistake
> in it. I would appreciate any suggestions.
>
>
>
> Regards,
>
> Subbu.
>
> #!/bin/csh -f
> #PBS -l nodes=2:ppn=32
> #PBS -o /present_working_dir/out.out
> #PBS -e /present_working_dir/err.out
> #PBS -N test
>
> cd $PBS_O_WORKDIR
> cat $PBS_NODEFILE > temp.1
> set nprocs = `wc -l < $PBS_NODEFILE`
> echo $nprocs
> setenv DO_PARALLEL "/home/NAMD_2.9_Linux-x86_64-ibverbs/charmrun ++remote-shell ssh ++nodelist nodelist +p$nprocs "
> setenv exc "/home/NAMD_2.9_Linux-x86_64-ibverbs/namd2 "
> set j="e"
> set tot=0
> echo group main >> nodelist
> foreach i ( `cat temp.1` )
>   echo host $i >> nodelist
>   if ( $j != $i ) then
>     ssh -n $i mkdir -p /temp1_dir
>     ssh -n $i cp -r /present_working_dir/* /temp1_dir
>     echo "$i " >> t2
>     ssh -n $i limit >> t2
>     ssh -n $i limit memorylocked unlimited
>   endif
>   set j="$i"
> end
> cd /temp1_dir
>
> $DO_PARALLEL $exc namd.conf > namd.log
>
>
>
>
>
>
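
A note on the quoted script, offered as an observation rather than a confirmed diagnosis: nodelist is appended to with >>, so entries from earlier runs accumulate, and (assuming $PBS_O_WORKDIR is the same directory as /present_working_dir) the working directory is copied to /temp1_dir inside the loop, at a point where nodelist holds only the hosts seen so far. charmrun is then started from /temp1_dir with the relative name nodelist, so it may read a shorter file than the complete one left in the working directory. The sketch below writes the file in one pass and overwrites old contents. It is in POSIX sh rather than csh, and the function name make_nodelist is hypothetical.

```sh
#!/bin/sh
# make_nodelist NODEFILE OUT: write a charmrun nodelist to OUT, replacing any
# previous contents, with one "host" line per non-empty line of NODEFILE.
# The name make_nodelist is illustrative only.
make_nodelist() {
    {
        echo "group main"
        while IFS= read -r host; do
            if [ -n "$host" ]; then
                echo "host $host"
            fi
        done < "$1"
    } > "$2"
}

# Possible use inside a job script (variables as in the quoted script):
#   make_nodelist "$PBS_NODEFILE" "$PBS_O_WORKDIR/nodelist"
```

Passing that same absolute path to ++nodelist would keep charmrun from depending on whichever directory the script has changed into.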

This archive was generated by hypermail 2.1.6 : Thu Dec 31 2015 - 23:20:28 CST