namd leaves zombie processes on nodes?

From: JC Gumbart (gumbart_at_physics.gatech.edu)
Date: Mon May 23 2016 - 10:01:45 CDT

Next message: Jeff Comer: "Re: Adsorption energy of protein in vacuum"
Previous message: JC Gumbart: "Re: H-bonds in charmm36 with namd2.11"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ] [ attachment ]

Hi all,

We’ve run into a new issue on our cluster here. Jobs killed by the scheduler (Torque) often don’t die (although sometimes they do!) but instead keep running. They continue to produce output as normal but are no longer seen by the scheduler. What our IT people can’t figure out is why this only started happening after a recent maintenance period; they said they didn’t change anything that should have affected it.

We’re running them using the command "mpirun -np $NP -env MV2_ENABLE_AFFINITY=0 namd2 $CONFFILE &> $LOGFILE"
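
In case it helps anyone check for the same symptom, here is a rough sketch of how leftover processes might be listed on a compute node. It assumes that processes which survive a killed job end up reparented to PID 1, which may not hold on every system, and `find_orphans` is a hypothetical helper name used only for illustration, not something from our setup.

```shell
# Sketch only: print the PIDs of processes whose parent is PID 1 and
# whose command name matches a pattern (default: namd2).
# Assumption: survivors of a killed job are reparented to PID 1.
# Expects `ps -eo pid,ppid,comm` style input on stdin (header on line 1).
find_orphans() {
  awk -v pat="${1:-namd2}" 'NR > 1 && $2 == 1 && $3 ~ pat { print $1 }'
}
```

On a node it could be used as `ps -eo pid,ppid,comm | find_orphans namd2`. Any PIDs it prints would still need to be compared against the scheduler's view of the node before concluding they are leftovers.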

Any suggestions?

Thanks!
JC

This archive was generated by hypermail 2.1.6 : Tue Dec 27 2016 - 23:22:11 CST