Re: Re: NAMD job dies on 2-quad core server

From: vivek.viv.sharma_at_gmail.com
Date: Thu Apr 09 2009 - 01:44:01 CDT

Hello Axel and all,

It was my mistake to say in my previous post that the job 'dies'. That
statement was based only on watching the 'top' command on the console. The
job is in fact running fine on all 8 processors. It is not dying, but every
now and then 'top' shows that none of the processors is being used by NAMD.
It looks as if the NAMD processes go into the background and then come back
with all 8 processors in use again. When I check with 'ps -e', all the NAMD
processes are still there. Can anyone please shed some light on why the
NAMD jobs appear, vanish, and reappear in the 'top' display?
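
One way to tell a genuinely idle process from a quirk of the 'top' display
is to compare the accumulated CPU time of each namd2 process at two points
in time. The following is only a minimal sketch, assuming a Linux system
where /proc/<pid>/stat reports utime and stime as fields 14 and 15 in clock
ticks. The helper names cpu_ticks and cpu_seconds are hypothetical, not part
of NAMD or charmrun.

```python
import os


def cpu_ticks(stat_line):
    """Return utime + stime (in clock ticks) from one /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may contain spaces,
    # so only the text after the last ')' is split into fields.
    rest = stat_line[stat_line.rindex(")") + 1:].split()
    # rest[0] is field 3 (state), so fields 14 and 15 are rest[11], rest[12].
    return int(rest[11]) + int(rest[12])


def cpu_seconds(pid):
    """Accumulated CPU seconds of a running process (assumes Linux /proc)."""
    with open(f"/proc/{pid}/stat") as fh:
        ticks = cpu_ticks(fh.read())
    return ticks / os.sysconf("SC_CLK_TCK")
```

Calling cpu_seconds(pid) twice, a few seconds apart, for each PID listed by
'ps -e' should show whether the processes kept computing: if the value grows
by roughly the elapsed wall time, that process was busy the whole time, even
if 'top' did not list it.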

Am I right in thinking that jobs running this way will take more time than
they should? I am assuming that when the 'top' command does not show NAMD
running, the simulation is not progressing. (I could be wrong; something
else might be happening during that time.)
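
This assumption can be checked against the log file instead of 'top'. The
sketch below assumes that NAMD writes its energy output as lines beginning
with "ENERGY:" followed by the step number; the function names last_step
and watch are hypothetical and only illustrate the idea.

```python
import re
import time

# Assumed log format: energy lines start with "ENERGY:" followed by the step.
ENERGY_RE = re.compile(r"^ENERGY:\s+(\d+)\b")


def last_step(log_text):
    """Return the step number of the last ENERGY: line, or None if absent."""
    step = None
    for line in log_text.splitlines():
        match = ENERGY_RE.match(line)
        if match:
            step = int(match.group(1))
    return step


def watch(path, interval=10.0, samples=6):
    """Print the latest logged step every `interval` seconds."""
    for _ in range(samples):
        with open(path) as fh:
            print(time.strftime("%H:%M:%S"), last_step(fh.read()))
        time.sleep(interval)
```

Running watch("config.log") while keeping an eye on 'top' would show whether
the step counter keeps advancing during the periods when NAMD is missing
from the display. Steps are only logged at the interval set by outputEnergies
in the configuration file, so the sampling interval should be chosen with
that in mind.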

Axel, thanks for your points; I should have observed more before posting.
Secondly, I really admire the beauty of this simple NAMD command, which can
be used to run the simulation without much installation work.

thanks and regards,

Vivek

On Apr 6, 2009 7:14pm, Axel Kohlmeyer <akohlmey_at_cmm.chem.upenn.edu> wrote:
> On Mon, 2009-04-06 at 04:45 +0000, vivek.viv.sharma_at_gmail.com wrote:

> > Hello everyone,

> >

> > We have recently bought a machine with the following configuration:

> >

> > 2 quad core processors each with 2.33GHz clock rate.

> > 8 GB RAM

> > 500GB total hard disk

> >

> > I have simply used the "NAMD_2.6_Linux-i686" binaries. And, started

> > the simulation (membrane protein with membrane, water, ions.). The

> > simulation starts fine with the command

> >

> > ./charmrun ++local +p 8 ./namd2 config.txt > config.log &

> >

> > But after 4390 steps the job dies, without giving any error message.

> > Would you please suggest what is happening? Do I need to install

> how should anybody know??? does your input run fine elsewhere? have

> you looked at the trajectory? have you looked at the machine logs?

> does your os have restrictive limits for interactive use or stack

> memory? can you run the same job with fewer processors? how is the

> CPU temperature? is the crash reproducible? ...

> this list can go on for much longer. so please keep in mind that

> the kind of suggestion you can receive from a mailing list is

> directly proportional to the kind and quality of information

> you provide. in your case, you just say "it doesn't work". and

> only for one specific configuration. that is _very_ little.

> > it (NAMD) from scratch?

> why? first you have to find out what happens.

> blind activism never helps!

> >

> > NAMD in log file shows clearly:

> >

> > >> Info: Running on 8 processors.

> >

> > I observed from 'top', indeed simulation runs on all 8 processors,

> > using more or less efficiently all the processors.

> there has to be more output, and i am pretty certain that there is

> some output that indicates what is going wrong.

> > All your suggestions will be very helpful.

> well, you got a ton of them already. the most important

> one is to include more relevant information. there are

> many, many cases on this mailing list where people ask

> for help with problems, and you can easily derive from

> the dialog what information is needed and what _you_

> can do beforehand to verify that you are seeing a real

> problem and what information is needed to narrow it down.

> cheers,

> axel.

> > thanks and regards,

> >

> > Vivek

> > IMM, India.

> --

> =======================================================================

> Axel Kohlmeyer akohlmey_at_cmm.chem.upenn.edu http://www.cmm.upenn.edu

> Center for Molecular Modeling -- University of Pennsylvania

> Department of Chemistry, 231 S.34th Street, Philadelphia, PA 19104-6323

> tel: 1-215-898-1582, fax: 1-215-573-6233, office-tel: 1-215-898-5425

> =======================================================================

> If you make something idiot-proof, the universe creates a better idiot.
