Re: load balancer, athlon 64 dual core

From: Leandro Martínez (leandromartinez98_at_gmail.com)
Date: Wed Sep 13 2006 - 08:00:53 CDT

My simulations run for one or two days before halting. The halt is not
deterministic at all. I will try the Linux Test Project to see if anything
turns up.
Thanks,
Leandro.

On 9/13/06, Cesar Luis Avila <cavila_at_fbqf.unt.edu.ar> wrote:
>
> Hi Leandro,
> How long does your simulation run before it halts? I have been
> running NAMD 2.6 for more than 2 days now without any problem. Your
> problem might be related to a bug or some incompatibility in your
> libraries.
> Have you tried running the Linux Test Project on your machines?
> http://ltp.sourceforge.net/. It is a large set of tests developed to
> validate the reliability, robustness, and stability of Linux.
>
> Cesar
>
> Leandro Martínez wrote:
> >
> > Hi all,
> > I'm still having problems running namd2 on our Athlon 64 Dual Core
> > machines. The simulation runs well up to a point where all processes
> > except one stop, and I am left with a single process running on a
> > single CPU. The simulation does not crash, but it does not continue
> > either, and this single process appears to run forever doing
> > something I cannot identify.
> >
> > Now, as Jim suggested, I have attached gdb to this process. I have
> > never used it before, but I was able to get the information below. Any
> > help is appreciated. I believe the bolded output below is the part
> > referring to the namd2 process.
> >
> > ------------ OUTPUT FROM GDB: --------------------------
> >
> > Attaching to program: /usr/bin/namd2, process 19438
> > Reading symbols from /lib64/libdl.so.2...(no debugging symbols
> > found)...done.
> > Loaded symbols for /lib64/libdl.so.2
> > Reading symbols from /lib64/libm.so.6...(no debugging symbols
> > found)...done.
> > Loaded symbols for /lib64/libm.so.6
> > Reading symbols from /usr/lib64/libstdc++.so.6...(no debugging symbols
> > found)...done.
> > Loaded symbols for /usr/lib64/libstdc++.so.6
> > Reading symbols from /lib64/libc.so.6...
> > (no debugging symbols found)...done.
> > Loaded symbols for /lib64/libc.so.6
> > Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging
> > symbols found)...done.
> > Loaded symbols for /lib64/ld-linux-x86-64.so.2
> > Reading symbols from /lib64/libgcc_s.so.1...(no debugging symbols
> > found)...done.
> > Loaded symbols for /lib64/libgcc_s.so.1
> > Reading symbols from /lib64/libnss_files.so.2...
> > (no debugging symbols found)...done.
> > Loaded symbols for /lib64/libnss_files.so.2
> > 0x0000000000714caa in Set::find ()
> > (gdb) next
> > Single stepping until exit from function _ZN3Set4findEP10InfoRecord,
> > which has no line number information.
> > 0x00000000006f407f in Rebalancer::numAvailable ()
> > (gdb) next
> > Single stepping until exit from function
> > _ZN10Rebalancer12numAvailableEP11computeInfoP13processorInfoPiS4_S4_,
> > which has no line number information.
> > 0x00000000006f3f34 in Rebalancer::refine_togrid ()
> > (gdb) next
> > Single stepping until exit from function
> > _ZN10Rebalancer13refine_togridERA3_A3_A2_NS_6pcpairEdP13processorInfoP11computeInfo,
> > which has no line number information.
> > 0x00000000006f23b5 in Rebalancer::refine ()
> > (gdb) next
> > Single stepping until exit from function _ZN10Rebalancer6refineEv,
> > which has no line number information.
> > -----------------------------------------------------------------------
> > From this point on nothing happens.
> >
> > Thank you very much,
> > Leandro.
> >
> > --------------------------------------------------------------------
> > Leandro Martinez
> > Institute of Chemistry
> > State University of Campinas, Brazil
> > http://www.ime.unicamp.br/~martinez/packmol
> > --------------------------------------------------------------------
>
>
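
The gdb session quoted above steps through the hung process with `next`, which only shows one frame at a time. A full backtrace of every thread is often easier to read and to share on the list. The following is a minimal sketch, assuming gdb is installed and that the user may attach to the process; the helper names `gdb_backtrace_command` and `capture_backtrace` are hypothetical and not part of NAMD or gdb. It uses only the standard gdb options `-p`, `-batch`, and `-ex`.

```python
import subprocess


def gdb_backtrace_command(pid, gdb="gdb"):
    """Build a gdb command line that attaches to `pid`, prints a
    backtrace for every thread, and exits (batch mode)."""
    pid = int(pid)
    if pid <= 0:
        raise ValueError("pid must be a positive integer")
    # -p PID    attach to the running process
    # -batch    run the given commands, then exit
    # -ex CMD   execute a single gdb command
    return [gdb, "-p", str(pid), "-batch", "-ex", "thread apply all bt"]


def capture_backtrace(pid, timeout=60):
    """Run gdb and return its combined stdout/stderr as text.

    Assumption: the caller has permission to attach to the process.
    """
    result = subprocess.run(
        gdb_backtrace_command(pid),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        timeout=timeout,
        check=False,
    )
    return result.stdout


if __name__ == "__main__":
    import sys

    # Example: python gdb_bt.py 19438
    print(capture_backtrace(sys.argv[1]))
```

Because the binary in the session above has no debugging symbols, the backtrace would still show only function names, not line numbers.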
