Re: Re: segfaults in mm_malloc

From: Niraj kumar (niraj17_at_gmail.com)
Date: Thu Jun 30 2005 - 00:25:18 CDT

Hi,

The machine is x86 (IBM x445 running Fedora Core 3). It is an 8-CPU SMP system.
[niraj_at_x445 ~]$ uname -a
Linux x445 2.6.12-rc6 #4 SMP Wed Jun 29 07:27:45 PDT 2005 i686 i686 i386 GNU/Linux

I am using mpich-1.2.6, which was built to use shared memory like this:
./configure --with-device=ch_shmem --enable-sharedlib

I built charm++ as follows :
./build charm++ mpi-linux -O -DCMK_OPTIMIZE=1

and then for NAMD:
./config tcl fftw plugins Linux-i686-MPI
cd Linux-i686-MPI
make
 
Everything was compiled with gcc 3.3.4.

The crash happens more often on 8-CPU runs, which I start as follows:
charmrun +p8 <path_to_namd> <path_to_apoa1.namd>

For 2- or 4-CPU runs, the crash doesn't happen (or probably happens
very rarely).

Hope this information is useful.
Let me know if you need any more info.

Regards
Niraj

On 6/29/05, David Kunzman <kunzman2_at_uiuc.edu> wrote:
> Can you give me more details? What is the architecture of
> your system (x86, powerPC, etc.)? What were the build options
> that you used for both namd and charm? etc.
>
> Have a Good Day,
> Dave Kunzman
>
>
> ---- Original message ----
> >Date: Wed, 29 Jun 2005 16:18:20 +0530
> >From: Niraj kumar <niraj17_at_gmail.com>
> >Subject: Re: namd-l: Re: segfaults in mm_malloc
> >To: David Kunzman <kunzman2_at_uiuc.edu>
> >Cc: Brian Bennion <brian_at_youkai.llnl.gov>, tim_at_scalex86.org,
> >    namd-l_at_ks.uiuc.edu
> >
> >Hi Dave ,
> >
> >We are seeing this problem on a 32-bit system.
> >Is there any reason (in your opinion) for this to appear on 32-bit?
> >
> >I just tested the latest code from NAMD (from CVS) and charm++ (from the
> >nightly build) and the problem is still there. The stack trace (from the
> >core file) looks like this:
> >
> >(gdb) where
> >#0 0x0823ef18 in _int_malloc ()
> >#1 0x0823e535 in mm_malloc ()
> >#2 0x0824009d in malloc ()
> >#3 0x08240296 in malloc_nomigrate ()
> >#4 0x082a43f8 in CmiAlloc ()
> >#5 0x082a200c in PumpMsgs ()
> >#6 0x082a21ed in CmiGetNonLocal ()
> >#7 0x082a37e1 in CsdNextMessage ()
> >#8 0x082a38ac in CsdScheduleForever ()
> >#9 0x082a384f in CsdScheduler ()
> >#10 0x080d4884 in BackEnd::init ()
> >#11 0x080d1961 in main ()
> >
> >
> >Regards
> >Niraj
> >
> >On 6/28/05, David Kunzman <kunzman2_at_uiuc.edu> wrote:
> >> There does not seem to be a prototype for this function (and a few
> >> others). As a result, the compiler is assuming the return type of the
> >> function is an "int". When the compiler (icc in the case we were
> >> looking at) does the cast, it tries to convert the "int" that was
> >> returned (which is really a 64-bit pointer) into a "char*". Since the
> >> compiler "thinks" the returned value is only 32 bits, it sign-extends
> >> the 32-bit value to fit the 64-bit register, which wipes out the upper
> >> 32 bits of the pointer, making it invalid. A fix should be checked in
> >> soon.
> >>
> >> Dave Kunzman
> >>
> >>
> >> Brian Bennion wrote:
> >>
> >> >Hi Tim,
> >> >I saw your posting on the namd wiki and want you to know that I too have
> >> >seen this problem, or one very similar, in mm_malloc. David Kunzman
> >> >(charm++ developer) worked on it for a couple of days last week.
> >> >
> >> >I do not know what the final results were, other than that the compiler
> >> >makes some default assumptions about casting a void * to a char *. It
> >> >instead casts it to an int and wipes out half of the memory address. So
> >> >what is actually returned is entirely bogus.
> >> >
> >> >What compiler are you using?
> >> >
> >> >Brian
> >> >
> >> >
> >> > ************************************************
> >> > Brian Bennion, Ph.D.
> >> > Bioscience Directorate
> >> > Lawrence Livermore National Laboratory
> >> > P.O. Box 808, L-448 bennion1_at_llnl.gov
> >> > 7000 East Avenue phone: (925) 422-5722
> >> > Livermore, CA 94550 fax: (925) 424-6605
> >> >************************************************
> >> >
> >> >
> >> >
> >>
>

-- 
-----------------------------------------------------------------
http://www.nirajkumar.net
