Re: Illegal instruction signal at startup. (with net-rs6k smp)

From: Gengbin Zheng (gzheng_at_ks.uiuc.edu)
Date: Thu Apr 08 2004 - 17:36:49 CDT

Hi Hansang,

 It seems that there is some problem with the new built-in GNU malloc of
Charm++. Please check whether this fixes it:

Edit charm/net-rs6k-smp/tmp/conv-mach-smp.h and add this:

#undef CMK_MALLOC_USE_GNU_MALLOC
#undef CMK_MALLOC_USE_OS_BUILTIN
#define CMK_MALLOC_USE_OS_BUILTIN 1

Do a clean build (make clean, then make charm++ OPTS=-g)
and re-link namd2.
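
To illustrate why those three lines are enough, here is a minimal,
standalone C sketch. It is not Charm++ code: the two #define lines at the
top only stand in for whatever the machine header sets beforehand (an
assumption), and the enum, the function selected_malloc_backend() and the
#error guards are hypothetical names added for illustration. Only the two
CMK_MALLOC_USE_* macro names come from the fix above.

```c
/* Stand-in for earlier header settings (assumed, not copied from Charm++). */
#define CMK_MALLOC_USE_GNU_MALLOC 1
#define CMK_MALLOC_USE_OS_BUILTIN 0

/* The override from the message: drop GNU malloc, select the OS allocator. */
#undef CMK_MALLOC_USE_GNU_MALLOC
#undef CMK_MALLOC_USE_OS_BUILTIN
#define CMK_MALLOC_USE_OS_BUILTIN 1

/* Hypothetical compile-time guards. An undefined macro evaluates to 0 in
 * an #if expression, so #undef alone is enough to switch GNU malloc off. */
#if defined(CMK_MALLOC_USE_GNU_MALLOC) && CMK_MALLOC_USE_GNU_MALLOC
#error "GNU malloc is still selected"
#endif
#if !defined(CMK_MALLOC_USE_OS_BUILTIN) || !CMK_MALLOC_USE_OS_BUILTIN
#error "OS built-in malloc is not selected"
#endif

/* Hypothetical helper that reports which allocator the macros select. */
enum malloc_backend { MALLOC_BACKEND_GNU, MALLOC_BACKEND_OS_BUILTIN };

enum malloc_backend selected_malloc_backend(void)
{
#if defined(CMK_MALLOC_USE_GNU_MALLOC) && CMK_MALLOC_USE_GNU_MALLOC
    return MALLOC_BACKEND_GNU;
#else
    return MALLOC_BACKEND_OS_BUILTIN;
#endif
}
```

The order matters: the #undef lines have to come after any earlier
definition of the same macros, otherwise the compiler will warn about a
redefinition or keep the old value.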

Please let me know whether this works.

Gengbin

On Thu, 8 Apr 2004, Gengbin Zheng wrote:

>
> I see. Could you send me the command-line options that produce this crash?
> I suppose this is alanin.
>
> Gengbin
>
>
> On Thu, 8 Apr 2004, Hansang Bae wrote:
>
> > Of course, I compiled this version with the -g option, and the other
> > versions, net-rs6k and mpi-sp, do not have any problem. I'm using
> > tcl-8.4.4 and fftw-2.1.5.
> >
> > Thanks,
> > Hansang Bae
> > 1285 EE Building, Mail Box #58
> > West Lafayette, IN 47907-1285
> > (H) 765-496-4729
> > (L) 765-494-3550 (EE 347)
> >
> > On Thu, 8 Apr 2004, Gengbin Zheng wrote:
> >
> > >
> > >
> > > It is a little hard to find anything wrong here. I would suggest building
> > > your own binary (there may be a binary or library incompatibility problem).
> > > As other options, you can try net-rs6k (without smp) or an MPI version
> > > like mpi-sp|IBM-SP.
> > >
> > > Gengbin
> > >
> > > On Tue, 6 Apr 2004, Hansang Bae wrote:
> > >
> > > > I have a problem running the AIX-RS6000-SMP version with multiple threads.
> > > > It crashes with an illegal instruction exception during the startup phase.
> > > > The strange thing is that sometimes this doesn't happen.
> > > >
> > > > Here is "some" information from the dbx log.
> > > >
> > > > ...
> > > > Info: ****************************
> > > > Info: STRUCTURE SUMMARY:
> > > > Info: 66 ATOMS
> > > > Info: 65 BONDS
> > > > Info: 96 ANGLES
> > > > Info: 31 DIHEDRALS
> > > > Info: 32 IMPROPERS
> > > > Info: 0 EXCLUSIONS
> > > > Info: 195 DEGREES OF FREEDOM
> > > > Info: 55 HYDROGEN GROUPS
> > > > Info: TOTAL MASS = 783.886 amu
> > > > Info: TOTAL CHARGE = 8.19564e-08 e
> > > > Info: *****************************
> > > > [20] stopped in suspend() at line 153 in file "BackEnd.cc" ($t1)
> > > > 153 CsdScheduler(-1);
> > > > (dbx) s
> > > > Info: Entering startup phase 0 with 3804 kB of memory in use.
> > > > Info: Entering startup phase 1 with 3804 kB of memory in use.
> > > >
> > > > Illegal instruction in . at 0x0 ($t2)
> > > > 0x00000000 00000000 Invalid opcode.
> > > > (dbx) where
> > > > warning: could not locate trace table from starting address 0x0
> > > > CmiHandleMessage(0x305d0a08) at 0x10011b38
> > > > CsdScheduleForever() at 0x10012be4
> > > > CsdScheduler(0xffffffff) at 0x10012d0c
> > > > slave_init(int,char**)(argc = 3, argv = 0x3027b6d8), line 94 in "BackEnd.cc"
> > > > ConverseRunPE(0x0) at 0x1000c96c
> > > > call_startfn(0x1) at 0x1000b810
> > > > _pthread_body(??) at 0xd004b3fc
> > > >
> > > >
> > > >
> > > > Thanks,
> > > > Hansang Bae
> > > >
> > >
> > >
> >
>
