RE: Problems compiling NAMD

From: Philip Peartree (P.Peartree_at_postgrad.manchester.ac.uk)
Date: Wed Oct 08 2008 - 06:03:06 CDT

I have just compiled NAMD 2.6 with Charm++ 5.9 on an Opteron with the
Intel compilers, and I also got a segfault in the megatest program. I
got around it by using -pthreads in the build line. That was the MPI
build, though, so I am not sure whether this will be of assistance.

Quoting "Axel Kohlmeyer" <akohlmey_at_cmm.chem.upenn.edu>:

> On Wed, 8 Oct 2008, Jesper Sørensen wrote:
>
> JS> Hi Alexander and Axel,
> JS>
> JS> Thank you both for the comments.
> JS>
> JS> Intel fails already at testing charm++ and this is just for the
> JS> net-version, not MPI.
> JS> I'm using build options:
> JS> > ./build charm++ net-linux-amd64 icc -no-shared -O- DCMK_OPTIMIZE=1
>
> please note that the compilation notes say to use:
>
> --no-shared -O -DCMK_OPTIMIZE=1
>
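
The differences from the line above are easy to miss: two dashes on
--no-shared, a space after -O, and a dash in front of DCMK_OPTIMIZE=1.
Below is a small sketch that assembles the corrected line, keeping the
net-linux-amd64 target and icc from Jesper's command. The variable
names are only for illustration.

```shell
# Assemble the Charm++ build line with the flags from the compilation notes.
# Target and compiler are copied from the command quoted above.
target="net-linux-amd64"
compiler="icc"
flags="--no-shared -O -DCMK_OPTIMIZE=1"

build_cmd="./build charm++ $target $compiler $flags"

# Print the line for inspection before running it by hand.
echo "$build_cmd"
```

Running the echoed line from the top of the charm source tree should
then start the build with the intended options.
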
> JS> The test fails with:
> JS>
> JS> >./charmrun ./pgm +p1
> JS> >Megatest is running on 1 processors.
>
> hmmmm.... which version of charm++ are you
> trying to compile? is it the 5.9 version
> bundled with namd2.6?
>
> i just tried compiling a net version of charm++ on my
> desktop (intel icc 9.1.045, x86_64 cpu) and it works fine,
> but there i am using the current charm++ cvs code.
>
> i found a copy of charm-5.9 on a different x86_64 machine
> that has intel 10.1.015. however when compiling and
> testing it, i get the same segmentation fault in test14
> of megatest. on the other hand compiling with gcc (v4.1.2)
> worked, after fixing a few inconsistencies in the code that
> gcc4 chokes on.
>
> so perhaps there is some code in this charm++ version that
> triggers a bug in the intel compiler or there is a bug
> in the code that is only exposed by intel compilers...
>
> hope this helps,
> axel.
>
>
> JS> >...
> JS> >test 14: initiated [tempotest (fang)]
> JS> >------------- Processor 0 Exiting: Caught Signal ------------
> JS> >Signal: segmentation violation
> JS> >Suggestion: Try running with '++debug', or linking with '-memory paranoid'.
> JS> >Stack Traceback:
> JS> > [0] /lib64/tls/libc.so.6 [0x335502e380]
> JS> > [1] [0x6bedd0]
> JS> >Fatal error on PE 0> segmentation violation
> JS> >make: *** [test] Error 1
> JS>
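
When megatest dies like this, the last "initiated" line in the output
names the test that was running at the time of the crash. Below is a
minimal sketch for pulling that line out of a saved log. The function
name last_initiated and the file name megatest.log are made up for
this example, and it assumes the log contains "test N: initiated [...]"
lines like the one quoted above.

```shell
# Print the last "initiated" line from a megatest log read on stdin.
# That line names the test that was running when the crash happened.
last_initiated() {
    grep 'initiated' | tail -n 1
}

# Example use (megatest.log is a hypothetical file name):
#   ./charmrun ./pgm +p1 > megatest.log 2>&1
#   last_initiated < megatest.log
```

Comparing that line between a gcc build and an icc build of the same
charm++ tree is a quick way to confirm both reach the same test.
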
> JS> Do you guys or anybody else have a suggestion to what might be wrong?
> JS>
> JS> Kind regards,
> JS>
> JS> Jesper
> JS>
> JS>
> JS> -----Original Message-----
> JS> From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu]
> JS> On Behalf Of Axel Kohlmeyer
> JS> Sent: 7 October 2008 18:54
> JS> To: Jesper Soerensen
> JS> Cc: Alexander A. Vakhrushev; namd-l_at_ks.uiuc.edu
> JS> Subject: Re: namd-l: Problems compiling NAMD
> JS>
> JS> On Tue, 7 Oct 2008, Jesper Soerensen wrote:
> JS>
> JS> JS> Hi Alexander,
> JS> JS>
> JS> JS> I'm assuming that if I don't specify a compiler it defaults to gcc?
> JS> JS> Then yes I have made a version of gcc without MPI, but once I add MPI it
> JS> JS> fails. It seems to be that our MPI has been compiled with either intel
> JS> JS> or pgi compilers and so gcc fails because MPI calls some intel keywords.
> JS> JS>
> JS> JS> for example while running Make pgm:
> JS> JS>
> JS> JS> >/com/mpich-1.2.7p1/lib/libmpich.a(dmpipk.o)(.text+0x249): In function `MPIR_UnPack_Hvector':
> JS> JS> >: undefined reference to `_intel_fast_memcpy'
> JS> JS>
> JS> JS> I can ask our sysadmin to make a gcc version of MPI and see if that
> JS> JS> helps. Would that be a good idea?
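
The undefined _intel_fast_memcpy reference suggests that libmpich.a
was built with the Intel compiler and expects the Intel runtime at
link time. Below is a rough way to see which Intel runtime symbols a
library still needs, assuming nm from binutils is available. The
helper names intel_symbols and intel_refs are hypothetical.

```shell
# Reduce `nm` output on stdin to the distinct Intel runtime symbol names.
intel_symbols() {
    grep -o '_intel_[A-Za-z0-9_]*' | sort -u
}

# List the Intel runtime symbols that a library leaves undefined.
# nm -u prints only the undefined symbols of each object in the archive.
intel_refs() {
    nm -u "$1" | intel_symbols
}

# Example use, with the library path from the error message above:
#   intel_refs /com/mpich-1.2.7p1/lib/libmpich.a
```

If this prints anything, linking with gcc will need either the Intel
runtime libraries on the link line or an MPI library rebuilt with gcc.
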
> JS>
> JS>
> JS> yes. one - easier to achieve - alternative, would be to reduce
> JS> the optimization level in the respective "arch" files. compilers
> JS> become increasingly unreliable with higher optimization levels.
> JS> with intel compilers, i found that a combination of flags like
> JS>
> JS> -O2 -unroll -march=pentiumpro -mtune=pentiumpro -pc64
> JS>
> JS> produces fast and reliably working executables on AMD cpus.
> JS>
> JS> i would also, in case you are still getting segmentation
> JS> faults, remove the -ip flag.
> JS>
> JS> cheers,
> JS> axel.
> JS>
> JS>
> JS>
> JS> JS>
> JS> JS> Kind regards,
> JS> JS> Jesper
> JS> JS>
> JS> JS>
> JS> JS>
> JS> JS> On Mon, 2008-10-06 at 21:00 +0500, Alexander A. Vakhrushev wrote:
> JS> JS> > Hi Jesper!
> JS> JS> >
> JS> JS> > Did you try just gcc version?
> JS> JS> >
> JS> JS> > 2008/10/6 Jesper Soerensen <jes_at_chem.au.dk>:
> JS> JS> > > Hi Alexander,
> JS> JS> > >
> JS> JS> > > I'm running CentOS 4.3. I've tried using both OpenMPI and MPICH but both
> JS> JS> > > fail. I've tried:
> JS> JS> > > OpenMPI version 1.2.6
> JS> JS> > > MPICH version 1.2.7
> JS> JS> > > Using Intel compilers (icc & ifort) version 10.1.017
> JS> JS> > >
> JS> JS> > > I ran the megatest set from charm++ and I get a failure in test14:
> JS> JS> > >
> JS> JS> > >> ./mpirun ./pgm
> JS> JS> > >> ...
> JS> JS> > >> test 14: initiated [tempotest (fang)]
> JS> JS> > >> p0_29927: p4_error: interrupt SIGSEGV: 11
> JS> JS> > >
> JS> JS> > >
> JS> JS> > >> mpirun -np 4 -all-local ./pgm
> JS> JS> > >> Megatest is running on 4 processors.
> JS> JS> > >> ...
> JS> JS> > >> test 14: initiated [tempotest (fang)]
> JS> JS> > >> p2_30323: p4_error: interrupt SIGSEGV: 11
> JS> JS> > >> p3_30346: p4_error: Found a dead connection while looking for messages: 0
> JS> JS> > >> [jesper_at_fe1 megatest]$ p1_30297: p4_error: interrupt SIGx: 13
> JS> JS> > >> rm_l_2_30324: (10.011719) net_send: could not write to fd=5, errno = 32
> JS> JS> > >> rm_l_3_30347: (7.667969) net_send: could not write to fd=5, errno = 32
> JS> JS> > >> p2_30323: (12.027344) net_send: could not write to fd=5, errno = 32
> JS> JS> > >> p3_30346: (13.675781) net_send: could not write to fd=5, errno = 32
> JS> JS> > >> p1_30297: (18.843750) net_send: could not write to fd=5, errno = 32
> JS> JS> > >
> JS> JS> > > Does anybody recognize this?
> JS> JS> > >
> JS> JS> > > Kind regards,
> JS> JS> > >
> JS> JS> > > Jesper
> JS> JS> > >
> JS> JS> > >
> JS> JS> > >
> JS> JS> > >
> JS> JS> > > On Fri, 2008-10-03 at 22:44 +0500, Alexander A. Vakhrushev wrote:
> JS> JS> > >> Hi Jesper!
> JS> JS> > >>
> JS> JS> > >> What is platform of your cluster?
> JS> JS> > >>
> JS> JS> > >> 2008/10/3 Jesper Soerensen <jes_at_chem.au.dk>:
> JS> JS> > >> > Hi,
> JS> JS> > >> >
> JS> JS> > >> > I've just compiled NAMD on our cluster and this runs through fine, but
> JS> JS> > >> > when I start a job I get the following error in the log file:
> JS> JS> > >> >>Info: Entering startup phase 8 with 134856 kB of memory in use.
> JS> JS> > >> >>Info: Finished startup with 143120 kB of memory in use.
> JS> JS> > >> >>1 additional process aborted (not shown)
> JS> JS> > >> >
> JS> JS> > >> > And the cluster job-error log says:
> JS> JS> > >> >>mpirun noticed that job rank 0 with PID 25232 on node s07n06 exited on
> JS> JS> > >> >>signal 11 (Segmentation fault).
> JS> JS> > >> >
> JS> JS> > >> > I am running a Linux-amd64-MPI-icc-ifort version if this helps. Also, let
> JS> JS> > >> > me know if there is more information I can give to help solve the
> JS> JS> > >> > problem. I'm just wondering if anybody has seen this type of error
> JS> JS> > >> > before.
> JS> JS> > >> >
> JS> JS> > >> > Kind regards,
> JS> JS> > >> >
> JS> JS> > >> > Jesper Soerensen
> JS> JS> > >> >
> JS> JS> > >> >
> JS> JS> > >> > --
> JS> JS> > >> > Jesper Sørensen, M.Sc.
> JS> JS> > >> > Ph.D.-student
> JS> JS> > >> > Biomodelling Group, inSPIN and iNANO centers
> JS> JS> > >> > Department of Chemistry
> JS> JS> > >> > University of Aarhus
> JS> JS> > >> > Langelandsgade 140
> JS> JS> > >> > 8000 Aarhus C
> JS> JS> > >> > Office: 1510-419
> JS> JS> > >> > Tlf. 89423385
> JS> JS> > >> > email: jes_at_chem.au.dk
> JS> JS> > >> > www: www.chem.au.dk/~biomodelling
> JS> JS> > >> >
> JS> JS> > >> >
> JS> JS> > >>
> JS> JS> > >>
> JS> JS> > >>
> JS> JS> > >
> JS> JS> > >
> JS> JS> >
> JS> JS> >
> JS> JS> >
> JS> JS>
> JS>
> JS>
>
> --
> =======================================================================
> Axel Kohlmeyer akohlmey_at_cmm.chem.upenn.edu http://www.cmm.upenn.edu
> Center for Molecular Modeling -- University of Pennsylvania
> Department of Chemistry, 231 S.34th Street, Philadelphia, PA 19104-6323
> tel: 1-215-898-1582, fax: 1-215-573-6233, office-tel: 1-215-898-5425
> =======================================================================
> If you make something idiot-proof, the universe creates a better idiot.
