RE: Problems compiling NAMD

From: Jesper Soerensen (jes_at_chem.au.dk)
Date: Wed Oct 08 2008 - 06:23:16 CDT

Thanks Alex and Philip. It's all helpful...

I switched to the latest stable version of charm6.0 and then the intel10
compilers worked. There were specific options in charm6.0 for intel 10
compilers for IA and AMD, so I guess they probably know/knew of an
issue. There is a minor issue with smp not working, but that is not
important right now.

I'll try the 5.9 with pthreads as well, just to see what the difference
in performance is.

I think I can safely say that I believe the issue to be resolved now ;-)

Thanks again,

Jesper
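
For anyone following along, here is a minimal sketch of the build line with the flag spelling from the compilation notes that Axel quotes below (--no-shared, and -O kept separate from -DCMK_OPTIMIZE=1). It only prints the command; it does not run it. The helper name charm_build_line is made up for illustration, and the arch/compiler pair is the one from this thread, not necessarily what charm6.0 wants for intel 10.

```shell
#!/bin/sh
# Print (not run) the charm++ build line using the flag spelling from the
# compilation notes quoted in this thread.
# charm_build_line is a hypothetical helper name, not part of charm++.
charm_build_line() {
    arch="$1"        # e.g. net-linux-amd64, as used in the thread
    compiler="$2"    # e.g. icc, as used in the thread
    printf './build charm++ %s %s --no-shared -O -DCMK_OPTIMIZE=1\n' "$arch" "$compiler"
}

# The combination discussed in the thread: net version, amd64, Intel compiler.
charm_build_line net-linux-amd64 icc
```

The printed line is meant to be run from the charm source directory, followed by the megatest run shown in the quoted messages.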

On Wed, 2008-10-08 at 12:03 +0100, Philip Peartree wrote:
> I have just compiled NAMD 2.6 with Charm++ 5.9 on Opteron with the
> intel compilers, and I did get a segfault with the megatest program, I
> got around this using -pthreads in the build line. It was the mpi
> build though, not sure whether this will be of assistance.
>
>
> Quoting "Axel Kohlmeyer" <akohlmey_at_cmm.chem.upenn.edu>:
>
> > On Wed, 8 Oct 2008, Jesper Sørensen wrote:
> >
> > JS> Hi Alexander and Axel,
> > JS>
> > JS> Thank you both for the comments.
> > JS>
> > > JS> Intel fails already at testing charm++ and this is just for the
> > > JS> net-version, not MPI.
> > JS> I'm using build options:
> > JS> > ./build charm++ net-linux-amd64 icc -no-shared -O- DCMK_OPTIMIZE=1
> >
> > please note that the compilation notes say to use:
> >
> > --no-shared -O -DCMK_OPTIMIZE=1
> >
> > JS> The test fails with:
> > JS>
> > JS> >./charmrun ./pgm +p1
> > JS> >Megatest is running on 1 processors.
> >
> > hmmmm.... which version of charm++ are you
> > trying to compile? is it the 5.9 version
> > bundled with namd2.6?
> >
> > i just tried compiling a net version of charm++ on my
> > desktop (intel icc 9.1.045, x86_64 cpu) and it works fine,
> > > but there i am using the current charm++ cvs code.
> >
> > i found a copy of charm-5.9 on a different x86_64 machine
> > that has intel 10.1.015. however when compiling and
> > testing it, i get the same segmentation fault in test14
> > of megatest. on the other hand compiling with gcc (v4.1.2)
> > worked, after fixing a few inconsistencies in the code that
> > gcc4 chokes on.
> >
> > so perhaps there is some code in this charm++ version that
> > triggers a bug in the intel compiler or there is a bug
> > in the code that is only exposed by intel compilers...
> >
> > hope this helps,
> > axel.
> >
> >
> > JS> >...
> > JS> >test 14: initiated [tempotest (fang)]
> > JS> >------------- Processor 0 Exiting: Caught Signal ------------
> > JS> >Signal: segmentation violation
> > > JS> >Suggestion: Try running with '++debug', or linking with '-memory paranoid'.
> > JS> >Stack Traceback:
> > JS> > [0] /lib64/tls/libc.so.6 [0x335502e380]
> > JS> > [1] [0x6bedd0]
> > JS> >Fatal error on PE 0> segmentation violation
> > JS> >make: *** [test] Error 1
> > JS>
> > JS> Do you guys or anybody else have a suggestion to what might be wrong?
> > JS>
> > JS> Kind regards,
> > JS>
> > JS> Jesper
> > JS>
> > JS>
> > > JS> -----Original message-----
> > > JS> From: owner-namd-l_at_ks.uiuc.edu [mailto:owner-namd-l_at_ks.uiuc.edu] On behalf of Axel Kohlmeyer
> > > JS> Sent: 7 October 2008 18:54
> > > JS> To: Jesper Soerensen
> > > JS> Cc: Alexander A. Vakhrushev; namd-l_at_ks.uiuc.edu
> > > JS> Subject: Re: namd-l: Problems compiling NAMD
> > JS>
> > JS> On Tue, 7 Oct 2008, Jesper Soerensen wrote:
> > JS>
> > JS> JS> Hi Alexander,
> > JS> JS>
> > JS> JS> I'm assuming that if I don't specify a compiler it defaults to gcc?
> > > JS> JS> Then yes, I have made a gcc build without MPI, but once I add MPI it
> > > JS> JS> fails. It seems that our MPI has been compiled with either the intel
> > > JS> JS> or pgi compilers, so gcc fails because MPI references some intel
> > > JS> JS> symbols.
> > JS> JS>
> > JS> JS> for example while running Make pgm:
> > JS> JS>
> > > JS> JS> >/com/mpich-1.2.7p1/lib/libmpich.a(dmpipk.o)(.text+0x249): In function
> > > JS> JS> >`MPIR_UnPack_Hvector':
> > > JS> JS> >: undefined reference to `_intel_fast_memcpy'
> > JS> JS>
> > JS> JS> I can ask our sysadmin to make a gcc version of MPI and see if that
> > JS> JS> helps. Would that be a good idea?
> > JS>
> > JS>
> > JS> yes. one - easier to achieve - alternative, would be to reduce
> > JS> the optimization level in the respective "arch" files. compilers
> > JS> become increasingly unreliable with higher optimization levels.
> > JS> with intel compilers, i found that a combination of flags like
> > JS>
> > JS> -O2 -unroll -march=pentiumpro -mtune=pentiumpro -pc64
> > JS>
> > JS> produces fast and reliably working executables on AMD cpus.
> > JS>
> > JS> i would also, in case you are still getting segmentation
> > JS> faults, remove the -ip flag.
> > JS>
> > JS> cheers,
> > JS> axel.
> > JS>
> > JS>
> > JS>
> > JS> JS>
> > JS> JS> Kind regards,
> > JS> JS> Jesper
> > JS> JS>
> > JS> JS>
> > JS> JS>
> > JS> JS> On Mon, 2008-10-06 at 21:00 +0500, Alexander A. Vakhrushev wrote:
> > JS> JS> > Hi Jesper!
> > JS> JS> >
> > JS> JS> > Did you try just gcc version?
> > JS> JS> >
> > JS> JS> > 2008/10/6 Jesper Soerensen <jes_at_chem.au.dk>:
> > JS> JS> > > Hi Alexander,
> > JS> JS> > >
> > > JS> JS> > > I'm running CentOS 4.3. I've tried using both OpenMPI and MPICH but
> > > JS> JS> > > both fail. I've tried:
> > JS> JS> > > OpenMPI version 1.2.6
> > JS> JS> > > MPICH version 1.2.7
> > JS> JS> > > Using Intel compilers (icc & ifort) version 10.1.017
> > JS> JS> > >
> > > JS> JS> > > I ran the megatest set from charm++ and I get a failure in test14:
> > JS> JS> > >
> > JS> JS> > >> ./mpirun ./pgm
> > JS> JS> > >> ...
> > JS> JS> > >> test 14: initiated [tempotest (fang)]
> > JS> JS> > >> p0_29927: p4_error: interrupt SIGSEGV: 11
> > JS> JS> > >
> > JS> JS> > >
> > JS> JS> > >> mpirun -np 4 -all-local ./pgm
> > JS> JS> > >> Megatest is running on 4 processors.
> > JS> JS> > >> ...
> > JS> JS> > >> test 14: initiated [tempotest (fang)]
> > JS> JS> > >> p2_30323: p4_error: interrupt SIGSEGV: 11
> > > JS> JS> > >> p3_30346: p4_error: Found a dead connection while looking for messages: 0
> > > JS> JS> > >> [jesper_at_fe1 megatest]$ p1_30297: p4_error: interrupt SIGx: 13
> > > JS> JS> > >> rm_l_2_30324: (10.011719) net_send: could not write to fd=5, errno = 32
> > > JS> JS> > >> rm_l_3_30347: (7.667969) net_send: could not write to fd=5, errno = 32
> > > JS> JS> > >> p2_30323: (12.027344) net_send: could not write to fd=5, errno = 32
> > > JS> JS> > >> p3_30346: (13.675781) net_send: could not write to fd=5, errno = 32
> > > JS> JS> > >> p1_30297: (18.843750) net_send: could not write to fd=5, errno = 32
> > JS> JS> > >
> > JS> JS> > > Does anybody recognize this?
> > JS> JS> > >
> > JS> JS> > > Kind regards,
> > JS> JS> > >
> > JS> JS> > > Jesper
> > JS> JS> > >
> > JS> JS> > >
> > JS> JS> > >
> > JS> JS> > >
> > JS> JS> > > On Fri, 2008-10-03 at 22:44 +0500, Alexander A. Vakhrushev wrote:
> > JS> JS> > >> Hi Jesper!
> > JS> JS> > >>
> > JS> JS> > >> What is platform of your cluster?
> > JS> JS> > >>
> > JS> JS> > >> 2008/10/3 Jesper Soerensen <jes_at_chem.au.dk>:
> > JS> JS> > >> > Hi,
> > JS> JS> > >> >
> > > JS> JS> > >> > I've just compiled NAMD on our cluster and this runs through fine, but
> > > JS> JS> > >> > when I start a job I get the following error in the log file:
> > > JS> JS> > >> >>Info: Entering startup phase 8 with 134856 kB of memory in use.
> > JS> JS> > >> >>Info: Finished startup with 143120 kB of memory in use.
> > JS> JS> > >> >>1 additional process aborted (not shown)
> > JS> JS> > >> >
> > JS> JS> > >> > And the cluster job-error log says:
> > > JS> JS> > >> >>mpirun noticed that job rank 0 with PID 25232 on node s07n06 exited on
> > > JS> JS> > >> >>signal 11 (Segmentation fault).
> > > JS> JS> > >> >
> > > JS> JS> > >> > I am running a Linux-amd64-MPI-icc-ifort version if this helps. Also, let
> > > JS> JS> > >> > me know if there is more information I can give to help solve the
> > > JS> JS> > >> > problem. I'm just wondering if anybody has seen this type of error
> > > JS> JS> > >> > before.
> > JS> JS> > >> >
> > JS> JS> > >> > Kind regards,
> > JS> JS> > >> >
> > JS> JS> > >> > Jesper Soerensen
> > JS> JS> > >> >
> > JS> JS> > >> >
> > JS> JS> > >> > --
> > JS> JS> > >> > Jesper Sørensen, M.Sc.
> > JS> JS> > >> > Ph.D.-student
> > JS> JS> > >> > Biomodelling Group, inSPIN and iNANO centers
> > JS> JS> > >> > Department of Chemistry
> > JS> JS> > >> > University of Aarhus
> > JS> JS> > >> > Langelandsgade 140
> > JS> JS> > >> > 8000 Aarhus C
> > JS> JS> > >> > Office: 1510-419
> > JS> JS> > >> > Tlf. 89423385
> > JS> JS> > >> > email: jes_at_chem.au.dk
> > JS> JS> > >> > www: www.chem.au.dk/~biomodelling
> > JS> JS> > >> >
> > JS> JS> > >> >
> > JS> JS> > >>
> > JS> JS> > >>
> > JS> JS> > >>
> > JS> JS> > >
> > JS> JS> > >
> > JS> JS> >
> > JS> JS> >
> > JS> JS> >
> > JS> JS>
> > JS>
> > JS>
> >
> > --
> > =======================================================================
> > Axel Kohlmeyer akohlmey_at_cmm.chem.upenn.edu http://www.cmm.upenn.edu
> > Center for Molecular Modeling -- University of Pennsylvania
> > Department of Chemistry, 231 S.34th Street, Philadelphia, PA 19104-6323
> > tel: 1-215-898-1582, fax: 1-215-573-6233, office-tel: 1-215-898-5425
> > =======================================================================
> > If you make something idiot-proof, the universe creates a better idiot.
>
>
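
The undefined reference to `_intel_fast_memcpy` quoted above suggests that the MPICH library pulls in symbols from the Intel compiler runtime. A rough way to check a static library for this is to filter the output of nm for undefined symbols with "_intel_" in the name. The sketch below assumes the usual nm output layout (undefined symbols shown as "U name"); intel_undefined_symbols is a hypothetical helper name.

```shell
#!/bin/sh
# intel_undefined_symbols is a hypothetical helper: it reads `nm` output on
# stdin and prints the undefined symbols (type "U") whose names contain
# "_intel_", one per line, without duplicates.
intel_undefined_symbols() {
    awk '$1 == "U" && $2 ~ /_intel_/ { print $2 }' | sort -u
}

# Typical use, with the library path taken from the error message in the thread:
#   nm /com/mpich-1.2.7p1/lib/libmpich.a | intel_undefined_symbols
```

If anything is printed, the library most likely expects the Intel runtime at link time, which would be consistent with Jesper's guess that the MPI was not built with gcc.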

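Both the net and the MPI runs quoted above die right after "test 14: initiated". When comparing several builds it can help to pull that line out of a saved megatest log automatically. This is only a sketch: last_test_before_crash is a made-up helper name, and it assumes the log contains either "segmentation violation" or "SIGSEGV" when the run crashes, as in the two outputs in this thread.

```shell
#!/bin/sh
# last_test_before_crash is a hypothetical helper: it reads a megatest log on
# stdin and, if the log mentions a segfault, prints the last test that was
# initiated before it. It prints nothing for a log without a segfault.
last_test_before_crash() {
    awk '
        /test [0-9]+: initiated/         { last = $0 }
        /segmentation violation|SIGSEGV/ { crashed = 1 }
        END { if (crashed && last != "") print last }
    '
}

# Sample input copied from the net-version failure quoted in the thread.
last_test_before_crash <<'EOF'
Megatest is running on 1 processors.
test 14: initiated [tempotest (fang)]
------------- Processor 0 Exiting: Caught Signal ------------
Signal: segmentation violation
EOF
```

In practice the log would come from something like redirecting the charmrun or mpirun output to a file and feeding that file to the function on stdin.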