Re: APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)

From: Jim Phillips (jim_at_ks.uiuc.edu)
Date: Thu Nov 13 2014 - 15:47:52 CST

I don't know why you're seeing errors, but I can strongly recommend not
using mpi-smp in general, and in particular not for CUDA builds, since the
performance is not good. You should use ibverbs builds (assuming you have
InfiniBand) instead. If you don't have InfiniBand you are unlikely to get
good multiple-node scaling.

If you want to track down your error you should at least confirm that the
non-CUDA version runs correctly with your Intel MPI library in mpi-smp.
We don't have any experience with Intel MPI so there may be some issue,
and as I said, there are not a lot of production runs with mpi-smp.
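
As a rough sketch of that check, the launch line quoted below can be reused with the CUDA-only +devices option dropped and a non-CUDA binary substituted. The binary name ./namd2-nocuda is a hypothetical placeholder for wherever the non-CUDA mpi-smp build lives; the hosts, the -IB option, and the config path are copied from the original report. The script only prints the command unless RUN=1 is set.

```shell
#!/bin/sh
# Dry-run helper: prints the non-CUDA launch line, runs it only if RUN=1.
# NAMD_BIN is a hypothetical path; point it at your non-CUDA mpi-smp build.
NAMD_BIN="${NAMD_BIN:-./namd2-nocuda}"
HOSTS="${HOSTS:-node335,node334}"
CONFIG="${CONFIG:-../workload/f1atpase2000/f1atpase.namd}"

build_cmd() {
    # Same options as the reported CUDA run, minus "+devices 0,1".
    printf '%s\n' "mpirun -ppn 1 -hosts $HOSTS -IB $NAMD_BIN +ppn 8 $CONFIG"
}

build_cmd

if [ "${RUN:-0}" = "1" ]; then
    # Word splitting of the command string is intentional here.
    $(build_cmd)
fi
```

If this run is stable while the CUDA run is not, the problem is more likely in the CUDA side of the build than in the Intel MPI layer.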

Jim

On Thu, 13 Nov 2014, Bin He wrote:

> Hi,
>
> I recompiled NAMD and built the mpi-smp-cuda version.
>
> source code: NAMD_2.10b1_Source.tar.gz
> compiler: intel icc (ICC) 14.0.0 20130728
> mpi: Intel(R) MPI Library for Linux* OS, Version 4.1 Update 1 Build 20130522
> OS: RHEL 6.2
>
>
> When I run:
>
> mpirun -ppn 1 -hosts node335,node334 -IB ./namd2 +ppn 8 +devices 0,1
> ../workload/f1atpase2000/f1atpase.namd
>
> I get many errors like:
>
> ERROR: Atom 321692 velocity is 1191.13 -3428.88 10552 (limit is 11000,
> atom 210 of 219 on patch 1249 pe 4)
>
> ERROR: Atoms moving too fast; simulation has become unstable (152 atoms on
> patch 1249 pe 4).
>
> When I run:
>
> mpirun -ppn 1 -hosts node335,node334 -IB ./namd2 +ppn 16 +devices 0,1
> ../workload/f1atpase2000/f1atpase.namd
>
> It exits with "APPLICATION TERMINATED WITH THE EXIT STRING:
> Segmentation fault (signal 11)".
>
> So what is wrong?
>
> Thanks
>
>
> ------------------------
> Best Regards!
> Bin He
> Member of IT
> Unique Studio
> Room 811,Building LiangSheng,1037 Luoyu Road, Wuhan 430074,P.R. China
> ☎:(+86) 13163260252
> Weibo:何斌_HUST
> Email:binhe_at_hustunique.com
> Email:binhe22_at_gmail.com
>
