Re: NAMD 2.8 on Cray XE6 segfaulting

From: Jim Phillips (jim_at_ks.uiuc.edu)
Date: Tue Jul 26 2011 - 11:58:34 CDT

Make that -DNOHOSTNAME -DNO_GETPWUID, or DCD header writing will fail.

-Jim
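
For anyone applying this by hand, here is a minimal sketch of the arch-file change, not an official patch. It appends the two defines to the CXX line only if they are not already there. The function name add_cxx_defines is hypothetical, and the arch/CRAY-XT-g++.arch path in the usage comment assumes the usual NAMD source layout; check your own tree before running it.

```shell
#!/bin/sh
# Sketch: append -DNOHOSTNAME and -DNO_GETPWUID to the CXX definition
# of a NAMD arch file. add_cxx_defines is a hypothetical helper name.
add_cxx_defines() {
    arch_file="$1"

    # Keep a copy of the file as it was before this call.
    cp "$arch_file" "$arch_file.orig"

    for def in -DNOHOSTNAME -DNO_GETPWUID; do
        # Skip a define that is already present on the CXX line.
        if ! grep -Eq "^CXX[[:space:]]*=.*[[:space:]]${def}([[:space:]]|\$)" "$arch_file"; then
            # Append the define to the end of the CXX line only.
            sed -E "s/^(CXX[[:space:]]*=.*)\$/\1 ${def}/" "$arch_file" > "$arch_file.tmp" &&
                mv "$arch_file.tmp" "$arch_file"
        fi
    done
}

# Usage (path is an assumption about the NAMD source tree):
#   add_cxx_defines arch/CRAY-XT-g++.arch
```

After editing, rerun ./config CRAY-XT-g++ and make so the new defines are picked up.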

On Tue, 26 Jul 2011, Jim Phillips wrote:

> Hi,
>
> Add -DNOHOSTNAME to the CXX definition in CRAY-XT-g++.arch (see
> http://www.ks.uiuc.edu/Research/namd/cvs2html/CRAY-XT-g++.arch_arch_diff_1.6_1.5.html)
> and use the old Tcl 8.3.3 library from
> http://www.ks.uiuc.edu/Research/namd/libraries/tcl-linux-amd64.tar.gz
>
> -Jim
>
>
> On Tue, 26 Jul 2011, Tim Robinson wrote:
>
>> Dear Cray XE6 owners/users
>>
>> I am having trouble getting NAMD 2.8 to run on Cray XE6 (2.7 was no
>> problem). I have tried with charm-6.3.2 and with charm-6.2.2.
>>
>> The basic steps are:
>>
>> ./build charm++ mpi-crayxt --no-build-shared --with-production
>> ./config CRAY-XT-g++
>> make
>>
>> (I am using gcc/4.5.2 and fftw/2.1.5.2)
>>
>> The executable crashes very soon after launch:
>>
>> Charm++> Running on MPI version: 2.2 multi-thread support: 0 (max
>> supported: -1)
>> Charm++> Running on 11 unique compute nodes (24-way SMP).
>> Info: NAMD 2.8 for CRAY-XT-MPI
>> Info:
>> Info: Please visit http://www.ks.uiuc.edu/Research/namd/
>> Info: for updates, documentation, and support information.
>> Info:
>> Info: Please cite Phillips et al., J. Comp. Chem. 26:1781-1802 (2005)
>> Info: in all publications reporting results obtained with NAMD.
>> Info:
>> Info: Based on Charm++/Converse 60202 for mpi-crayxt
>> Info: Built Tue Jul 26 12:08:28 CEST 2011 by robinson on palu2
>> [0] Stack Traceback:
>> [0:0] [0xb89f10]
>> [125] Stack Traceback:
>> [125:0] [0xb89f10]
>> [125:1] [0xb89ebb]
>> [125:2] [0xbee6a3]
>> [125:3] [0xb15f70]
>> [125:4] [0xac55dd]
>> [125:5] [0x988153]
>> <and so on>
>>
>>
>> The standard error:
>>
>> ------------- Processor 0 Exiting: Caught Signal ------------
>> Signal: 11
>> Rank 125 [Tue Jul 26 12:26:57 2011] [c1-0c2s2n3] application called
>> MPI_Abort(MPI_COMM_WORLD, 1) - process 125
>> ------------- Processor 125 Exiting: Caught Signal ------------
>> Signal: 6
>> Rank 124 [Tue Jul 26 12:26:57 2011] [c1-0c2s2n3] application called
>> MPI_Abort(MPI_COMM_WORLD, 1) - process 124
>> ------------- Processor 124 Exiting: Caught Signal ------------
>> Signal: 6
>> Rank 121 [Tue Jul 26 12:26:57 2011] [c1-0c2s2n3] application called
>> MPI_Abort(MPI_COMM_WORLD, 1) - process 121
>> ------------- Processor 121 Exiting: Caught Signal ------------
>> <and so on>
>>
>> Does anyone have a working build of 2.8 on XE6?
>>
>> Many thanks in advance,
>>
>> Tim
>>
>> --
>> Dr Tim Robinson
>> HPC Application Analyst
>> Swiss National Supercomputing Centre
>> Galleria 2, Via Cantonale
>> 6928 Manno
>>
>
