Re: Possible bug on NAMD 2.6B1?

From: Cesar Luis Avila (cavila_at_fbqf.unt.edu.ar)
Date: Tue Jul 25 2006 - 09:39:02 CDT

Dear Jim,
Have you tried NAMD 2.6b1 on AMD64 dual-core machines? I still get
random failures when running simulations on this architecture, the most
common being:

> ------------- Processor 1 Exiting: Caught Signal ------------
> Signal: segmentation violation
> Suggestion: Try running with '++debug', or linking with '-memory
> paranoid'. Fatal error on PE 1> segmentation violation

The processor number is also random. I thought I had already solved
this by recompiling charm++ with the gcc3 option, but now I am getting this
message again. Nevertheless, some other error messages have
disappeared. I am now wondering whether the problems are due to the
compiler version I am using.

If you have working binaries for this architecture, could you please send
them to me? That way I would be able to rule out hardware problems. The
nodes are connected through gigabit ethernet.

Best Regards
Cesar Avila
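
One way to check whether the crashing processor really is random, rather than concentrated on one PE (which might point to a single bad node), is to tally the crash banners across several run logs. The following is only a minimal sketch: the function name `tally_crashed_pes` is hypothetical, and the pattern assumes the banner looks exactly like the one quoted above. Mapping PE numbers back to physical hosts depends on the nodelist and is not covered here.

```python
import re
import sys
from collections import Counter

# Matches the crash banner quoted in the message above, e.g.
# "------------- Processor 1 Exiting: Caught Signal ------------"
CRASH_RE = re.compile(r"Processor (\d+) Exiting: Caught Signal")


def tally_crashed_pes(log_texts):
    """Count how often each processor (PE) number appears in a crash banner."""
    counts = Counter()
    for text in log_texts:
        for match in CRASH_RE.finditer(text):
            counts[int(match.group(1))] += 1
    return counts


if __name__ == "__main__":
    # Usage (hypothetical file names): python tally_pes.py run1.log run2.log
    texts = []
    for path in sys.argv[1:]:
        with open(path, errors="replace") as handle:
            texts.append(handle.read())
    for pe, n in sorted(tally_crashed_pes(texts).items()):
        print(f"PE {pe}: {n} crash(es)")
```

If the counts are spread evenly over PEs, a software or compiler issue seems more likely; if one PE dominates, the corresponding node may deserve a hardware check.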

Jim Phillips wrote:
>
> gcc and net-linux-amd64 should work fine. Ignore the Fortran warnings.
>
> -Jim
>
>
> On Tue, 4 Jul 2006, Cesar Luis Avila wrote:
>
>> I have checked outputtimings and indeed I have negative values for
>> some wall time/step.
>>
>> I am now trying to recompile charm++/namd for a cluster of dual-core
>> AMD64 machines connected through gigabit ethernet. I understand that
>> the best choice for compiling charm++ is to use net-linux-amd64 instead
>> of mpi-linux-amd64 for this kind of connection. Is this correct?
>> I would also like to know which compilers work better for this
>> architecture. I recall that some groups suggested using the Intel C and
>> Fortran compilers on AMD64, but there were also some problems with
>> them, so others suggested using the GNU compilers. I currently have
>> gcc and g77 installed on my system, although g77 seems to be useless
>> for building charm++.
>>
>> During charm autoconfigure I get the message:
>> checking subroutine name used by Fortran compiler... "Fortran
>> compiler not working"
>>
>>
>> Regards
>> Cesar
>>
>> Cesar Luis Avila wrote:
>>> Dear all,
>>> While running minimization for my system on AMD64 dual core, I
>>> always get stuck after step 199. The following is extracted from the log:
>>>
>>> ---------------------
>>> BRACKET: 1.20523e-06 0.239614 -230750 14968.7 528140
>>> ENERGY: 199 6081.6400 2704.6967 0.0000
>>> 488.2582 -183481.0235 -8652.9114 0.0000
>>> 0.0000 0.0000 -182859.3400 0.0000 -182859.3400
>>> -182859.3400 0.0000 -6250.9615 -5563.7037
>>> 864000.0000 -6250.9615 -5563.7037
>>>
>>> LDB: LOAD: AVG 76.2111 MAX 85.6205 MSGS: TOTAL 84 MAXC 21 MAXP 3
>>> None
>>> LDB: LOAD: AVG 76.2111 MAX 86.8742 MSGS: TOTAL 84 MAXC 21 MAXP 3
>>> Alg7
>>>
>>> -----------------------
>>>
>>> I wonder if it has something to do with the bug reported on the wiki
>>> by Jim Phillips:
>>>
>>>
>>> Load balancer hangs on dual-core Opteron clusters, including
>>> Cray XD1
>>>
>>> If so, do I have to build charm++, namd or both from sources?
>>>
>>> I am currently using Linux-amd64-TCP
>>> <http://www.ks.uiuc.edu/Development/Download/download.cgi?UserID=&AccessCode=&ArchiveID=792>
>>> binaries for version 2.6b1 downloaded from the NAMD website.
>>> I did not have the same problem while running the APOA1 benchmark.
>>>
>>> Regards
>>> Cesar
>>>
>>>
>>>
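
The negative wall time/step values mentioned in the 4 July message can be searched for mechanically instead of by eye. The sketch below is an illustration under assumptions: the helper name `negative_wall_steps` is hypothetical, and the layout of the TIMING lines ("TIMING: step CPU: total, per-step/step Wall: total, per-step/step, ...") is assumed rather than taken from this thread, so the pattern may need adjusting to the actual log.

```python
import re

# Plain or scientific-notation number, optionally signed.
NUM = r"[-+]?\d*\.?\d+(?:[eE][-+]?\d+)?"

# ASSUMED line layout (not shown in this thread):
# TIMING: <step>  CPU: <total>, <per step>/step  Wall: <total>, <per step>/step, ...
TIMING_RE = re.compile(
    rf"TIMING:\s+(\d+)\s+CPU:\s*{NUM},\s*({NUM})/step\s+Wall:\s*{NUM},\s*({NUM})/step"
)


def negative_wall_steps(log_text):
    """Return (step, wall_seconds_per_step) pairs whose wall time/step is negative."""
    hits = []
    for match in TIMING_RE.finditer(log_text):
        step = int(match.group(1))
        wall_per_step = float(match.group(3))
        if wall_per_step < 0:
            hits.append((step, wall_per_step))
    return hits
```

Calling `negative_wall_steps(open("run.log").read())` on a log (file name hypothetical) returns an empty list when no timing line is negative.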
