Re: Compiling NAMD on AMD64 dual core

From: Cesar Luis Avila (cavila_at_fbqf.unt.edu.ar)
Date: Mon Jul 10 2006 - 10:57:59 CDT

Dear all,
I have recently recompiled NAMD for amd64 using gcc, tcp and
net-linux-amd64. Now I am trying to run a minimization on 4
processors across 2 nodes and it hangs with the following output.

ENERGY: 5991 6048.5143 3249.3845 0.0000
12.0108 -231731.3866 -2506.2691 0.0000
0.0000 0.0000 -224927.7461 0.0000
-224927.7461 -224927.7461 0.0000 -5379.9655
-5348.7712 1048576.0000 -5379.9655 -5348.7712

BRACKET: 1.3589e-08 0.00213979 27397.5 27397.5 280682
NEW SEARCH DIRECTION
INITIAL STEP: 5e-07
GRADIENT TOLERANCE: 810.801
------------- Processor 2 Exiting: Caught Signal ------------
Signal: segmentation violation
Suggestion: Try running with '++debug', or linking with '-memory paranoid'.
Fatal error on PE 2> segmentation violation
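
For anyone triaging a similar log, the following is a minimal Python sketch that reports the last ENERGY step reached and which processor caught which signal. The patterns are taken only from the excerpt above; the function name summarize_log and the sample text are illustrative assumptions, and other NAMD or charm++ builds may format these lines differently.

```python
import re

# Patterns modelled on the log excerpt in this message (an assumption:
# other NAMD/charm++ builds may print these lines differently).
ENERGY_RE = re.compile(r"^ENERGY:\s+(\d+)\b")
EXIT_RE = re.compile(r"Processor\s+(\d+)\s+Exiting:\s+Caught Signal")
SIGNAL_RE = re.compile(r"^Signal:\s+(.+)$")


def summarize_log(text):
    """Return (last ENERGY step seen, list of (processor, signal) crash reports)."""
    last_step = None
    crashes = []
    pending_pe = None  # processor number waiting for its "Signal:" line
    for raw in text.splitlines():
        line = raw.strip()
        m = ENERGY_RE.match(line)
        if m:
            last_step = int(m.group(1))
            continue
        m = EXIT_RE.search(line)
        if m:
            pending_pe = int(m.group(1))
            continue
        m = SIGNAL_RE.match(line)
        if m and pending_pe is not None:
            crashes.append((pending_pe, m.group(1).strip()))
            pending_pe = None
    return last_step, crashes


# Hypothetical sample assembled from the lines quoted above.
sample = """\
ENERGY: 5991 6048.5143 3249.3845 0.0000
12.0108 -231731.3866 -2506.2691 0.0000
GRADIENT TOLERANCE: 810.801
------------- Processor 2 Exiting: Caught Signal ------------
Signal: segmentation violation
Fatal error on PE 2> segmentation violation
"""

print(summarize_log(sample))
```

Running this over the logs of several attempts would show whether the crash always happens on the same processor or at the same minimization step, which may help separate a hardware or node problem from a build problem.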

I have recompiled NAMD once more without tcp and this time the error was
even worse: not only the process but the whole node hung. I could
only read on the monitor:
not syncing, Kernel panic: Aiee, killing interrupt handler!.

How may I solve this?

Regards
Cesar

Jim Phillips wrote:
>
> gcc and net-linux-amd64 should work fine. Ignore the Fortran warnings.
>
> -Jim
>
>
> On Tue, 4 Jul 2006, Cesar Luis Avila wrote:
>
>> I have checked outputtimings and indeed I have negative values for
>> some wall time/step.
>>
>> I am now trying to recompile charm++/NAMD for a cluster of dual core
>> AMD64 connected through gigabit ethernet. I understand that the best
>> choice for charm++ compiling is to use net-linux-amd64 instead of
>> mpi-linux-amd64 for this kind of connection. Is this correct?
>> I would also like to know which compilers work better for this
>> architecture. I recall that some groups suggested using Intel C and
>> Fortran compilers on AMD64, but there were also some problems with
>> them. So, others suggested using GNU compilers. I currently have
>> gcc and g77 installed on my system, although g77 seems to be useless
>> for building charm++.
>>
>> During charm autoconfigure I get the message:
>> checking subroutine name used by Fortran compiler... "Fortran
>> compiler not working"
>>
>>
>> Regards
>> Cesar
>>
>> Cesar Luis Avila wrote:
>>> Dear all,
>>> While running a minimization for my system on AMD64 Dual Core, I
>>> always get stuck after step 199. The following is extracted from the log:
>>>
>>> ---------------------
>>> BRACKET: 1.20523e-06 0.239614 -230750 14968.7 528140
>>> ENERGY: 199 6081.6400 2704.6967 0.0000
>>> 488.2582 -183481.0235 -8652.9114 0.0000
>>> 0.0000 0.0000 -182859.3400 0.0000 -182859.3400
>>> -182859.3400 0.0000 -6250.9615 -5563.7037
>>> 864000.0000 -6250.9615 -5563.7037
>>>
>>> LDB: LOAD: AVG 76.2111 MAX 85.6205 MSGS: TOTAL 84 MAXC 21 MAXP 3
>>> None
>>> LDB: LOAD: AVG 76.2111 MAX 86.8742 MSGS: TOTAL 84 MAXC 21 MAXP 3
>>> Alg7
>>>
>>> -----------------------
>>>
>>> I wonder if it has something to do with the bug reported on the wiki
>>> by Jim Phillips.
>>>
>>>
>>> Load balancer hangs on dual-core Opteron clusters, including
>>> Cray XD1
>>>
>>> If so, do I have to build charm++, NAMD, or both from sources?
>>>
>>> I am currently using Linux-amd64-TCP
>>> <http://www.ks.uiuc.edu/Development/Download/download.cgi?UserID=&AccessCode=&ArchiveID=792>
>>> binaries for version 2.6b1 downloaded from the NAMD website.
>>> I did not have the same problem while running the benchmark APOA1.
>>>
>>> Regards
>>> Cesar
>>>
>>>
>>>
>>
>

This archive was generated by hypermail 2.1.6 : Wed Feb 29 2012 - 15:43:48 CST