Re: Fwd: Fwd: namd on gtx-680

From: Axel Kohlmeyer (akohlmey_at_gmail.com)
Date: Wed Jun 20 2012 - 13:52:21 CDT

On Wed, Jun 20, 2012 at 1:57 PM, Francesco Pietra <chiendarret_at_gmail.com> wrote:
> Sorry, I previously forgot the list
>
>
> ---------- Forwarded message ----------
> From: Francesco Pietra <chiendarret_at_gmail.com>
> Date: Wed, Jun 20, 2012 at 7:37 PM
> Subject: Re: namd-l: Fwd: Fwd: namd on gtx-680
> To: Axel Kohlmeyer <akohlmey_at_gmail.com>
>
>
> I removed the Debian package "libcudart4", commented out the
> LD_LIBRARY_PATH export in my .bashrc, and ran
>
> export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/namd-cuda-2.9_2012-06-20
>
> then
>
> charmrun ... (as before)
>
>
>
> The simulation is now running, and we will be able to compare the
> same number of steps (500,000) of the same equilibration that was
> carried out at an earlier stage with two GTX-580. The system is not
> very large, so we cannot expect any large advantage from the 680, IF
> ANY. The GPU hardware support must also be accounted for; whatever
> the manufacturers say, Sandy Bridge (even -E, I suspect) is PCI
> Express 2.0. With the lower memory bandwidth of the 680 we even risk
> them being slower than the 580. Personally, I hope that one day MD
> codes will move to OpenCL, so that AMD hardware can be used (from
> admittedly limited experience, I have always been lucky with AMD; I
> hold no shares in the company, I am just a user). If nothing else,
> they are cheaper: I don't expect any advantage from the present i7
> over the previous Phenom, although the price is nearly quadruple
> (with much more heat and energy usage).
>
> So, as Axel suspected, the Debian lib is very likely not compatible
> (I am using the amd64 testing branch, i.e. with the latest libraries;
> maybe they are too recent for the NAMD build).
>
> Both GPU temperatures (65-70 °C) and memory usage are very similar to
> the 580 over the last few milder days (hot days here now, and no air
> conditioning, although it is a historic, solid building).
>
> Incidentally, maybe because of the heat, I had tried unsuccessfully
> to tune my .bashrc:
>
> # For NAMD-CUDA 2.9 nightly build
> NAMD_HOME=/usr/local/namd-cuda-2.9_2012-06-20
> PATH=$PATH:$NAMD_HOME/bin/namd2; export NAMD_HOME PATH
> PATH="/usr/local/namd-cuda-2.9_2012-06-20/bin:$PATH"; export PATH
>
> if [ "$LD_LIBRARY_PATH" ] ; then
>   export LD_LIBRARY_PATH="/usr/local/namd-cuda-2.9_2012-06-20"
> else
>   export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/namd-cuda-2.9_2012-06-20"
> fi
>
> Thanks a lot again (and also for detecting the mistakes in the if ...,

your "fix" is wrong. the right fix is to keep the order of the if
clauses but switch the order in the path to:

export LD_LIBRARY_PATH="/usr/local/namd-cuda-2.9_2012-06-20:$LD_LIBRARY_PATH"

that will search the NAMD directory first.
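
i.e. the complete block, keeping your if/else structure, would look
like this (a sketch, assuming the same /usr/local install path from
your .bashrc):

if [ "$LD_LIBRARY_PATH" ] ; then
  # prepend, so the NAMD-bundled runtime wins over anything else
  export LD_LIBRARY_PATH="/usr/local/namd-cuda-2.9_2012-06-20:$LD_LIBRARY_PATH"
else
  export LD_LIBRARY_PATH="/usr/local/namd-cuda-2.9_2012-06-20"
fi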

even better is to call namd from a wrapper script and hide
the LD_LIBRARY_PATH fix in there, so that other applications
are not redirected to the NAMD-bundled cuda runtime, which
may be from a beta version that is not fully backward
compatible and not fully tested.
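
a minimal wrapper sketch (script name and install path are just
placeholders here, adjust them to your setup):

#!/bin/sh
# runnamd-cuda: launch the NAMD nightly build with its bundled
# cuda runtime, without changing LD_LIBRARY_PATH for anything else.
NAMD_HOME=/usr/local/namd-cuda-2.9_2012-06-20
export LD_LIBRARY_PATH="$NAMD_HOME:$LD_LIBRARY_PATH"
exec "$NAMD_HOME/bin/namd2" "$@"

then point charmrun at the wrapper instead of at namd2 directly.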

axel.

>
> francesco pietra
>
> On Wed, Jun 20, 2012 at 5:11 PM, Axel Kohlmeyer <akohlmey_at_gmail.com> wrote:
>> On Wed, Jun 20, 2012 at 10:38 AM, Francesco Pietra
>> <chiendarret_at_gmail.com> wrote:
>>> Hi Vignesh:
>>>
>>> In my .bashrc:
>>>
>>> # For NAMD-CUDA 2.9 nightly build
>>> NAMD_HOME=/usr/local/namd-cuda-2.9_2012-06-20
>>> PATH=$PATH:$NAMD_HOME/bin/namd2; export NAMD_HOME PATH
>>> PATH="/usr/local/namd-cuda-2.9_2012-06-20/bin:$PATH"; export PATH
>>>
>>> if [ "$LD_LIBRARY_PATH" ] ; then
>>>   export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/namd-cuda-2.9_2012-06-20"
>>
>> this is a mistake. the NAMD directory has to come first,
>> so LD_LIBRARY_PATH points to the version that ships
>> with NAMD and not the (likely incompatible) one you
>> installed elsewhere.
>>
>> axel.
>>
>>> else
>>>   export LD_LIBRARY_PATH="/usr/local/namd-cuda-2.9_2012-06-20"
>>> fi
>>>
>>> If I understand correctly, this is practically a manual link. If this
>>> is not what you meant, please instruct me.
>>>
>>> I must add that when I started (I had reinstalled the amd64 RAID1
>>> system, because the previous installation had been stressed so much),
>>> running charmrun gave me the error message:
>>>
>>> "Error while loading shared libraries: libcudart.so.4"
>>>
>>> I resolved this by installing the Debian amd64 package libcudart4.
>>> From this, it seems to me that, under Debian, libcudart.so.4 is seen
>>> as a shared library (to reinforce my point above). If I am wrong,
>>> please correct me.
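
(a quick way to see which libcudart.so.4 a given namd2 binary will
actually load is to run ldd on it, e.g.:

  ldd /usr/local/namd-cuda-2.9_2012-06-20/bin/namd2 | grep libcudart

with LD_LIBRARY_PATH set up as described above, the path printed there
should point into the NAMD directory, not at the Debian-packaged copy.)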
>>>
>>> francesco
>>>
>>>
>>> On Wed, Jun 20, 2012 at 4:05 PM, Vignesh <vignesh757_at_gmail.com> wrote:
>>>> Francesco,
>>>>
>>>> Did you try manually linking the libcudart.so.4 file before the run? I had
>>>> similar issues with a GTX 560, two cards on one machine.
>>>>
>>>> When I let it run with the defaults using both cards, I got the same error,
>>>> prohibited mode ..., but when I specified the two cards in the command, it
>>>> gave me a link error. All of this was fixed by manually linking the .so.4
>>>> file before the run.
>>>>
>>>> export LD_LIBRARY_PATH=/../Namd_2.9-cuda
>>>>
>>>> Hope this helps. Let us know
>>>>
>>>> Vignesh
>>>>
>>>> On Jun 20, 2012 6:11 AM, "Francesco Pietra" <chiendarret_at_gmail.com> wrote:
>>>>>
>>>>> Resent because the accademia server is often unreliable. Sorry.
>>>>>
>>>>>
>>>>> ---------- Forwarded message ----------
>>>>> From: Francesco Pietra <francesco.pietra_at_accademialucchese.it>
>>>>> Date: Wed, Jun 20, 2012 at 12:08 PM
>>>>> Subject: Re: Fwd: namd on gtx-680
>>>>> To: Jim Phillips <jim_at_ks.uiuc.edu>, NAMD <namd-l_at_ks.uiuc.edu>
>>>>>
>>>>>
>>>>> Hi Jim:
>>>>>
>>>>> My attempts at running namd-cuda 2.9, nightly build 2012-06-20, on
>>>>> GTX-680 have failed. I used the same procedure as with my previous
>>>>> Gigabyte 890FXA / AMD Phenom II / two GTX-580 machine. The latter had
>>>>> no problems, and I now tried to continue the already well-advanced
>>>>> equilibration of a protein in a water box, carried out with the
>>>>> parm7 force field.
>>>>>
>>>>> HARDWARE: Gigabyte X79-UD3 , Intel i7-3930k, two GTX-680 on x16 lanes.
>>>>>
>>>>> OS: Debian amd64 wheezy (testing), cuda driver 295.53-1.
>>>>>
>>>>> COMMANDS:
>>>>> - nvidia-smi -L
>>>>> GPU 0: GeForce GTX 680 (UUID: N/A)
>>>>> GPU 1 : as above
>>>>>
>>>>> - nvidia-smi -pm 1
>>>>> Enabled persistence mode for GPU 0000:02:00.0.
>>>>> Enabled persistence mode for GPU 0000:03:00.0.
>>>>>
>>>>> - charmrun $NAMD_HOME/bin/namd2 xxx.conf +p6 +idlepoll 2>&1 | tee xxx.log
>>>>>
>>>>> The log file says:
>>>>> Running command: /usr/local/namd-cuda-2.9_2012-06-20/bin/namd2
>>>>> press-03.conf +p6 +idlepoll
>>>>>
>>>>> Charm++: standalone mode (not using charmrun)
>>>>> Converse/Charm++ Commit ID: v6.4.0-beta1-0-g5776d21
>>>>> CharmLB> Load balancer assumes all CPUs are same.
>>>>> Charm++> Running on 1 unique compute nodes (12-way SMP).
>>>>> Charm++> cpu topology info is gathered in 0.001 seconds.
>>>>> Info: NAMD CVS-2012-06-20 for Linux-x86_64-multicore-CUDA
>>>>> Info:
>>>>> Info: Please visit http://www.ks.uiuc.edu/Research/namd/
>>>>> Info: for updates, documentation, and support information.
>>>>> Info:
>>>>> Info: Please cite Phillips et al., J. Comp. Chem. 26:1781-1802 (2005)
>>>>> Info: in all publications reporting results obtained with NAMD.
>>>>> Info:
>>>>> Info: Based on Charm++/Converse 60400 for multicore-linux64-iccstatic
>>>>> Info: Built Wed Jun 20 02:24:32 CDT 2012 by jim on lisboa.ks.uiuc.edu
>>>>> Info: 1 NAMD  CVS-2012-06-20  Linux-x86_64-multicore-CUDA  6    gig64  francesco
>>>>> Info: Running on 6 processors, 1 nodes, 1 physical nodes.
>>>>> Info: CPU topology information available.
>>>>> Info: Charm++/Converse parallel runtime startup completed at 0.0799868 s
>>>>> FATAL ERROR: CUDA error on Pe 5 (gig64 device 0): All CUDA devices are
>>>>> in prohibited mode, of compute capability 1.0, or otherwise unusable.
>>>>> ------------- Processor 5 Exiting: Called CmiAbort ------------
>>>>> Reason: FATAL ERROR: CUDA error on Pe 5 (gig64 device 0): All CUDA
>>>>> devices are in prohibited mode, of compute capability 1.0, or
>>>>> otherwise unusable.
>>>>>
>>>>> FATAL ERROR: CUDA error on Pe 1 (gig64 device 0): All CUDA devices are
>>>>> in prohibited mode, of compute capability 1.0, or otherwise unusable.
>>>>> Program finished.
>>>>> FATAL ERROR: CUDA error on Pe 4 (gig64 device 0): All CUDA devices are
>>>>> in prohibited mode, of compute capability 1.0, or otherwise unusable.
>>>>> *************
>>>>>
>>>>> Would it be safer not to sell the GTX-580 and to set the GTX-680
>>>>> aside for the future? Or is there anything that can be changed in my
>>>>> procedure? I am not a pessimist; it is just that I am still in time
>>>>> to take the GTX-580 back from the vendor where I made the deal.
>>>>>
>>>>> Thanks a lot for advising me urgently (because of the said deal; I
>>>>> beg your pardon for asking for urgency).
>>>>>
>>>>> francesco Pietra
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Jun 7, 2012 at 11:52 PM, Jim Phillips <jim_at_ks.uiuc.edu> wrote:
>>>>> >
>>>>> > Yes it can.  -Jim
>>>>> >
>>>>> > On Thu, 7 Jun 2012, David Brandon wrote:
>>>>> >
>>>>> >> Below came into biocore email:
>>>>> >>
>>>>> >>
>>>>> >> -------- Original Message --------
>>>>> >> Subject: namd on gtx-680
>>>>> >> Date: Thu, 7 Jun 2012 09:04:45 +0200
>>>>> >> From: Francesco Pietra <francesco.pietra_at_accademialucchese.it>
>>>>> >> To: biocore_at_ks.uiuc.edu
>>>>> >>
>>>>> >> Hi:
>>>>> >>
>>>>> >> May I ask whether namd 2.9, or a nightly build, can run on the nvidia
>>>>> >> GTX-680? (I am on Debian GNU/Linux amd64.)
>>>>> >>
>>>>> >> I could not obtain the information from the general forum, nor was I
>>>>> >> able to find the question posed to biocore.
>>>>> >>
>>>>> >> Because of trivial commercial constraints, I am short of time in
>>>>> >> deciding whether to exchange my two GTX-580 for two GTX-680.
>>>>> >>
>>>>> >> Thanks a lot
>>>>> >>
>>>>> >> francesco pietra
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >> Francesco Pietra
>>>>> >> Professor of Chemistry
>>>>> >> Accademia Lucchese di Scienze, Lettere e Arti, founded in 1584
>>>>> >> Palazzo Ducale
>>>>> >> I-55100 Lucca
>>>>> >> tel/fax +39 0583 417336
>>>>> >>
>>>>> >> --
>>>>> >> David P. Brandon, Manager
>>>>> >> Theoretical and Computational Biophysics Group
>>>>> >> Center for Macromolecular Modeling & Bioinformatics
>>>>> >> 3027 Beckman Institute
>>>>> >> 405 N. Mathews Avenue
>>>>> >> Urbana, IL  61801
>>>>> >> ph:  217-265-6480
>>>>> >> fax: 217-244-6078
>>>>> >> site: www.ks.uiuc.edu
>>>>> >>
>>>>> >
>>>>>
>>>>
>>>
>>
>>
>>
>> --
>> Dr. Axel Kohlmeyer
>> akohlmey_at_gmail.com  http://goo.gl/1wk0
>>
>> College of Science and Technology
>> Temple University, Philadelphia PA, USA.

-- 
Dr. Axel Kohlmeyer
akohlmey_at_gmail.com  http://goo.gl/1wk0
College of Science and Technology
Temple University, Philadelphia PA, USA.

This archive was generated by hypermail 2.1.6 : Mon Dec 31 2012 - 23:21:41 CST