VMD-L Mailing List
From: João Ribeiro (jribeiro_at_ks.uiuc.edu)
Date: Fri Feb 01 2019 - 08:59:31 CST
- Next message: jrhau lung: "Re: QwikMD and CUDA10"
- Previous message: jrhau lung: "Re: QwikMD and CUDA10"
- In reply to: jrhau lung: "Re: QwikMD and CUDA10"
- Next in thread: jrhau lung: "Re: QwikMD and CUDA10"
- Reply: jrhau lung: "Re: QwikMD and CUDA10"
Hi Jrhau,
Could you give us more details, such as the size of the system, the
solvent type (if any), and the number of CPUs that you are using? From
what you describe, the system must be quite small (at least for the GPU
in use).
Another thing to consider is how scattered your system is. Are you running
the simulation in implicit solvent or in vacuum? If not, it might be useful
to run the minimization and a short equilibration using the multicore
version and then run the rest of the simulation using the CUDA version, so
that the CUDA run avoids the sudden changes in box volume that occur during
the equilibration.
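
A minimal sketch of that two-stage idea, in Python, may help make it concrete. It only builds the command lines and does not launch NAMD. The flags are copied from the command shown in the quoted error trace; the function names, the binary paths, the core count, and the production config file name are hypothetical placeholders. It also assumes that each later config file reads the restart files written by the previous step.

```python
from typing import List, Sequence


def namd_command(namd2: str, conf: str, cores: int) -> List[str]:
    """Build a namd2 argument list with the flags seen in the quoted log."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return [namd2, "+idlepoll", "+setcpuaffinity", f"+p{cores}", conf]


def two_stage_plan(
    multicore_namd2: str,
    cuda_namd2: str,
    early_confs: Sequence[str],
    later_confs: Sequence[str],
    cores: int,
) -> List[List[str]]:
    """Early steps use the multicore binary, later steps the CUDA binary."""
    plan = [namd_command(multicore_namd2, conf, cores) for conf in early_confs]
    plan += [namd_command(cuda_namd2, conf, cores) for conf in later_confs]
    return plan


if __name__ == "__main__":
    # Hypothetical paths and file names; adjust them to the local setup.
    plan = two_stage_plan(
        "/path/to/multicore/namd2",
        "/path/to/multicore-CUDA/namd2",
        ["qwikmd_equilibration_0.conf"],
        ["qwikmd_production_0.conf"],
        cores=8,
    )
    for command in plan:
        print(" ".join(command))
```

Each printed line can then be run by hand, redirecting the output to a log file as QwikMD does.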
Best
João
On Thu, Jan 31, 2019 at 5:58 PM jrhau lung <jrhaulung_at_gmail.com> wrote:
> Hi Joao and John,
> Thanks for your guidance. I tried reducing the number of CPU cores used
> in the run, but the simulation does not last any longer and is eventually
> terminated automatically for the same reason.
> The error message is attached below. The problem persists even after
> switching to the nightly build Linux-x86_64-multicore-CUDA
> <https://www.ks.uiuc.edu/Development/Download/download.cgi?UserID=&AccessCode=&ArchiveID=1584>
> NAMD.
> The current system is running Ubuntu 16.04 LTS with VMD LinuxAMD64
> (1.9.4a12).
>
> sincerely
>
> Jrhau
>
> error message from running with fewer CPU cores
>
> ------------- Processor 16 Exiting: Called CmiAbort ------------
> Reason: FATAL ERROR: ComputeBondedCUDA::copyTupleData, invalid number of
> exclusions
> FATAL ERROR: See http://www.ks.uiuc.edu/Research/namd/bugreport.html
>
> Charm++ fatal error:
> FATAL ERROR: ComputeBondedCUDA::copyTupleData, invalid number of exclusions
> FATAL ERROR: See http://www.ks.uiuc.edu/Research/namd/bugreport.html
>
> Info) IMD connection ended unexpectedly; connection terminated.
>
> error message from running with the nightly build Linux-x86_64-multicore-CUDA
> <https://www.ks.uiuc.edu/Development/Download/download.cgi?UserID=&AccessCode=&ArchiveID=1584>
> NAMD.
>
> ------------ Processor 64 Exiting: Called CmiAbort ------------
> Reason: FATAL ERROR: ComputeBondedCUDA::copyTupleData, invalid number of
> exclusions
> FATAL ERROR: See http://www.ks.uiuc.edu/Research/namd/bugreport.html
>
> Charm++ fatal error:
> FATAL ERROR: ComputeBondedCUDA::copyTupleData, invalid number of exclusions
> FATAL ERROR: See http://www.ks.uiuc.edu/Research/namd/bugreport.html
>
> ------------- Processor 64 Exiting: Called CmiAbort ------------
> Reason: FATAL ERROR: ComputeBondedCUDA::copyTupleData, invalid number of
> exclusions
> FATAL ERROR: See http://www.ks.uiuc.edu/Research/namd/bugreport.html
>
> Charm++ fatal error:
> FATAL ERROR: ComputeBondedCUDA::copyTupleData, invalid number of exclusions
> FATAL ERROR: See http://www.ks.uiuc.edu/Research/namd/bugreport.html
>
> while executing
> "::exec
> /home/jrhau/Downloads/NAMD_Git-2019-01-31_Linux-x86_64-multicore-CUDA/namd2
> +idlepoll +setcpuaffinity +p96 qwikmd_equilibration_0.conf >> qwikm..."
> ("eval" body line 1)
> invoked from within
> "eval ::exec [list $exec_path] [lrange $args 1 end]"
> (procedure "::ExecTool::exec" line 14)
> invoked from within
> "::ExecTool::exec namd2 +idlepoll +setcpuaffinity +p96
> qwikmd_equilibration_0.conf >> qwikmd_equilibration_0.log"
> ("eval" body line 1)
> invoked from within
> "eval ::ExecTool::exec $exec_command >> $conf.log"
> (procedure "QWIKMD::Run" line 250)
> invoked from within
> "QWIKMD::Run"
> invoked from within
> ".qwikmd.nbinput.f1.fcontrol.fcolapse.f1.run.button_Calculate invoke "
> invoked from within
> ".qwikmd.nbinput.f1.fcontrol.fcolapse.f1.run.button_Calculate instate
> {pressed !disabled} {
> .qwikmd.nbinput.f1.fcontrol.fcolapse.f1.run.button_Calculat..."
> (command bound to event)
>
> On Thu, Jan 31, 2019 at 11:04 PM João Ribeiro <jribeiro_at_ks.uiuc.edu> wrote:
>
>> Hi Jrhau,
>>
>> Thank you for reporting the error. The error that you are seeing is
>> related to the system size and the number of CPU cores plus GPU that you
>> selected to run your simulation. I would guess that the system you are
>> running is not big enough to justify many cores plus a GPU. Reduce the
>> number of CPU cores and the simulation should run smoothly.
>>
>> Now, on the performance of the multicore version of NAMD, did you run
>> exactly the same configuration file?
>>
>> Please allow me to add some notes about the configuration files produced
>> by QwikMD. QwikMD selects many of the MD parameters in the config files
>> with teaching in mind, namely a high frequency of energy output and of
>> trajectory saving (dcd freq). This is useful when you are starting out
>> with simulations but carries a high penalty in NAMD performance. Running
>> your simulations in "Live view" mode (Interactive Molecular Dynamics
>> activated) also decreases performance substantially, because NAMD and VMD
>> communicate frequently.
>>
>> In summary, if you are trying to squeeze the most ns/day from your
>> machine, please increase the intervals between saved frames (dcd freq) and
>> between output events (outputpressure, outputenergies, etc.), and run your
>> simulations in the background (Live view off == IMDon off).
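
As an illustration of the summary above, the following Python sketch rewrites the value of selected keywords in the text of a NAMD configuration file. The keywords are the ones named in the message; the helper name `set_namd_keywords`, the sample config lines, and the chosen values (5000 steps) are assumptions for demonstration only, not recommended settings. It handles only the plain whitespace-separated `keyword value` form, leaves keywords that are absent from the file untouched, and drops any trailing comment on a line it rewrites.

```python
import re
from typing import Dict

# Matches "keyword value" lines; lines starting with "#" do not match.
KEYWORD_LINE = re.compile(r"^(\s*)([A-Za-z_]\w*)(\s+)\S")


def set_namd_keywords(conf_text: str, settings: Dict[str, object]) -> str:
    """Return conf_text with the values of the given keywords replaced.

    Keyword matching is case-insensitive; the spelling already used in the
    file is kept. Keywords not present in the file are not added.
    """
    wanted = {name.lower(): str(value) for name, value in settings.items()}
    out = []
    for line in conf_text.splitlines():
        match = KEYWORD_LINE.match(line)
        if match and match.group(2).lower() in wanted:
            indent, name, sep = match.group(1), match.group(2), match.group(3)
            out.append(f"{indent}{name}{sep}{wanted[name.lower()]}")
        else:
            out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Sample lines are illustrative, not copied from a real QwikMD file.
    sample = "dcdfreq 100\noutputenergies 40\noutputpressure 40\nIMDon on\n"
    print(set_namd_keywords(sample, {
        "dcdfreq": 5000,          # save trajectory frames less often
        "outputEnergies": 5000,   # print energies less often
        "outputPressure": 5000,   # print pressure less often
        "IMDon": "off",           # disable Live view / IMD
    }))
```

Editing the generated config by hand in a text editor achieves the same result; the script form is only convenient when several config files need the same change.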
>>
>> I hope this helps and I am also copying the NAMD developers on this
>> thread so they can comment further on the NAMD issues and how to improve
>> NAMD performance.
>>
>> Best
>>
>> Joao
>>
>>
>>
>> On Thu, Jan 31, 2019 at 5:44 AM jrhau lung <jrhaulung_at_gmail.com> wrote:
>>
>>> Dear VMD friends:
>>> In order to run simulations on a new GeForce RTX20 video card, NAMD
>>> was compiled from the nightly build Git source to generate a
>>> Linux-x86_64-multicore-CUDA
>>> <https://www.ks.uiuc.edu/Development/Download/download.cgi?UserID=&AccessCode=&ArchiveID=1584>
>>> build of NAMD, following the process recommended in the release notes. The
>>> compilation finished without any errors, and it should be sound because the
>>> Linux-x86_64-multicore
>>> <https://www.ks.uiuc.edu/Development/Download/download.cgi?UserID=&AccessCode=&ArchiveID=1584>
>>> version of NAMD was also compiled before generating the CUDA-enabled version,
>>> and the multicore version works fine with QwikMD in MD simulations.
>>> Unfortunately, when running an MD simulation with the self-built CUDA NAMD,
>>> the simulation aborted shortly after launch with the following messages. Any
>>> suggestions and hints would be highly appreciated.
>>>
>>> Info) Using multithreaded IMD implementation.
>>> ------------- Processor 64 Exiting: Called CmiAbort ------------
>>> Reason: FATAL ERROR: ComputeBondedCUDA::copyTupleData, invalid number of
>>> exclusions
>>> FATAL ERROR: See http://www.ks.uiuc.edu/Research/namd/bugreport.html
>>>
>>> Charm++ fatal error:
>>> FATAL ERROR: ComputeBondedCUDA::copyTupleData, invalid number of
>>> exclusions
>>> FATAL ERROR: See http://www.ks.uiuc.edu/Research/namd/bugreport.html
>>>
>>> Info) IMD connection ended unexpectedly; connection terminated.
>>>
>>> Another issue on which I would like your comments is that the simulation
>>> speed using the self-built Linux-x86_64-multicore
>>> <https://www.ks.uiuc.edu/Development/Download/download.cgi?UserID=&AccessCode=&ArchiveID=1584>
>>> NAMD is significantly slower than that of the 2.13 multicore version. What
>>> would be the potential causes of this? Is it related to the compiling
>>> tools or libraries? Thanks.
>>>
>>> sincerely,
>>>
>>> Jrhau
>>>
>>>
>>>
>>
>> --
>> ……………………………………………………...
>> João Vieira Ribeiro
>> Theoretical and Computational Biophysics Group
>> Beckman Institute, University of Illinois
>> http://www.ks.uiuc.edu/~jribeiro/
>> jribeiro_at_ks.uiuc.edu
>> +1 (217) 3005851
>>
>
--
……………………………………………………...
João Vieira Ribeiro
Theoretical and Computational Biophysics Group
Beckman Institute, University of Illinois
http://www.ks.uiuc.edu/~jribeiro/
jribeiro_at_ks.uiuc.edu
+1 (217) 3005851