Re: Namd-3 alpha 7 error

From: Dr. Eddie (eackad_at_gmail.com)
Date: Wed Feb 24 2021 - 14:38:26 CST

I'll do so privately.
Thanks!
Eddie

On Wed, Feb 24, 2021 at 1:28 PM David Hardy <dhardy_at_ks.uiuc.edu> wrote:

> Hi Eddie,
>
> This looks similar to the error that Lorenzo recently reported.
>
> Would you be willing to share your data set for us to try to reproduce
> this error locally? Although we don’t have your exact hardware setup, we do
> have a few multi-GPU platforms in-house that we could try.
>
> Best regards,
> Dave
>
> --
> David J. Hardy, Ph.D.
> Beckman Institute
> University of Illinois at Urbana-Champaign
> 405 N. Mathews Ave., Urbana, IL 61801
> dhardy_at_ks.uiuc.edu, http://www.ks.uiuc.edu/~dhardy/
>
> On Feb 23, 2021, at 11:49 AM, Dr. Eddie <eackad_at_gmail.com> wrote:
>
> Hello,
> I'm trying to get a small 150k system working with namd3.
> I am using four GTX 1080s with the command:
> nice -n 5 /home/eddie/binaries/NAMD_3.0alpha7_Linux-x86_64-multicore-CUDA/namd3 +p4 +idlepoll +setcpuaffinity +devices 0,1,2,3 step5_production.inp
>
> I get the error:
> FATAL ERROR: CUDA error cub::DeviceSelect::If(d_temp_storage,
> temp_storage_bytes, hgi, hgi, d_nHG, natoms, notZero(), stream) in file
> src/SequencerCUDAKernel.cu, function buildRattleLists, line 4461
> on Pe 0 (node10.cl.siue.edu device 0 pci 0:2:0): invalid device function
>
> and
> CUDANBOND[2]: Allocating patch data structure with 87 patches!
> CUDANBOND[3]: Allocating patch data structure with 101 patches!
> CUDANBOND[1]: Allocating patch data structure with 89 patches!
> CUDANBOND[0]: Allocating patch data structure with 114 patches!
> FATAL ERROR: CUDA error cub::DeviceSelect::If(d_temp_storage,
> temp_storage_bytes, hgi, hgi, d_nHG, natoms, notZero(), stream) in file
> src/SequencerCUDAKernel.cu, function buildRattleLists, line 4461
> on Pe 0 (node10.cl.siue.edu device 0 pci 0:2:0): invalid device function
>
> Any ideas? I know these are commercial GPUs; is that an issue?
> Thanks,
> Eddie
>
>
>
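
For anyone comparing this report with similar ones, here is a minimal sketch that pulls the kernel source file, function, line number, device, and CUDA error string out of a FATAL ERROR message like the one quoted above. The regular expression is modeled only on the text in this thread, so its field layout is an assumption rather than a documented NAMD log format, and the name `parse_cuda_fatal` is hypothetical. The string "invalid device function" is the CUDA runtime's message for a kernel that does not exist or was not compiled for the device's architecture; whether that is the cause here is not confirmed in this thread.

```python
import re

# Field layout inferred from the single FATAL ERROR message in this thread;
# it is an assumption, not a documented NAMD log format.
FATAL_RE = re.compile(
    r"FATAL ERROR: CUDA error (?P<call>.+?) in file (?P<file>\S+), "
    r"function (?P<function>\w+), line (?P<line>\d+) "
    r"on Pe (?P<pe>\d+) \((?P<host>\S+) device (?P<device>\d+) "
    r"pci (?P<pci>[0-9a-fA-F:]+)\): (?P<reason>.+)"
)


def parse_cuda_fatal(text):
    """Return a dict of fields from one NAMD CUDA fatal error, or None."""
    # Mail clients wrap the message across lines, so collapse all
    # whitespace runs to single spaces before matching.
    flat = " ".join(text.split())
    match = FATAL_RE.search(flat)
    if match is None:
        return None
    info = match.groupdict()
    # Convert the numeric fields from strings to integers.
    for key in ("line", "pe", "device"):
        info[key] = int(info[key])
    return info


# The message from this thread, with quote markers and the link wrapper removed.
SAMPLE = """FATAL ERROR: CUDA error cub::DeviceSelect::If(d_temp_storage,
temp_storage_bytes, hgi, hgi, d_nHG, natoms, notZero(), stream) in file
src/SequencerCUDAKernel.cu, function buildRattleLists, line 4461
on Pe 0 (node10.cl.siue.edu device 0 pci 0:2:0): invalid device function
"""

info = parse_cuda_fatal(SAMPLE)
print(info["file"], info["function"], info["line"], info["reason"])
```

Collapsing whitespace first keeps the pattern simple and makes it tolerant of however the mail client wrapped the text.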

-- 
Eddie
