The HIP version of NAMD gets wrong results when computing on more than one node

From: 张驭洲 (zhangyuzhou15_at_mails.ucas.edu.cn)
Date: Fri Jul 03 2020 - 04:58:16 CDT

Hello,

I noticed that there is a HIP version of NAMD in the NAMD Gerrit repository. I tried it with the apoa1 and stmv benchmarks. The results on a single node with multiple GPUs seem right, but when using more than one node, the total energy keeps increasing, and sometimes the run even crashes because atoms are moving too fast. I used the ucx-linux-x86_64-ompipmix-smp build of charm-6.10.1. Could anyone give me some hints about this problem?
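
One way to quantify the energy increase is to pull the TOTAL column out of the NAMD log of each run and compare the single-node and multi-node cases. The following is only a minimal sketch in Python. It assumes the usual log layout, where an "ETITLE:" header line names the columns and each "ENERGY:" line carries the values. The function names and the sample lines are hypothetical and are not taken from the runs described above.

```python
def total_energies(log_lines):
    """Return (timestep, TOTAL energy) pairs found in NAMD log lines.

    The TOTAL column is located from the ETITLE: header rather than
    hard-coded, so abbreviated or extended column sets still work.
    """
    total_col = None
    series = []
    for line in log_lines:
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "ETITLE:" and total_col is None:
            # Exact match, so "TOTAL3" is not picked up by mistake.
            total_col = fields.index("TOTAL")
        elif fields[0] == "ENERGY:" and total_col is not None:
            series.append((int(fields[1]), float(fields[total_col])))
    return series


def drift_per_step(series):
    """Average change in TOTAL energy per timestep, first to last record."""
    (t0, e0), (t1, e1) = series[0], series[-1]
    return (e1 - e0) / (t1 - t0)


# Made-up sample lines for illustration only (columns abbreviated).
sample = [
    "ETITLE: TS BOND KINETIC TOTAL TEMP",
    "ENERGY: 0 100.0 50.0 -1000.0 300.0",
    "ENERGY: 100 101.0 60.0 -900.0 310.0",
]
```

Running the same two functions over the lines of a single-node log and a multi-node log of the same benchmark gives two drift values that can be compared directly. A drift near zero in one and a steadily positive drift in the other would match the behaviour described above.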

Sincerely,

Zhang
