Fwd: PCIexpress 3.0 for MD with NAMD on GPUs

From: Francesco Pietra (chiendarret_at_gmail.com)
Date: Sun Nov 17 2013 - 09:06:51 CST

This addendum is to let you know that simply adding

options nvidia NVreg_EnablePCIeGen3=1

to /etc/modprobe.d/nvidia.conf

as suggested in

https://devtalk.nvidia.com/default/topic/545186/enabling-pcie-3-0-with-nvreg_enablepciegen3-on-titan/

had no effect. Also, please note that what should be added to the kernel
boot string, according to the same source, is

   nvidia.NVreg_EnablePCIeGen3=1

unlike what I wrote before (i.e., no "options", and a dot between nvidia and
NVreg).
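
For anyone wanting to try the boot-string route, here is a minimal sketch of how the parameter could be appended on a Debian-style GRUB 2 setup. It assumes GNU sed, that the kernel command line is set through GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, and that update-grub is available; the helper name add_boot_param is only illustrative and is not taken from the NVIDIA thread.

```shell
#!/bin/sh
# add_boot_param FILE PARAM  (hypothetical helper, not an NVIDIA or GRUB tool)
# Appends PARAM inside the quotes of GRUB_CMDLINE_LINUX_DEFAULT="..." in FILE,
# unless FILE already contains PARAM.
add_boot_param() {
    file="$1"
    param="$2"
    # Do nothing if the parameter is already present.
    grep -qF "$param" "$file" && return 0
    # Insert the parameter just before the closing double quote (GNU sed -i).
    sed -i "s/^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"/\1 $param\"/" "$file"
}

# Intended use, as root (assumption: Debian paths):
#   add_boot_param /etc/default/grub nvidia.NVreg_EnablePCIeGen3=1
#   update-grub          # regenerate grub.cfg
#   reboot
#   cat /proc/cmdline    # the parameter should now be listed
```

Checking /proc/cmdline after the reboot only confirms that the kernel received the parameter, not that the driver acted on it.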

francesco pietra

---------- Forwarded message ----------
From: Francesco Pietra <chiendarret_at_gmail.com>
Date: Sun, Nov 17, 2013 at 11:56 AM
Subject: Re: namd-l: PCIexpress 3.0 for MD with NAMD on GPUs
To: Thomas Albers <talbers_at_binghamton.edu>
Cc: Namd Mailing List <namd-l_at_ks.uiuc.edu>

Hello Thomas:
Thanks for sharing your benchmarks. It was very useful.

On my Gigabyte X79-UD3 with two GTX 680s, I replaced the Sandy Bridge
i7-3930K with the Ivy Bridge i7-4930K, and also replaced the 1066MHz RAM
with 1866MHz RAM:

# cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 62
model name : Intel(R) Core(TM) i7-4930K CPU @ 3.40GHz
stepping : 4
microcode : 0x416
cpu MHz : 1200.000
cache size : 12288 KB
physical id : 0
siblings : 12
core id : 0
cpu cores : 6
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx
pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl
xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor
ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic
popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb
xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep
erms
bogomips : 6800.08
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
power management:

(the same for processors 1-11)

# cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module 319.60 Wed Sep 25 14:28:26 PDT 2013
GCC version: gcc version 4.7.3 (Debian 4.7.3-8)

 **************************************

I observed no speed increase for NAMD 2.9 MD on a light job (150K atoms)
and only a few percent speed increase on a large job (500K atoms). All MD
simulations were carried out from the Linux prompt, without an X server,
activating the GPUs with:

# nvidia-smi -L
# nvidia-smi -pm 1

***************************************

With all such MDs, both the link capability (LnkCap) and the actual link
speed (LnkSta) turned out to be 5GT/s, as for PCIe 2.0.
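
Since the negotiated speed may differ between an idle and a loaded GPU, the two values are best sampled while the MD job is actually running. A minimal sketch follows, assuming lspci from pciutils is run as root (otherwise the capability registers are not shown) and that the card sits at bus address 02:00.0; the function names link_lines and watch_link are made up for illustration.

```shell
#!/bin/sh
# Keep only the LnkCap and LnkSta lines of `lspci -vv` output read on stdin.
# The trailing colon excludes related fields such as LnkSta2.
link_lines() {
    grep -E 'Lnk(Cap|Sta):'
}

# Bus address of the GPU to watch (assumption: adjust to your own system,
# e.g. from `lspci | grep VGA`).
GPU="${GPU:-02:00.0}"

# Print a timestamp and the link lines once per second until interrupted.
watch_link() {
    while true; do
        date +%T
        lspci -s "$GPU" -vv | link_lines
        sleep 1
    done
}
```

Running watch_link in a second terminal while namd2 is crunching, and stopping it with Ctrl-C, shows whether LnkSta ever rises above 5GT/s under load.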

I only observed a capability of 8GT/s when launching gnome:

# lspci -vvvv
02:00.0 VGA compatible controller: NVIDIA Corporation GK104 [GeForce GTX
680] (rev a1) (prog-if 00 [VGA controller])
    Subsystem: NVIDIA Corporation Device 0969
..............
        LnkCap: Port #0, Speed 8GT/s, Width x16, ASPM L0s L1, Latency L0
<512ns, L1 <4us

VGA compatible controller: NVIDIA Corporation GK104 [GeForce GTX 680] (rev
a1) (prog-if 00 [VGA controller])
    Subsystem: Micro-Star International Co., Ltd. Device 2820
.........................
        LnkCap: Port #0, Speed 8GT/s, Width x16, ASPM L0s L1, Latency L0
<512ns, L1 <4us
*****************************************
........................
LnkCap: Port #0, Speed 5GT/s, Width x16, ASPM L0s L1, Latency L0 <512ns,
L1 <4us

03:00.0 VGA compatible controller: NVIDIA Corporation GK104 [GeForce GTX
680] (rev a1) (prog-if 00 [VGA controller])
    Subsystem: Micro-Star International Co., Ltd. Device 2820
...................
LnkCap: Port #0, Speed 5GT/s, Width x16, ASPM L0s L1, Latency L0 <512ns,
L1 <4us
*******************************************

As far as I could find out, to activate PCIe 3.0 NVIDIA suggests either:
(1) Modifying /etc/modprobe.d/local.conf (which does not exist on my Debian
amd64 jessie) or creating a new /etc/modprobe.d/nvidia.conf, and adding to it

options nvidia NVreg_EnablePCIeGen3=1

Actually, on my jessie, nvidia.conf reads

alias nvidia nvidia-current
remove nvidia-current rmmod nvidia

Some users found that to have no effect, even when both grub-efi and
initramfs are updated accordingly, so NVIDIA offered (2) a different
approach: updating the kernel boot string by appending this:

options nvidia NVreg_EnablePCIeGen3=1

Could you advise on this? In the Gigabyte motherboard setup itself, I chose
"automatic", which correctly read the speed of the CPU and RAM. I found no
settings for PCIe, unless these require manual instead of automatic
configuration. I have no experience with manipulating the kernel as
suggested above by NVIDIA.
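
For reference, the modprobe.d route (1) could be sketched as below. This assumes a Debian system where update-initramfs rebuilds the initramfs and modprobe -c prints the effective configuration; the helper name add_option_line is illustrative only. Whether the option has any effect depends on the driver and the hardware, and the addendum at the top of this message reports none.

```shell
#!/bin/sh
# add_option_line CONF LINE  (hypothetical helper)
# Appends LINE to the modprobe configuration file CONF unless an identical
# line is already there; CONF is created if it does not exist.
add_option_line() {
    conf="$1"
    line="$2"
    grep -qxF "$line" "$conf" 2>/dev/null || printf '%s\n' "$line" >> "$conf"
}

# Intended use, as root (assumption: Debian paths and tools):
#   add_option_line /etc/modprobe.d/nvidia.conf \
#       'options nvidia NVreg_EnablePCIeGen3=1'
#   update-initramfs -u          # so the initramfs carries the new option
#   reboot
#   modprobe -c | grep NVreg     # the option should appear in the config dump
```

Appending rather than overwriting keeps the alias and remove lines that the distribution already placed in nvidia.conf.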

Thanks a lot
francesco pietra

PS: I did not try the NVIDIA tools to investigate the link speed, nor CPU-Z
(which is a 32-bit binary requiring installation of i386 libraries).
Installing those libraries would uninstall the 64-bit nvidia-smi.

On Sat, Nov 16, 2013 at 4:58 PM, Thomas Albers <talbers_at_binghamton.edu> wrote:

> Hello!
>
> > Which version of the nvidia driver is needed to activate PCIexpress 3.0
> > between the GPUs and RAM for MD with NAMD2.9 or NAMD2.10? As far as I can
> > remember, nvidia deactivated PCIe 3.0 for linux from version 295.xx until
> > at least 310.xx. Is that correct?
>
> I am using an i5-Ivy Bridge CPU and a GTX 660 GPU with Nvidia driver
> 304.43 and can confirm that PCI-e 3.0 works. (I have not done
> benchmarking to see if there is any speedup compared to PCI-e 2.0.)
>
> # cat /proc/cpuinfo
> processor : 0
> vendor_id : GenuineIntel
> cpu family : 6
> model : 58
> model name : Intel(R) Core(TM) i5-3570 CPU @ 3.40GHz
>
> # cat /proc/driver/nvidia/version
> NVRM version: NVIDIA UNIX x86_64 Kernel Module 304.43 Sun Aug 19
> 20:14:03 PDT 2012
> GCC version: gcc version 4.5.4 (Gentoo 4.5.4 p1.0, pie-0.4.7)
>
> # lspci -vvvv
> 01:00.0 VGA compatible controller: nVidia Corporation Device 11c0 (rev
> a1) (prog-if 00 [VGA controller])
> Subsystem: ZOTAC International (MCO) Ltd. Device 1281
> ....
> LnkSta: Speed 8GT/s, Width x16, TrErr- Train- SlotClk+
> DLActive- BWMgmt- ABWMgmt-
>
> What the driver does is fall back to PCI-e 2.0 when not under load, so
> one has to check while crunching numbers on the GPU. If the GPU is
> idle it reports a 5 GT/s transfer rate. I do not know if this
> behaviour is peculiar to Nvidia or part of the PCI-e standard.
>
> Hope that helps,
> Thomas
>
>
