Fwd: PCIexpress 3.0 for MD with NAMD on GPUs

From: Francesco Pietra (chiendarret_at_gmail.com)
Date: Tue Nov 19 2013 - 02:50:42 CST

OK, got PCIe 3.0 (LnkSta 8GT/s for both GTX-680 cards) by passing the request
directly to the kernel as a boot parameter.

In conclusion, the change from Sandy Bridge and 1066MHz RAM to Ivy Bridge
and 1866MHz RAM gave no NAMD 2.9 acceleration for a system of 150K atoms
and some 13% acceleration with a system of 500K atoms. One might wonder
whether this is worth the money.

francesco pietra
---------- Forwarded message ----------
From: "Francesco Pietra" <chiendarret_at_gmail.com>
Date: Nov 18, 2013 8:13 AM
Subject: Fwd: namd-l: PCIexpress 3.0 for MD with NAMD on GPUs
To: "Thomas Albers" <talbers_at_binghamton.edu>, "NAMD" <namd-l_at_ks.uiuc.edu>,
"Lennart Sorensen" <lsorense_at_csclub.uwaterloo.ca>
Cc:

It is getting hard, unless I have misunderstood what nvidia suggested. Thus,
I added the option suggested by nvidia to GRUB by

1) typing 'e' at grub prompt,
2) adding the option to the linux line,
3) Ctrl-x to boot
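
To make such a change permanent, the usual Debian route would presumably be
to append the option to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub and
run update-grub; the interactive edit above lasts for a single boot only. A
minimal sketch, assuming the stock Debian file:

# /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="quiet nvidia.NVreg_EnablePCIeGen3=1"

# then regenerate grub.cfg and reboot:
update-grub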

If that procedure is correct (and probably it is, since the option shows up in the boot string):

francesco_at_gig64:~$ cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-3.10-3-amd64 root=/dev/mapper/vg1-root ro 1.
nvidia.NVreg_EnablePCIeGen3=1 quiet
francesco_at_gig64:~$

no luck: both LnkCap and LnkSta were at 5GT/s, as for PCIe 2.0.
Molecular dynamics, accordingly, was not accelerated.

I wonder whether "1." preceding "nvidia..." is what is needed for a grub
bootloader option. I did not find any other instance about that nvidia
suggestion on internet.
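
If that "1." is just the forum's list numbering rather than part of the
parameter (my assumption; the devtalk post shows the suggestion as a numbered
item), the boot string for the same kernel and root device should presumably
read:

BOOT_IMAGE=/boot/vmlinuz-3.10-3-amd64 root=/dev/mapper/vg1-root ro nvidia.NVreg_EnablePCIeGen3=1 quiet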

Hope someone can think of something better.

francesco pietra

---------- Forwarded message ----------
From: Francesco Pietra <chiendarret_at_gmail.com>
Date: Sun, Nov 17, 2013 at 4:06 PM
Subject: Fwd: namd-l: PCIexpress 3.0 for MD with NAMD on GPUs
To: NAMD <namd-l_at_ks.uiuc.edu>

This addendum is to let you know that simply adding

1. options nvidia NVreg_EnablePCIeGen3=1

to /etc/modprobe.d/nvidia.conf

as suggested in

https://devtalk.nvidia.com/default/topic/545186/enabling-pcie-3-0-with-nvreg_enablepciegen3-on-titan/

had no effect. Also, please note that what should be added to the kernel
boot string, according to the same source, is

   1. nvidia.NVreg_EnablePCIeGen3=1

unlike what I wrote before (i.e., no "options", and a dot between "nvidia" and "NVreg").
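
For the record, a minimal sketch of the two variants side by side (the file
name and the initramfs step are my assumptions, not from the devtalk post):

# variant 1: module parameter, in /etc/modprobe.d/nvidia.conf
options nvidia NVreg_EnablePCIeGen3=1

# variant 2: kernel boot parameter, appended to the linux line in GRUB
nvidia.NVreg_EnablePCIeGen3=1

# if the nvidia module is loaded from the initramfs, variant 1 may also
# require rebuilding it:
update-initramfs -u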

francesco pietra

---------- Forwarded message ----------
From: Francesco Pietra <chiendarret_at_gmail.com>
Date: Sun, Nov 17, 2013 at 11:56 AM
Subject: Re: namd-l: PCIexpress 3.0 for MD with NAMD on GPUs
To: Thomas Albers <talbers_at_binghamton.edu>
Cc: Namd Mailing List <namd-l_at_ks.uiuc.edu>

Hello Thomas:
Thanks for sharing your benchmarks. It was very useful

With my Gigabyte X79-UD3, with two GTX-680 cards, I replaced the Sandy Bridge
i7-3930K with an Ivy Bridge i7-4930K, and also replaced the 1066MHz RAM with
1866MHz RAM:

# cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 62
model name : Intel(R) Core(TM) i7-4930K CPU @ 3.40GHz
stepping : 4
microcode : 0x416
cpu MHz : 1200.000
cache size : 12288 KB
physical id : 0
siblings : 12
core id : 0
cpu cores : 6
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx
pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl
xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor
ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic
popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb
xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep
erms
bogomips : 6800.08
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
power management:

(the same for processors 1-11)

# cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module 319.60 Wed Sep 25 14:28:26 PDT 2013
GCC version: gcc version 4.7.3 (Debian 4.7.3-8)

 **************************************

I observed no speed increase for namd2.9 MD for a light job (150K atoms)
and only a few percent speed increase with a large job (500K atoms). All MD
simulations were carried out from the linux prompt, without X-server,
activating the GPUs with:

# nvidia-smi -L      (list the GPUs)
# nvidia-smi -pm 1   (enable persistence mode)

***************************************

With all such MDs, both the link capability (LnkCap) and the actual link
speed (LnkSta) turned out to be 5GT/s, as for PCIe 2.0.
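
A quick way to check the negotiated speed while a run is actually in
progress (a sketch, assuming the bus addresses reported below) would be:

# while namd2 is crunching on the GPUs:
lspci -vvv -s 02:00.0 | grep -E 'LnkCap|LnkSta'
lspci -vvv -s 03:00.0 | grep -E 'LnkCap|LnkSta'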

I only observed a capability of 8GT/s when launching gnome:

# lspci -vvvv
02:00.0 VGA compatible controller: NVIDIA Corporation GK104 [GeForce GTX 680] (rev a1) (prog-if 00 [VGA controller])
    Subsystem: NVIDIA Corporation Device 0969
..............
        LnkCap: Port #0, Speed 8GT/s, Width x16, ASPM L0s L1, Latency L0
<512ns, L1 <4us

03:00.0 VGA compatible controller: NVIDIA Corporation GK104 [GeForce GTX 680] (rev a1) (prog-if 00 [VGA controller])
    Subsystem: Micro-Star International Co., Ltd. Device 2820
.........................
        LnkCap: Port #0, Speed 8GT/s, Width x16, ASPM L0s L1, Latency L0
<512ns, L1 <4us
*****************************************
(without X, during MD, the same query reported:)
........................
LnkCap: Port #0, Speed 5GT/s, Width x16, ASPM L0s L1, Latency L0 <512ns,
L1 <4us

03:00.0 VGA compatible controller: NVIDIA Corporation GK104 [GeForce GTX
680] (rev a1) (prog-if 00 [VGA controller])
    Subsystem: Micro-Star International Co., Ltd. Device 2820
...................
LnkCap: Port #0, Speed 5GT/s, Width x16, ASPM L0s L1, Latency L0 <512ns,
L1 <4us
*******************************************

As far as I could investigate, to activate PCIe 3.0 nvidia suggests to
either:

(1) modify /etc/modprobe.d/local.conf (which does not exist on my debian
amd64 jessie), or create a new /etc/modprobe.d/nvidia.conf, adding to it

1. options nvidia NVreg_EnablePCIeGen3=1

Actually, on my jessie, nvidia.conf reads

alias nvidia nvidia-current
remove nvidia-current rmmod nvidia
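
Presumably the options line from (1) would simply be appended below those
two. One wrinkle I am not sure about: since the alias shows that the module
is actually loaded as nvidia-current here, the options line may need to name
nvidia-current rather than nvidia, i.e. (a guess, not from the nvidia post):

options nvidia-current NVreg_EnablePCIeGen3=1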

Some people found that useless, even when both grub-efi and the initramfs
are edited accordingly, so nvidia offered a different approach: updating the
kernel boot string by appending

1. options nvidia NVreg_EnablePCIeGen3=1

Could you advise about this? In the Gigabyte motherboard BIOS itself, I set
"automatic", which read the speed of the CPU and RAM correctly. I found no
settings for PCIe, unless this requires a manual setting instead of
automatic. I have no experience with manipulating the kernel as suggested
above by nvidia.

Thanks a lot
francesco pietra

PS: I did not try nvidia tools to investigate the link speed, nor CPU-Z
(which is a 32bit binary requiring installation of i386 libraries); the
latter would have uninstalled the 64bit nvidia-smi.

On Sat, Nov 16, 2013 at 4:58 PM, Thomas Albers <talbers_at_binghamton.edu> wrote:

> Hello!
>
> > Which version of the nvidia driver is needed to activate PCIexpress 3.0
> > between the GPUs and RAM for MD with NAMD2.9 or NAMD2.10? As far as I can
> > remember, nvidia deactivated PCIe 3.0 for linux from version 295.xx until
> > at least 310.xx. Is that correct?
>
> I am using an i5-Ivy Bridge CPU and a GTX 660 GPU with Nvidia driver
> 304.43 and can confirm that PCI-e 3.0 works. (I have not done
> benchmarking to see if there is any speedup compared to PCI-e 2.0.)
>
> # cat /proc/cpuinfo
> processor : 0
> vendor_id : GenuineIntel
> cpu family : 6
> model : 58
> model name : Intel(R) Core(TM) i5-3570 CPU @ 3.40GHz
>
> # cat /proc/driver/nvidia/version
> NVRM version: NVIDIA UNIX x86_64 Kernel Module 304.43 Sun Aug 19
> 20:14:03 PDT 2012
> GCC version: gcc version 4.5.4 (Gentoo 4.5.4 p1.0, pie-0.4.7)
>
> # lspci -vvvv
> 01:00.0 VGA compatible controller: nVidia Corporation Device 11c0 (rev
> a1) (prog-if 00 [VGA controller])
> Subsystem: ZOTAC International (MCO) Ltd. Device 1281
> ....
> LnkSta: Speed 8GT/s, Width x16, TrErr- Train- SlotClk+
> DLActive- BWMgmt- ABWMgmt-
>
> What the driver does is fall back to PCI-e 2.0 when not under load, so
> one has to check while crunching numbers on the GPU. If the GPU is
> idle it reports a 5 GT/s transfer rate. I do not know if this
> behaviour is peculiar to Nvidia or part of the PCI-e standard.
>
> Hope that helps,
> Thomas
>
>
