malloc memory error using CUDA devices

From: Blake Mertz (blake.mertz_at_mail.wvu.edu)
Date: Wed Oct 31 2012 - 07:25:11 CDT

Hello,

I've been trying to troubleshoot this problem with no progress so
far, and was hoping someone had run into it before. I'm using a
pre-compiled NAMD 2.9 x86_64 multicore CUDA build on a Debian 6.0
machine, with Debian's NVIDIA drivers (304.48) and CUDA libraries.
I've verified that these drivers work with NAMD on another, similar
build, so I know the drivers are not the issue.

When I run the apoa1 benchmark and select my GPU card with +devices,
I get an error. The command is:

namd2 +idlepoll +p3 +devices 1 apoa1.namd

NAMD starts up, but after the startup phase I get this:

Pe 2 has 72 local and 72 remote patches and 1944 local and 1944 remote computes.
FATAL ERROR: CUDA error malloc everything on Pe 2 (NEI-GPU device 1): out of memory
------------- Processor 2 Exiting: Called CmiAbort ------------
Reason: FATAL ERROR: CUDA error malloc everything on Pe 2 (NEI-GPU device 1): out of memory

And here is my output from nvidia-smi:

Wed Oct 31 08:24:29 2012
+------------------------------------------------------+
| NVIDIA-SMI 4.304.48   Driver Version: 304.48         |
|-------------------------------+----------------------+----------------------+
| GPU  Name                     | Bus-Id        Disp.  | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap| Memory-Usage         | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  nForce 980a/780a SLI     | 0000:02:00.0     N/A |                  N/A |
| N/A   63C  N/A     N/A /  N/A |  35%   43MB /  125MB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 470          | 0000:05:00.0     N/A |                  N/A |
| 40%   32C  N/A     N/A /  N/A |   0%    4MB / 1279MB |     N/A      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Compute processes:                                               GPU Memory |
|  GPU       PID  Process name                                     Usage      |
|=============================================================================|
|    0            Not Supported                                               |
|    1            Not Supported                                               |
+-----------------------------------------------------------------------------+

So I know I shouldn't be running out of memory on the GTX 470, since
only 4MB of its 1279MB (about 1.2GB) is in use. I get the same error
when I use a GTX 680 instead of the 470. If anyone can help with this
issue, it would be greatly appreciated. Thanks.
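
For reference, the free memory implied by the nvidia-smi figures can be
worked out with a short Python sketch. The memory cells are copied from
the nvidia-smi output in this message; the helper names (free_mb,
summarize) are only illustrative and are not part of NAMD or nvidia-smi.
This reflects only what nvidia-smi reports: the index nvidia-smi gives a
card is not guaranteed to be the index the CUDA runtime uses, so it may
be worth confirming which card +devices 1 actually selects.

```python
import re

# Memory-usage cells copied from the nvidia-smi output, keyed by the
# GPU index that nvidia-smi printed (an assumption: CUDA may order differently).
SMI_MEMORY_CELLS = {
    0: "35% 43MB / 125MB",   # nForce 980a/780a SLI
    1: "0% 4MB / 1279MB",    # GeForce GTX 470
}


def free_mb(cell):
    """Parse a 'used MB / total MB' cell and return (used, total, free) in MB."""
    match = re.search(r"(\d+)MB\s*/\s*(\d+)MB", cell)
    if match is None:
        raise ValueError(f"unrecognized memory cell: {cell!r}")
    used, total = int(match.group(1)), int(match.group(2))
    return used, total, total - used


def summarize(cells):
    """Map each GPU index to its (used, total, free) tuple."""
    return {gpu: free_mb(cell) for gpu, cell in cells.items()}


if __name__ == "__main__":
    for gpu, (used, total, free) in summarize(SMI_MEMORY_CELLS).items():
        print(f"GPU {gpu}: {used} MB used of {total} MB, {free} MB free")
```

By these numbers the GTX 470 has 1275 MB free, while the onboard nForce
chip has only 82 MB free.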

Blake

--
Assistant Professor
C. Eugene Bennett Department of Chemistry
(304) 293-9166
"Life is not easy for any of us. But what of that? We must have
perseverance and above all confidence in ourselves. We must believe
that we are gifted for something and that this thing must be
attained." Marie Curie
"Start by doing what's necessary; then do what's possible; and
suddenly you are doing the impossible." St. Francis of Assisi
