RE: CUDA simulation memory usage

From: Jesper Sørensen (lists_at_jsx.dk)
Date: Thu May 19 2011 - 03:11:41 CDT

Hi Jim,

Thanks for the reply. You are right, at the beginning NAMD writes:

Info: 128.391 MB of memory in use based on /proc/self/stat

Then at the end:

WallClock: 650.799744 CPUTime: 650.799744 Memory: 41191.664062 MB

I am running CUDA version 3.2.16 and NAMD 2.7. For information I am running with Intel 11.1 compilers and OpenMPI 1.4.1.
I suspected it was a "useless" memory number, but I figured I'd ask why anyway :-)

I'll upgrade to NAMD 2.8 once beta-testing is completed :-)

Thanks again,
Jesper

-----Original Message-----
From: Jim Phillips [mailto:jim_at_ks.uiuc.edu]
Sent: 19. maj 2011 00:38
To: Jesper Sørensen
Cc: 'namd-l'
Subject: Re: namd-l: CUDA simulation memory usage

At the beginning of the run there is probably a line like this:

Info: 59.8477 MB of memory in use based on /proc/self/stat

Would you by chance be using CUDA 4.0 drivers? We have a machine with 256GB of memory that shows 272g VIRT in top for any process using CUDA, but machines with older drivers don't have this issue.

What I think is happening is that /proc/self/stat's vsize field is including the entire virtual address space allocated by CUDA, which is pretty useless for most purposes but does at least match what top reports in its VIRT column.
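
To make the distinction concrete, here is a minimal Python sketch (not NAMD's memusage.C, whose details are not shown in this thread) that reads both vsize and rss from /proc/self/stat. The function name parse_proc_stat and the 4096-byte default page size are assumptions made for illustration; the field positions (23 for vsize in bytes, 24 for rss in pages) follow the Linux proc(5) man page.

```python
import os


def parse_proc_stat(stat_line, page_size=4096):
    """Return (vsize_bytes, rss_bytes) from one /proc/<pid>/stat line.

    Hypothetical helper for illustration; page_size is an assumed default.
    """
    # The comm field (field 2) is wrapped in parentheses and may itself
    # contain spaces, so split only the text after the last ')'.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field 23 (vsize) is rest[20]
    # and field 24 (rss, counted in pages) is rest[21].
    vsize_bytes = int(rest[20])
    rss_pages = int(rest[21])
    return vsize_bytes, rss_pages * page_size


if __name__ == "__main__":
    # Linux only: report this process's own virtual and resident sizes.
    with open("/proc/self/stat") as f:
        vsize, rss = parse_proc_stat(f.read(), os.sysconf("SC_PAGE_SIZE"))
    print(f"vsize: {vsize / 2**20:.1f} MB, rss: {rss / 2**20:.1f} MB")
```

On a process that has initialized CUDA with the newer drivers, one would expect the vsize figure to be far larger than rss, since vsize counts reserved address space rather than memory actually resident.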

Since this looks like an unavoidable property of the new CUDA drivers, I've modified memusage.C to avoid reporting vsize in CUDA builds.

-Jim

On Wed, 18 May 2011, Jesper Sørensen wrote:

> Hi,
>
> I have just been benchmarking our new cluster with GPUs and the memory usage that NAMD prints out at the end of the simulation run is MUCH larger with GPUs than without them.
>
> With CUDA:
> WallClock: 1246.348877 CPUTime: 1246.348877 Memory: 41306.324219 MB
>
> Without CUDA
> WallClock: 4025.868896 CPUTime: 4025.868896 Memory: 350.937500 MB
>
> I am assuming that the Memory number with CUDA is wrong, mostly because I know that we don't have that much memory in these new machines. Is it taking memory on the graphics card into account, or what is going on?
>
> I've looked through the mailing list, but I haven't been able to find anything on this issue...
>
> Best regards,
>
> Jesper Sørensen
> Dept. of Chemistry
> Aarhus University, Denmark
>
>

This archive was generated by hypermail 2.1.6 : Wed Feb 29 2012 - 15:57:09 CST