can't run multiple jobs with multiple gpus

From: Gordon Wells (gordon.wells_at_gmail.com)
Date: Wed Apr 03 2013 - 15:42:44 CDT

I'm getting the following error when trying to run NAMD on an Ubuntu
machine with multiple GPUs:

------------ Processor 2 Exiting: Called CmiAbort ------------
Reason: FATAL ERROR: Pe 2 unable to bind to CUDA device 1 on fx8150 because
only 1 devices are present

Charm++ fatal error:
FATAL ERROR: Pe 2 unable to bind to CUDA device 1 on fx8150 because only 1
devices are present

The machines have two cards each and can run the first job fine, but
namd/charmrun fails to see the second card. This only started recently, but
as far as I know nothing on the machines has changed to cause it. I can
see both devices in /dev/nvidia*, and both are listed by nvidia-smi.
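
For reference, the check described above can be sketched roughly as below. This is only an illustration, not anything from NAMD or charmrun: the function names device_nodes and visible_devices are made up, and it assumes that the CUDA_VISIBLE_DEVICES environment variable (which restricts the devices a CUDA process can enumerate) is worth comparing against the device nodes. It does not query the CUDA runtime itself.

```python
import glob
import os
import re


def device_nodes(pattern="/dev/nvidia*"):
    """Return sorted GPU indices taken from nodes such as /dev/nvidia0."""
    indices = []
    for path in glob.glob(pattern):
        match = re.fullmatch(r"nvidia(\d+)", os.path.basename(path))
        if match:  # skips non-GPU nodes such as /dev/nvidiactl
            indices.append(int(match.group(1)))
    return sorted(indices)


def visible_devices(env=None):
    """Return the entries of CUDA_VISIBLE_DEVICES, or None if it is unset."""
    env = os.environ if env is None else env
    value = env.get("CUDA_VISIBLE_DEVICES")
    if value is None:
        return None
    return [item.strip() for item in value.split(",") if item.strip()]


if __name__ == "__main__":
    # Run this in the same shell that launches charmrun/namd2.
    print("device nodes:", device_nodes())
    print("CUDA_VISIBLE_DEVICES:", visible_devices())
```

If the second line prints a list shorter than the first, the job's environment is hiding a card from CUDA even though the node and nvidia-smi still show it. If it prints None, the variable is not the cause and the problem lies elsewhere.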

What could I be missing?

-- max(∫(εὐδαιμονία)dt)

Gordon Wells
Chemistry Department
Emory University
Atlanta, Georgia, USA
