nvidia-docker on Ubuntu 17.04

August 10, 2017

After installing docker-ce, the NVIDIA drivers, CUDA and cuDNN on your Ubuntu 17.04 laptop, it’s time to install nvidia-docker, which lets you access CUDA from inside a Docker container. That comes in very handy.


Unfortunately, the nvidia-docker release 1.0.1 DEB package was built for Ubuntu 14.04, and thus depends on packages such as sysv-rc and file-rc, which do not exist on 17.04.

Guess what, you have to rebuild the DEB.

The NVIDIA team has provided their excuses in issue NVIDIA/nvidia-docker#412, so maybe by the time you’re reading this, release 2.0 has washed their sins away.

If you do not yet live in such an advanced world, @stevefox has kindly provided a solution for us, which simply changes 14.04 to 17.04 (s/14.04/17.04/) in Dockerfile.deb. Clone that repository, switch to the nvidia-docker-u17.04 branch, run sudo make deb, and then run sudo dpkg -i dist/nvidia-docker_1.0.1-1_amd64.deb.
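The steps above can be sketched as a small shell function that only prints the commands, so they can be reviewed before anything is run. This is a sketch, not @stevefox's or NVIDIA's script: the function name rebuild_commands and the checkout directory name are made up here, and the repository URL is not given in this post, so it has to be passed in as an argument. The branch name and the DEB path are taken from the text.

```shell
# Print the rebuild commands for review; nothing is executed here.
# $1: URL of the patched repository (NOT given in the post, supply your own).
rebuild_commands() {
    repo_url="$1"
    branch="nvidia-docker-u17.04"                 # branch name from the post
    deb="dist/nvidia-docker_1.0.1-1_amd64.deb"    # package path from the post

    # Clone straight onto the 17.04 branch into ./nvidia-docker (assumed dir name)
    printf 'git clone --branch %s %s nvidia-docker\n' "$branch" "$repo_url"
    printf 'cd nvidia-docker\n'
    # Build the DEB, then install it
    printf 'sudo make deb\n'
    printf 'sudo dpkg -i %s\n' "$deb"
}
```

Calling rebuild_commands with the repository URL prints four lines that can then be run one by one.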

Unfortunately, unless you’ve also installed nvidia-modprobe with sudo apt install nvidia-modprobe, the nvidia-docker service will fail to start, as shown below:

$ sudo dpkg -i dist/nvidia-docker_1.0.1-1_amd64.deb 
Selecting previously unselected package nvidia-docker.
(Reading database ... 209723 files and directories currently installed.)
Preparing to unpack .../nvidia-docker_1.0.1-1_amd64.deb ...
Unpacking nvidia-docker (1.0.1-1) ...
Setting up nvidia-docker (1.0.1-1) ...
Configuring user
Setting up permissions
Created symlink /etc/systemd/system/multi-user.target.wants/nvidia-docker.service → /lib/systemd/system/nvidia-docker.service.
Job for nvidia-docker.service failed because the control process exited with error code.
See "systemctl status nvidia-docker.service" and "journalctl -xe" for details.
nvidia-docker.service couldn't start.
Processing triggers for ureadahead (0.100.0-19) ...

$ systemctl status nvidia-docker.service
● nvidia-docker.service - NVIDIA Docker plugin
   Loaded: loaded (/lib/systemd/system/nvidia-docker.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Fri 2017-08-11 22:06:03 EEST; 11s ago
     Docs: https://github.com/NVIDIA/nvidia-docker/wiki
  Process: 10751 ExecStopPost=/bin/rm -f $SPEC_FILE (code=exited, status=0/SUCCESS)
  Process: 10746 ExecStartPost=/bin/sh -c /bin/echo unix://$SOCK_DIR/nvidia-docker.sock > $SPEC_FILE (code=exited, status=0/SUCCESS)
  Process: 10731 ExecStartPost=/bin/sh -c /bin/mkdir -p $( dirname $SPEC_FILE ) (code=exited, status=0/SUCCESS)
  Process: 10730 ExecStart=/usr/bin/nvidia-docker-plugin -s $SOCK_DIR (code=exited, status=1/FAILURE)
 Main PID: 10730 (code=exited, status=1/FAILURE)
      CPU: 12ms

elo 11 22:06:01 gumibook systemd[1]: nvidia-docker.service: Unit entered failed state.
elo 11 22:06:01 gumibook systemd[1]: nvidia-docker.service: Failed with result 'exit-code'.
elo 11 22:06:03 gumibook systemd[1]: nvidia-docker.service: Service hold-off time over, scheduling restart.
elo 11 22:06:03 gumibook systemd[1]: Stopped NVIDIA Docker plugin.
elo 11 22:06:03 gumibook systemd[1]: nvidia-docker.service: Start request repeated too quickly.
elo 11 22:06:03 gumibook systemd[1]: Failed to start NVIDIA Docker plugin.
elo 11 22:06:03 gumibook systemd[1]: nvidia-docker.service: Unit entered failed state.
elo 11 22:06:03 gumibook systemd[1]: nvidia-docker.service: Failed with result 'exit-code'.
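The status output does not name the missing piece, so a quick check for the helper binary can save some guessing. The function below is a minimal sketch: the name require_tool is made up here, and it assumes the nvidia-modprobe package puts a binary of the same name on the PATH.

```shell
# Report whether a command is available on PATH.
# Returns 0 if found, 1 if missing (with an install hint).
require_tool() {
    tool="$1"
    if command -v "$tool" >/dev/null 2>&1; then
        echo "$tool found"
    else
        # Assumption: the apt package has the same name as the binary
        echo "$tool missing: try 'sudo apt install $tool'"
        return 1
    fi
}

# Example (not run here):
#   require_tool nvidia-modprobe
```

On a machine in the state shown above, require_tool nvidia-modprobe would report the tool as missing.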

So install and restart:

$ sudo apt install nvidia-modprobe
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following NEW packages will be installed:
  nvidia-modprobe
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 16,5 kB of archives.
After this operation, 53,2 kB of additional disk space will be used.
Get:1 http://fi.archive.ubuntu.com/ubuntu zesty/multiverse amd64 nvidia-modprobe amd64 375.26-1 [16,5 kB]
Fetched 16,5 kB in 0s (147 kB/s)          
Selecting previously unselected package nvidia-modprobe.
(Reading database ... 209731 files and directories currently installed.)
Preparing to unpack .../nvidia-modprobe_375.26-1_amd64.deb ...
Unpacking nvidia-modprobe (375.26-1) ...
Setting up nvidia-modprobe (375.26-1) ...
Processing triggers for man-db (2.7.6.1-2) ...

$ sudo systemctl restart nvidia-docker

$ sudo systemctl status nvidia-docker
● nvidia-docker.service - NVIDIA Docker plugin
   Loaded: loaded (/lib/systemd/system/nvidia-docker.service; enabled; vendor preset: enabled)
   Active: active (running) since Fri 2017-08-11 22:10:07 EEST; 3s ago
     Docs: https://github.com/NVIDIA/nvidia-docker/wiki
  Process: 11544 ExecStopPost=/bin/rm -f $SPEC_FILE (code=exited, status=0/SUCCESS)
  Process: 12491 ExecStartPost=/bin/sh -c /bin/echo unix://$SOCK_DIR/nvidia-docker.sock > $SPEC_FILE (code=exited, status=0/SUCCESS)
  Process: 12474 ExecStartPost=/bin/sh -c /bin/mkdir -p $( dirname $SPEC_FILE ) (code=exited, status=0/SUCCESS)
 Main PID: 12473 (nvidia-docker-p)
    Tasks: 9 (limit: 4915)
   Memory: 7.9M
      CPU: 144ms
   CGroup: /system.slice/nvidia-docker.service
           └─12473 /usr/bin/nvidia-docker-plugin -s /var/lib/nvidia-docker
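
For scripting the check above instead of reading the status by eye, the state can be pulled out of the Active: line. This is a rough sketch: the function name active_state is made up here, and it assumes the status layout shown in the transcripts above.

```shell
# Read `systemctl status` output on stdin and print the word after "Active:"
# (for example "active" or "failed"). Exits non-zero if no such line is found.
active_state() {
    awk '$1 == "Active:" { print $2; found = 1; exit }
         END { exit !found }'
}

# Example (not run here):
#   systemctl status nvidia-docker.service | active_state
```

Once the state is active, the upstream README for the 1.0 series suggests nvidia-docker run --rm nvidia/cuda nvidia-smi as a smoke test, assuming the nvidia/cuda image can be pulled.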