-
Notifications
You must be signed in to change notification settings - Fork 34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
integrate gpu-driver-util into the driver images #180
base: main
Are you sure you want to change the base?
Conversation
f0f0038
to
6761ca7
Compare
6761ca7
to
2865db2
Compare
Signed-off-by: Tariq Ibrahim <[email protected]>
2865db2
to
1ffa0da
Compare
cd driver/vgpu/src && \ | ||
go build -o vgpu-util && \ | ||
mv vgpu-util /work | ||
go build -C driver/vgpu/src -o vgpu-util && \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: wondering if we should create a common folder for these go utilities? Like cmds
, tools
, ... ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I think we could remove the nested vgpu/src
directory and instead just have vgpu
contain the go source files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I am open to doing this in a follow-up PR
# Download the nvidia-driver-assistant to get the latest supported-gpus.json file | ||
RUN apt-get update && \ | ||
apt-get install -y --no-install-recommends nvidia-driver-assistant && \ | ||
rm -rf /var/lib/apt/lists/* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should download nvidia-driver-assistant
in the build step, and only copy over the supported-gpus.json
file to the final image.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, this means that we would have to add the cuda repo during that stage as well. Since we already download the fabricmanager and nscq packages from this state, I thought adding the nvidia-driver-assistant would be a simple change. Moreover, the nvidia-driver-assistant is a small package.
$ apt show nvidia-driver-assistant
Package: nvidia-driver-assistant
Version: 0.9.57.01-1
Priority: optional
Section: admin
Source: driver-assistant
Maintainer: Alberto Milone <[email protected]>
Installed-Size: 979 kB
Pre-Depends: dpkg (>= 1.15.7.2)
Depends: python3:any (>= 3.2~), python3-apt
Download-Size: 40.5 kB
APT-Manual-Installed: yes
APT-Sources: https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64 Packages
Description: Detect and install the best NVIDIA driver packages for the system
This piece of software is meant to help users deciding on which NVIDIA graphics
driver to install, based on the detected system's hardware.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like 892kB/979kB is taken up by the supported-gpus.json
file.
$ ls -hal /usr/share/nvidia-driver-assistant/supported-gpus/supported-gpus*.json
-rw-r--r-- 1 root root 354K Oct 10 18:09 /usr/share/nvidia-driver-assistant/supported-gpus/supported-gpus.json
-rw-r--r-- 1 root root 538K Oct 10 18:09 /usr/share/nvidia-driver-assistant/supported-gpus/supported-gpus-mod.json
I don't think it's worth the added complexity to optimise for a couple hundreds of kilobytes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My concern is the dependency on python.
echo "cannot autodetect the kernel module type, please check /var/log/gpu-driver-util.log for more details..." | ||
return 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed on slack, should we cat the log file here before returning?
No description provided.