Run nvidia-smi after installing the nv-drivers #264

Merged
merged 1 commit into from
Jan 22, 2025

Conversation

ArangoGutierrez
Collaborator

No description provided.

@@ -35,6 +35,8 @@ sudo dpkg -i cuda-keyring_1.1-1_all.deb

with_retry 3 10s sudo apt-get update
install_packages_with_retry cuda-drivers

with_retry 3 10s nvidia-smi -L
Member

Why wrap this in `with_retry`?

Collaborator Author

You are right, it is not needed. Edited.

Member

If there's a chance that the previous step exits before the driver is ready, a retry may make sense. What does the driver container do?
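
For context, a retry wrapper of the kind called in this script could look roughly like the sketch below. This is not the repository's actual `with_retry` helper, whose definition is not shown in this diff; the argument order (attempts, delay, command) is inferred from the call sites above, and the name `with_retry_sketch` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; not the repository's real with_retry helper.
# Usage: with_retry_sketch <max_attempts> <delay> <command> [args...]
with_retry_sketch() {
    local max_attempts="$1"
    local delay="$2"
    shift 2
    local attempt=1
    while true; do
        # Run the command; return immediately on success.
        if "$@"; then
            return 0
        fi
        # Give up once the attempt budget is exhausted.
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "command failed after ${attempt} attempts: $*" >&2
            return 1
        fi
        echo "attempt ${attempt}/${max_attempts} failed; retrying in ${delay}" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
    done
}
```

With a helper shaped like this, retrying `nvidia-smi -L` only helps if the command can fail transiently after the install step has already returned successfully.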

Collaborator Author

If a step fails, the script fails fast, so this new line won't run.
The driver container uses these steps to load the modules: https://github.com/NVIDIA/gpu-driver-container/blob/main/ubuntu24.04/nvidia-driver#L222

Maybe that is something worth replicating here.
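
As an illustration only (this does not reproduce the linked driver-container script), a readiness check before calling `nvidia-smi` could confirm that the `nvidia` kernel module is listed in `/proc/modules`. The function name `module_loaded` and its optional second argument (a path to a modules file, which makes the check testable) are assumptions introduced for this sketch.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: succeed only if a kernel module appears in the
# modules listing (defaults to /proc/modules on Linux).
# Usage: module_loaded <module_name> [modules_file]
module_loaded() {
    local name="$1"
    local modules_file="${2:-/proc/modules}"
    # Each line of /proc/modules starts with the module name followed by a
    # space, so anchoring on "name " avoids matching e.g. nvidia_uvm.
    grep -q "^${name} " "$modules_file"
}

# Example (assumes a Linux host with the driver installed):
# if module_loaded nvidia; then nvidia-smi -L; fi
```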

Member

OK. Let's keep this as is and update if required later.

Signed-off-by: Carlos Eduardo Arango Gutierrez <[email protected]>
ArangoGutierrez merged commit b4801da into NVIDIA:main on Jan 22, 2025
5 checks passed
2 participants