Assign image tensors to data_device immediately on creation.
#667
+3
−3
The tensors created from PIL images are first created on the CPU (gaussian-splatting/utils/general_utils.py, line 23 in d9fad7b). If `data_device` is "cuda", they are later moved to the GPU. Normally, unreferenced tensors on the CPU should be released, but PyTorch doesn't seem to do this. This results in high CPU RAM consumption for the entire training duration, even when `data_device` is "cuda".

Moving the tensors to `data_device` immediately on creation results in a dramatic decrease in CPU RAM consumption when `data_device` is "cuda". When training on a T4 instance on Colab with 200 images, CPU RAM consumption went from 10 GB down to 2 GB. GPU VRAM consumption doesn't increase, as the tensors are eventually moved to the GPU anyway.

It might help to move all tensors to `data_device` immediately on creation, since PyTorch doesn't seem to deallocate RAM for CPU tensors.
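
For illustration, here is a minimal sketch of the idea of moving an image tensor to the target device as soon as it is created. It is not the actual diff of this PR: the helper name `pil_to_tensor_on_device` and its `data_device` parameter are hypothetical, and the snippet falls back to the CPU when no GPU is available.

```python
import numpy as np
import torch
from PIL import Image


def pil_to_tensor_on_device(pil_image, resolution, data_device="cpu"):
    """Hypothetical helper: PIL image -> float tensor in CHW layout on data_device."""
    resized = pil_image.resize(resolution)  # resolution is (width, height)
    # Transfer to the target device right away, so the CPU tensor is only
    # a short-lived temporary rather than something held during training.
    tensor = torch.from_numpy(np.array(resized)).to(data_device) / 255.0
    if tensor.dim() == 3:
        return tensor.permute(2, 0, 1)  # HWC -> CHW
    return tensor.unsqueeze(0)  # grayscale HW -> 1HW


# Assumption for this sketch: use the GPU only if one is present.
device = "cuda" if torch.cuda.is_available() else "cpu"
image = Image.new("RGB", (64, 48))
tensor = pil_to_tensor_on_device(image, (32, 24), device)
print(tensor.shape, tensor.device)
```

Whether this reproduces the RAM savings reported above depends on the rest of the loading pipeline, so it is worth measuring on your own dataset.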