[Issue]: High VRAM usage during Vae step #3416
Comments
i cannot reproduce. i've added some extra logging, please set env variable |
re-ran it with the env variable active.
i can see some difference with vs without config: 8.1gb vs 9.6gb. but no matter what i do, i cannot reproduce this.
I did a fresh installation and the issue persisted. I realized that the configs are cached in users/myusername/.cache/huggingface. I just deleted all of that, but are there any other shared locations where cached data could be hiding and contributing to my problem?
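For anyone checking the same thing, here is a minimal sketch of where the Hugging Face hub cache usually resolves to. It assumes the standard `huggingface_hub` environment variables (`HF_HUB_CACHE`, `HF_HOME`, `XDG_CACHE_HOME`) and the default `~/.cache/huggingface/hub` location. The helper names `resolve_hf_hub_cache` and `dir_size_bytes` are made up for illustration, and SD.Next may keep additional model or config folders of its own that this does not cover.

```python
import os
from pathlib import Path


def resolve_hf_hub_cache(env=None, home=None):
    """Best guess at the hub cache directory (hypothetical helper).

    Assumed precedence: HF_HUB_CACHE, then HF_HOME/hub, then
    XDG_CACHE_HOME/huggingface/hub, then ~/.cache/huggingface/hub.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    if env.get("HF_HUB_CACHE"):
        return Path(env["HF_HUB_CACHE"])
    if env.get("HF_HOME"):
        return Path(env["HF_HOME"]) / "hub"
    if env.get("XDG_CACHE_HOME"):
        return Path(env["XDG_CACHE_HOME"]) / "huggingface" / "hub"
    return home / ".cache" / "huggingface" / "hub"


def dir_size_bytes(path):
    """Total size of regular files under path; 0 if the path is missing.

    Symlinks are skipped so files linked from snapshot folders are not
    counted twice.
    """
    path = Path(path)
    if not path.is_dir():
        return 0
    return sum(
        p.stat().st_size
        for p in path.rglob("*")
        if p.is_file() and not p.is_symlink()
    )


if __name__ == "__main__":
    cache = resolve_hf_hub_cache()
    size_mib = dir_size_bytes(cache) / 2**20
    print(f"hub cache: {cache} ({size_mib:.1f} MiB)")
```

Running it before and after deleting the folder is a quick way to confirm that the cache being cleared is the one actually in use.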
downloaded config is in |
(had to delete my prior comment, formatting got jumbled) My vram usage spikes above 10 gb per task manager and the webui under the preview image (labeled as GPU active). Vram usage is a bit inconsistent overall, there's probably some GC tweaking that I need to do. My hunch is that vae tiling isn't being applied, but that's based only on the pattern I see. Vram usage is identical with it on or off when using the cached configuration. Let me know if there's anything else I can try.
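On the GC tweaking point: the startup log in this issue shows the allocator string `garbage_collection_threshold:0.80,max_split_size_mb:512`, which is the `key:value,key:value` format PyTorch reads from the `PYTORCH_CUDA_ALLOC_CONF` environment variable. Below is a rough sketch of parsing and rebuilding such a string. The helper names and the `0.60` threshold are hypothetical, and this thread does not confirm whether SD.Next would honor or override a value set this way, so treat it as an experiment rather than a fix.

```python
import os


def parse_alloc_conf(conf):
    """Parse 'key:value,key:value' into a dict of strings (hypothetical helper)."""
    options = {}
    for item in conf.split(","):
        item = item.strip()
        if not item:
            continue
        key, _, value = item.partition(":")
        options[key.strip()] = value.strip()
    return options


def format_alloc_conf(options):
    """Inverse of parse_alloc_conf: join the options back into one string."""
    return ",".join(f"{key}:{value}" for key, value in options.items())


if __name__ == "__main__":
    # String copied from the startup log in this issue.
    logged = "garbage_collection_threshold:0.80,max_split_size_mb:512"
    options = parse_alloc_conf(logged)

    # Hypothetical value for illustration only, not a recommendation.
    options["garbage_collection_threshold"] = "0.60"

    # Must be set before the process initializes CUDA to have any effect.
    os.environ["PYTORCH_CUDA_ALLOC_CONF"] = format_alloc_conf(options)
    print(os.environ["PYTORCH_CUDA_ALLOC_CONF"])
```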
ah, i may have found it. seems like the vae was not typecast to fp16 if a config was specified, so even if upcast is disabled, it's pointless since it's loaded as fp32. update and try to reproduce. if the issue persists, update here and i'll reopen.
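To put a rough number on why the dtype matters: an fp32 tensor takes 4 bytes per element and an fp16 tensor takes 2, so every activation the VAE produces doubles in size when the model is loaded as fp32. The sketch below is back-of-the-envelope arithmetic only. The 128-channel, full-resolution feature map is an assumed shape chosen for illustration, not a measurement from this issue, and `tensor_mib` is a made-up helper.

```python
BYTES_PER_ELEMENT = {"fp32": 4, "fp16": 2}


def tensor_mib(channels, height, width, dtype, batch=1):
    """Size in MiB of one dense activation tensor (hypothetical helper)."""
    elements = batch * channels * height * width
    return elements * BYTES_PER_ELEMENT[dtype] / 2**20


if __name__ == "__main__":
    # Assumed shape: a single 128-channel feature map at the 1024x1024
    # output resolution used in this issue. Illustrative, not measured.
    for dtype in ("fp32", "fp16"):
        print(dtype, tensor_mib(128, 1024, 1024, dtype), "MiB")
```

A decode pass holds several such tensors at once, so the gap between the two dtypes can plausibly add up to gigabytes, which would be in line with the spikes reported here.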
cached config OFF.log
Issue still persists. I've attached screenshots of the webui generation info and of task manager during each run. The cached config uses significantly more VRAM and starts spilling into shared memory. I used a fresh instance of sdnext dev without extensions. I ran 2 generations and attached the logs with --debug and the sd_vae_debug=true env variable.
i've reopened if someone wants to take a shot at it.
my system spiked twice and crashed, just saying he's not the only one.
general statements without logs or any info on platform or settings are not helpful.
#3471
that item is not related at all.
there is an issue with how your system is handling diffusers.
maybe there is. create an issue and document it. do not post random comments on completely unrelated issues.
Issue Description
VRAM usage during the VAE step is inconsistent and spikes above 12 GB for an SDXL model. This is atypical for my setup, where an SDXL model stays at 10 GB or less during the VAE step with all of my settings applied:
1024x1024, 10 steps, DPM++ 2M, SDXL timestep presets, CFG = 3, no attention guidance, no LoRAs applied.
Disabling "use cached model config when available" removes the issue, and generations then take 8-10 seconds.
VRAM usage in the console does not reflect the usage seen in task manager or in the webui; attached is a screenshot of the VRAM usage during a run.
sdnext (1).log
Version Platform Description
13:30:50-670748 INFO Logger: file="C:\Users\zaxof\OneDrive\Documents\GitHub\nvidia_sdnext\sdnext.log" level=DEBUG
size=65 mode=create
13:30:50-672246 INFO Python version=3.10.6 platform=Windows
bin="C:\Users\zaxof\OneDrive\Documents\GitHub\nvidia_sdnext\venv\Scripts\python.exe"
venv="C:\Users\zaxof\OneDrive\Documents\GitHub\nvidia_sdnext\venv"
13:30:50-859782 INFO Version: app=sd.next updated=2024-09-10 hash=91bdd3b3 branch=dev
url=https://github.com/vladmandic/automatic.git/tree/dev ui=dev
13:30:51-186334 INFO Updating main repository
13:30:52-008006 INFO Upgraded to version: 91bdd3b Tue Sep 10 19:20:49 2024 +0300
13:30:52-015505 INFO Platform: arch=AMD64 cpu=AMD64 Family 25 Model 33 Stepping 2, AuthenticAMD system=Windows
release=Windows-10-10.0.22631-SP0 python=3.10.6
13:30:52-017006 DEBUG Setting environment tuning
13:30:52-018506 INFO HF cache folder: C:\Users\zaxof\.cache\huggingface\hub
13:30:52-019506 DEBUG Torch allocator: "garbage_collection_threshold:0.80,max_split_size_mb:512"
13:30:52-026016 DEBUG Torch overrides: cuda=False rocm=False ipex=False diml=False openvino=False
13:30:52-027513 DEBUG Torch allowed: cuda=True rocm=True ipex=True diml=True openvino=True
13:30:52-037517 INFO nVidia CUDA toolkit detected: nvidia-smi present
Extensions : Extensions all: ['a1111-sd-webui-tagcomplete', 'adetailer', 'OneButtonPrompt',
'sd-civitai-browser-plus_fix', 'sd-webui-infinite-image-browsing', 'sd-webui-inpaint-anything',
'sd-webui-prompt-all-in-one']
Windows 11, RTX 3060 12gb, 5700x3d, 64gb ddr4, dev branch SDNEXT, firefox browser on desktop, chrome on android for remote access.
Relevant log output
No response
Backend
Diffusers
UI
Standard
Branch
Dev
Model
StableDiffusion XL
Acknowledgements