BnB a disk space hog #1129
I have a draft PR, #1103, as a consideration to help slim this down, but I need to conduct testing and validate it. There's also some discussion on this here: #1032 (comment). As of the latest v0.43.0 release, we dropped the shipped binaries to include only CUDA 11.7 - 12.3, but there's more work to be done.
Yeah, we're working on slimming this down, but there's a clear trade-off between ease of installation and disk space. Two main factors add to the volume: CUDA version support, and the binaries being "fat binaries", i.e. each binary for each CUDA version is much "fatter" because it includes the symbols for all compute capabilities. Neither the CUDA version nor the compute capability is something that pip detects (please correct me if I'm wrong), so we can't package different wheels while keeping installation simple.

With Conda, at least detecting the CUDA installation seems possible, but this is quite the rabbit hole and tricky. We might look into that later.

Anyways, when compiling from source you can pass CLI args to CMake and specify just the CUDA version and compute capability that you need for your installation and GPU model. This will give you a very reasonably sized binary. Another factor is that higher optimization settings when compiling give larger binaries, partly due to inlining; as a trade-off, we already chose only the second-highest optimization setting.

#1103, which @matthewdouglas mentioned, tries to simplify things by making sure we only need to compile for each major CUDA version, which could potentially slim things down to just two binaries. This still needs thorough review and testing, though. Hopefully it's ready for the next release or the one thereafter.

Anyone reading this, please let us know if you have any info that's not already mentioned here that could help us improve the status quo.
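A single-target source build along those lines might look like the following sketch. `CMAKE_CUDA_ARCHITECTURES` is a standard CMake variable (e.g. `86` for RTX 30xx GPUs); the `COMPUTE_BACKEND` option is an assumption about the project's CMake configuration and may differ from the actual flag names.

```shell
# Sketch: configure a source build for one CUDA toolkit and one GPU arch,
# run from a checkout of the repo with the matching CUDA toolkit installed.
# COMPUTE_BACKEND=cuda is assumed from the project's CMakeLists; check the
# build docs for the real option names.
cmake -B build -DCOMPUTE_BACKEND=cuda -DCMAKE_CUDA_ARCHITECTURES=86 .
cmake --build build --config Release
pip install .
```

Because only one compute capability is embedded, the resulting shared library avoids the fat-binary overhead described above.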
Hmm, I wonder if this really needs to remain an open issue, or if we could move this discussion to #1032 or to a discussion in the GitHub Discussions dev corner (I can transform the issue into that). Wdyt?
As a temporary measure, is it safe for a user to manually delete all non-relevant versions from the installed package folder?
@poedator Yes, that should be safe to do. As @Titus-von-Koeller mentions, each compute capability that is included adds weight as we're shipping fat binaries compiled for >=Maxwell. Each of these seems to add ~2-3MB to the overall size. Here's what was shipped in v0.43.0:
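To illustrate the manual cleanup @poedator asks about, here's a small sketch that partitions the shipped binaries by CUDA version. The `libbitsandbytes_cuda<ver>.so` naming pattern is an assumption based on the per-CUDA-version binaries described above; verify the actual file names in your install before deleting anything.

```python
import re

def removable_binaries(filenames, keep_cuda):
    """Split shipped binary names into (keep, drop) lists.

    Binaries built for a CUDA version other than `keep_cuda` are candidates
    for deletion; everything else (e.g. a CPU binary, or the matching CUDA
    build) is kept. Naming pattern is assumed, not taken from the project.
    """
    keep, drop = [], []
    for name in filenames:
        m = re.match(r"libbitsandbytes_cuda(\d+)", name)
        if m and m.group(1) != keep_cuda:
            drop.append(name)
        else:
            keep.append(name)
    return keep, drop

# Hypothetical directory listing; keep only the CUDA 12.1 build.
shipped = [
    "libbitsandbytes_cpu.so",
    "libbitsandbytes_cuda117.so",
    "libbitsandbytes_cuda118.so",
    "libbitsandbytes_cuda121.so",
    "libbitsandbytes_cuda123.so",
]
keep, drop = removable_binaries(shipped, keep_cuda="121")
print(drop)  # the non-matching CUDA builds, i.e. the deletion candidates
```

Running this against the real package directory (and `rm`-ing the `drop` list) is what "deleting the non-relevant versions" amounts to.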
In #1032 (comment) I had proposed that we drop CUDA < 11.7 and try to align better with PyTorch's binary distributions. Shipping CUDA 11.7, 11.8, and 12.1 aligns us with the versions PyTorch distributes binaries for.
I'm looking at #1126 to make sure we try to load the libraries that come with PyTorch first, before falling back to searching for CUDA libraries elsewhere. The point, again, is that we want installation to be much easier, with broad compatibility across platforms and hardware, so there's a balancing act. But if we get that right, it should mean we can drop down to just those CUDA versions shipped with PyTorch and require the others to be built from source. That potentially shaves half of the binaries away. Moving forward there are more options to explore as well.
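The load order described for #1126 can be sketched roughly as below. The specific paths probed are assumptions for illustration, not what bitsandbytes actually does; the point is only the priority: torch-bundled libraries first, then an explicit `CUDA_HOME`, then the system search path.

```python
import os
from pathlib import Path

def candidate_lib_dirs():
    """Yield directories to search for CUDA libraries, highest priority first.

    Order follows the idea in #1126: libraries bundled with PyTorch first,
    then an explicit CUDA_HOME/CUDA_PATH, then LD_LIBRARY_PATH entries.
    The exact locations are illustrative assumptions.
    """
    try:
        import torch  # PyTorch wheels bundle their CUDA libs under torch/lib
        yield Path(torch.__file__).parent / "lib"
    except ImportError:
        pass
    cuda_home = os.environ.get("CUDA_HOME") or os.environ.get("CUDA_PATH")
    if cuda_home:
        yield Path(cuda_home) / "lib64"
    for p in os.environ.get("LD_LIBRARY_PATH", "").split(os.pathsep):
        if p:
            yield Path(p)
```

If the torch-bundled libraries satisfy the lookup, nothing else is searched, which is what makes it viable to ship only the PyTorch-matching CUDA versions.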
Thanks @matthewdouglas for elaborating on all your current and upcoming work. Spelling out the details really helps our shared understanding and gets other knowledgeable people involved in fleshing out the tricky details! I'll engage with you on those topics soon, once I have some other, more urgent stuff out of the way.
@poedator on the basis of the above discussion, where it looks like folks are moving the codebase in the right direction and have clarified that manual deletion is fine, are you happy for this issue to be closed? (I'm just trying to nudge down the total live-issue count, on the basis that it will improve contributor focus and bandwidth.)
OK to close if this will get worked on in #1032
Yes, I'll keep your feedback in mind when addressing these topics in the coming weeks/months, and we'll try to come up with a solution that's more space-saving. Thanks everyone for your collaborative spirit. Really appreciated.
System Info
Somehow BnB likes to bring with it libs for all possible CUDA versions. That makes it the largest lib in my env after torch, with 300+ MB of disk use (in each env!). Is this really necessary? Is there a magic install parameter to avoid this?
Below are the largest files in bnb folder in my env:
Also compare it with the GPTQ libs:
Reproduction
Install bnb with pip, check disk use.
Expected behavior
taking much less space