Dynamically calculate PVC request sizes or remove, or document if it can be ignored #298
Comments
Hello! 👋 Before I noticed that this issue had been created, I tried my hand at solving the problem, and I'd like to get your thoughts on it. My fix is currently at the proof-of-concept level: it works as intended, but I still need to add error handling, documentation, and unit tests. Also, I have only added support for Hugging Face models so far; I haven't looked into Ollama models yet. I did the following:
What I haven't done yet, but thought might be useful:
Any input or idea you might have would be greatly appreciated! 😁
@radu-catrangiu Thanks for taking the time to detail all of this. A few things that will make this more complicated:
Overall feedback: In KubeAI we strive to keep the system as simple as possible. KubeAI could be broken out into multiple services as it stands today, but we have decided to keep it as one monolith to avoid exposing microservice operational complexity to the admins who need to run it. My preference would be to use library calls over service calls. Since this will take a good amount of code, I think a simple first step might be to expose a volume size parameter when configuring a cache profile. See: kubeai/charts/kubeai/values-gke.yaml Line 45 in 499a6ed
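A minimal sketch of what that first step could look like, assuming a new optional size field on the cache profile. The `CacheProfile` struct, its `VolumeSize` and `StorageClassName` fields, and the `pvcRequestSize` helper are hypothetical names used for illustration only; they are not taken from the KubeAI codebase. The only value from this issue is the current static default of 10Gi.

```go
package main

import "fmt"

// defaultCacheVolumeSize mirrors the static 10Gi request described in this issue.
const defaultCacheVolumeSize = "10Gi"

// CacheProfile is a simplified, hypothetical stand-in for a cache profile.
// VolumeSize is an assumed new option, not an existing KubeAI setting.
type CacheProfile struct {
	StorageClassName string // illustrative field name
	VolumeSize       string // e.g. "200Gi"; empty means "use the default"
}

// pvcRequestSize returns the size configured on the profile,
// falling back to the current static default when none is set.
func pvcRequestSize(p CacheProfile) string {
	if p.VolumeSize == "" {
		return defaultCacheVolumeSize
	}
	return p.VolumeSize
}

func main() {
	unset := CacheProfile{StorageClassName: "example-rwx"}
	sized := CacheProfile{StorageClassName: "example-rwx", VolumeSize: "200Gi"}
	fmt.Println(pvcRequestSize(unset))
	fmt.Println(pvcRequestSize(sized))
}
```

Keeping the fallback means existing profiles would behave as they do today, so the new parameter would be purely opt-in.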
When using KubeAI caching, PVCs are created with a static storage request size of 10Gi. We have received feedback from multiple users who were surprised by this fixed size, and surprised that it is not actually enforced when they use the cache.
kubeai/internal/modelcontroller/cache.go
Line 290 in 9d4fb2c
When it comes to Filestore / NFS / EFS, I believe that this size is generally not an issue. In some cases, I think this is because the minimum size of a dynamically provisioned Filestore instance is much larger than 10Gi. In other cases the requested size might simply be disregarded. More investigation is needed across all implementations.
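For the dynamic-calculation idea in the issue title, the arithmetic could be sketched roughly as follows. This assumes the model's total size in bytes is already known (how to obtain it is out of scope here), and the `requestSizeGiB` helper, the headroom fraction, and the use of 10 as a floor are all assumptions made for illustration, not KubeAI behavior.

```go
package main

import (
	"fmt"
	"math"
)

// gib is the number of bytes in one GiB.
const gib = 1 << 30

// requestSizeGiB returns a whole number of GiB to request for a model of
// modelBytes, adding a fractional headroom (0.25 = 25% extra) and never
// going below minGiB. A non-positive modelBytes yields minGiB.
func requestSizeGiB(modelBytes int64, headroom float64, minGiB int64) int64 {
	if modelBytes <= 0 {
		return minGiB
	}
	needed := float64(modelBytes) * (1 + headroom)
	size := int64(math.Ceil(needed / gib)) // round up to a whole GiB
	if size < minGiB {
		return minGiB
	}
	return size
}

func main() {
	// Hypothetical 16 GiB model with 25% headroom and a 10 GiB floor.
	size := requestSizeGiB(16*gib, 0.25, 10)
	fmt.Printf("storage request: %dGi\n", size)
}
```

Rounding up and keeping a floor avoids requesting less than the current static default for small or unknown-size models.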
Possible options:
Links to user feedback: