Hi Team,
Now that we have the scheduling framework that supports multiple profiles, why can't we also add support for non-Spark pods to utilize the scheduler extender's reservation objects, so that pods from all workloads/profiles use reservations to maintain a universal view of cluster resources and move toward multi-tenancy?
In short, all the different scheduler profiles would share the extender and make use of the resource reservation feature via the ResourceReservation CRD objects. Users could then run the same scheduler binary to get maximum utilization of their cluster resources.
We could either fork the repository to add this feature, or implement it as a pluggable configuration.
Hey @Gouthamkreddy1234, unfortunately I don't have the resources to support non-Spark workloads.
That being said, I'm happy to accept any contributions.
Currently the extender relies on some labels to provide gang scheduling support for Spark applications:
- App ID label, hardcoded here. The value of this label has to be the same for all pods of the same application.
- Role label. The driver has to have Driver as the value, and executors Executor.
Apart from label values, there are some invariants for Spark applications that might not hold true for other workloads:
- The driver pod is submitted first, and executor pods are created only after it is scheduled.
- Executor pods are recreated when they die.
- For scheduling decisions to be correct, all applications must follow the resource requests they declared via annotations on the driver pod (num-executors, CPU and memory requests).
Hope this helps clarify the contract between the extender and Spark pods. If a distributed application follows these rules, I believe it should mostly work.
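For illustration, here is a minimal sketch (in Go, using the Kubernetes API types) of a driver pod that follows this contract. The label and annotation keys, role values, and scheduler name below are placeholders made up for the example; the real keys are the ones hardcoded in the extender, so check the source before relying on them.

```go
package sketch

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// driverPod builds a driver pod following the contract described above:
// an application-wide app ID label, a role label marking it as the driver,
// and annotations declaring the executor count and per-executor resources
// so the extender can reserve capacity for executors before they exist.
// NOTE: the label/annotation keys below are illustrative placeholders,
// not the keys actually hardcoded in the extender.
func driverPod() *corev1.Pod {
	return &corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{
			Name: "my-app-driver",
			Labels: map[string]string{
				"spark-app-id": "my-app", // must be identical on every pod of the application
				"spark-role":   "Driver", // executors would carry "Executor" instead
			},
			Annotations: map[string]string{
				"executor-count":  "4",
				"executor-cpu":    "1",
				"executor-memory": "4Gi",
			},
		},
		Spec: corev1.PodSpec{
			SchedulerName: "spark-scheduler", // placeholder profile/scheduler name
			Containers: []corev1.Container{{
				Name:  "driver",
				Image: "my-spark-image",
			}},
		},
	}
}
```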
Yes, I am keeping these points in mind while I work on this feature. I will continue working on this contribution and keep you posted on updates.
Regarding the implementation (as of now), for simple individual non-Spark pods, I am making sure they create reservations for themselves by mocking them to behave as drivers: keeping the app-id and role labels while setting the executor count and executor resources to 0. This way, the pod can create the reservations it needs, so that pods (both Spark and non-Spark) scheduled in the future take these resource reservations into account.
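A rough sketch of that workaround, using the same placeholder keys as the earlier example: label a standalone pod as a one-pod "application" whose driver is the pod itself and whose executor count and resources are zero, so the extender reserves capacity only for this pod.

```go
package sketch

import corev1 "k8s.io/api/core/v1"

// standalonePodAsDriver mutates a non-Spark pod so the extender treats it as
// the "driver" of a zero-executor application and creates a reservation only
// for the pod itself. The label/annotation keys are the same illustrative
// placeholders used in the earlier sketch, not the extender's real keys.
func standalonePodAsDriver(p *corev1.Pod) {
	if p.Labels == nil {
		p.Labels = map[string]string{}
	}
	if p.Annotations == nil {
		p.Annotations = map[string]string{}
	}
	p.Labels["spark-app-id"] = p.Name // each standalone pod is its own application
	p.Labels["spark-role"] = "Driver"
	p.Annotations["executor-count"] = "0" // no executors: reserve only this pod's resources
	p.Annotations["executor-cpu"] = "0"
	p.Annotations["executor-memory"] = "0"
}
```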
As a next step, I plan to gradually remove these dependencies for non-Spark pods (the annotations in particular can be replaced by something like spec.containers[i].resources.requests) and make the necessary changes to Predicate, CreateReservations, the binPackerFunc, and all other dependent functions to handle this use case, roughly along the lines of the sketch below.
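As one possible shape for that replacement (an assumption, not the actual implementation): instead of reading executor annotations, the reservation for a standalone non-Spark pod could be sized by summing spec.containers[i].resources.requests over the pod's containers.

```go
package sketch

import corev1 "k8s.io/api/core/v1"

// podRequests sums the declared resource requests of all containers in a pod.
// This is one way a reservation for a standalone non-Spark pod could be sized
// without relying on driver-style annotations.
func podRequests(p *corev1.Pod) corev1.ResourceList {
	total := corev1.ResourceList{}
	for _, c := range p.Spec.Containers {
		for name, qty := range c.Resources.Requests {
			sum := total[name]
			sum.Add(qty) // resource.Quantity addition
			total[name] = sum
		}
	}
	return total
}
```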