feat: support pause and resume reconciliation of a cluster #7435
merge from upstream
…ter_pause_and_resume merge from main
}
}

if hasPaused {
hasPaused is calculated from the component objects, so the dependencies from the components to those CM objects should be set explicitly.
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7435      +/-   ##
==========================================
- Coverage   64.92%   61.38%   -3.54%
==========================================
  Files         345      437      +92
  Lines       42942    52098    +9156
==========================================
+ Hits        27879    31982    +4103
- Misses      12619    17472    +4853
- Partials     2444     2644     +200
ready for review
@@ -131,6 +131,8 @@ func (r *ClusterReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ct
		&clusterHaltTransformer{},
		// handle cluster deletion
		&clusterDeletionTransformer{},
		// handle cluster pause and resume
		&clusterPauseTransformer{},
Should the pause and resume operations be executed before all transformers? According to your design, can a cluster that is being deleted be paused?
In my design, deletion has a higher priority than pause. This follows the rollout pause design of the Kubernetes Deployment.
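To make the ordering question concrete, here is a minimal sketch of a chain in which the deletion stage runs before the pause stage, so a cluster that is both paused and being deleted is still deleted. The `cluster` and `transformer` types, `buildChain`, and `runChain` are hypothetical stand-ins for illustration only; they are not the transformer interfaces used in this PR.

```go
package main

import "fmt"

// cluster is a hypothetical, stripped-down stand-in for the Cluster object.
type cluster struct {
	deleting bool // true when the object is being deleted
	paused   bool // true when the pause annotation is present
}

// transformer is a hypothetical stage in the reconcile chain.
// run returns true to stop the chain after this stage.
type transformer struct {
	name string
	run  func(c cluster) bool
}

// buildChain places the deletion stage ahead of the pause stage,
// mirroring the order shown in the diff above.
func buildChain() []transformer {
	return []transformer{
		{"deletion", func(c cluster) bool { return c.deleting }},
		{"pause", func(c cluster) bool { return c.paused }},
		{"reconcile", func(c cluster) bool { return false }},
	}
}

// runChain returns the names of the stages that ran, in order.
func runChain(c cluster) []string {
	var ran []string
	for _, t := range buildChain() {
		ran = append(ran, t.name)
		if t.run(c) {
			break
		}
	}
	return ran
}

func main() {
	fmt.Println(runChain(cluster{deleting: true, paused: true}))
	fmt.Println(runChain(cluster{paused: true}))
	fmt.Println(runChain(cluster{}))
}
```

With this ordering, the pause stage is never reached for a cluster that is being deleted, which is the priority described in the reply above.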
support cluster pause and resume
Fixes [Features] Support Pause and Resume Reconcilation of a Cluster #6969
Pause Cluster, Component, and InstanceSet
Annotate the cluster to pause it. The pause cascades to Components (by annotating them) and to InstanceSets (by reusing the Paused field). Once paused, the three controllers handle only delete operations.
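The cascade and the "delete only" rule can be sketched as follows. The annotation key is the one used in the kubectl commands of this PR; the `object` type and the `isPaused`, `shouldReconcile`, and `cascadePause` helpers are hypothetical, and treating only the value "true" as paused is an assumption. The InstanceSet Paused field is not modelled here.

```go
package main

import "fmt"

// pausedAnnotationKey is the annotation key from the kubectl commands in this PR.
const pausedAnnotationKey = "controller.kubeblocks.io/controller-paused"

// object is a hypothetical minimal view of a Cluster or Component.
type object struct {
	annotations map[string]string
	deleting    bool // true when the object is being deleted
}

// isPaused assumes that only the value "true" means paused.
func isPaused(o object) bool {
	return o.annotations[pausedAnnotationKey] == "true"
}

// shouldReconcile lets deletion through even when the object is paused.
func shouldReconcile(o object) bool {
	if o.deleting {
		return true
	}
	return !isPaused(o)
}

// cascadePause copies the pause state of the cluster onto its components:
// it sets the annotation when the cluster is paused and removes it otherwise.
func cascadePause(cluster object, components []object) {
	for i := range components {
		if components[i].annotations == nil {
			components[i].annotations = map[string]string{}
		}
		if isPaused(cluster) {
			components[i].annotations[pausedAnnotationKey] = "true"
		} else {
			delete(components[i].annotations, pausedAnnotationKey)
		}
	}
}

func main() {
	cl := object{annotations: map[string]string{pausedAnnotationKey: "true"}}
	comps := []object{{}, {deleting: true}}
	cascadePause(cl, comps)
	for _, c := range comps {
		fmt.Println(isPaused(c), shouldReconcile(c))
	}
}
```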
Pause Reconfigure and Configuration
Asynchronous reload methods (configconstraint.spec.reloadAction.*Trigger.sync = false) are not paused currently.
For synchronous reconfiguration operations, both sending ops and modifying the config are carried out by the configuration operator rendering the configmap, and the configmap changes are then applied to the engine via Reconfigure_controller.go. The Configuration and Reconfigure controllers therefore need to be paused.
When the cluster resumes, modifying the corresponding configmap and configuration annotations will trigger a round of configuration tuning, and changes made during the pause will be applied.
Pause Backup
A backup is meant to record the true state of the cluster, so that after restoring from it the cluster serves exactly as before. The Spec of a paused cluster differs from its real status, and there is basically no way to fetch that information, so KubeBlocks does not support backing up a paused cluster.
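One way to express this restriction is a guard that rejects a backup request when the target cluster carries the pause annotation. This is only a sketch: `validateBackupTarget` and `errClusterPaused` are hypothetical names, and where such a check actually lives in KubeBlocks is not stated in this PR description.

```go
package main

import (
	"errors"
	"fmt"
)

// pausedAnnotationKey is the annotation key from the kubectl commands in this PR.
const pausedAnnotationKey = "controller.kubeblocks.io/controller-paused"

// errClusterPaused is a hypothetical error for backups of paused clusters.
var errClusterPaused = errors.New("cluster is paused: backup is not supported")

// validateBackupTarget rejects a cluster whose annotations mark it as paused.
// It assumes that only the value "true" means paused.
func validateBackupTarget(clusterAnnotations map[string]string) error {
	if clusterAnnotations[pausedAnnotationKey] == "true" {
		return errClusterPaused
	}
	return nil
}

func main() {
	paused := map[string]string{pausedAnnotationKey: "true"}
	fmt.Println(validateBackupTarget(paused))
	fmt.Println(validateBackupTarget(nil))
}
```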
pause a cluster:
kubectl annotate cluster CLUSTER_NAME controller.kubeblocks.io/controller-paused="true"
resume a cluster:
kubectl annotate cluster CLUSTER_NAME controller.kubeblocks.io/controller-paused-