add leader election retry #66
@@ -10,6 +10,7 @@ import (
 	"k8s.io/client-go/kubernetes"
 	"k8s.io/client-go/tools/leaderelection"
 	"k8s.io/client-go/tools/leaderelection/resourcelock"
+	"k8s.io/client-go/util/flowcontrol"
 )

 func NewElection(clientset kubernetes.Interface, id, namespace string) *Election {

@@ -63,5 +64,27 @@ type Election struct {
 }

 func (le *Election) Start(ctx context.Context) {
-	leaderelection.RunOrDie(ctx, le.config)
+	backoff := flowcontrol.NewBackOff(1*time.Second, 15*time.Second)
+	const backoffID = "lingo-leader-election"
+	retryCount := 0
+	for {
+		select {
+		case <-ctx.Done():
+			return
+		default:
+			if retryCount > 0 {
+				backoff.Next(backoffID, backoff.Clock.Now())
+				delay := backoff.Get(backoffID)
+				log.Printf("Leader election failed, retrying in %v. RetryCount: %v", delay, retryCount+1)
+				select {
+				case <-time.After(delay):
+				case <-ctx.Done():
+					return
+				}
+			}
+			log.Printf("Starting leader election process. RetryCount: %v", retryCount+1)
+			leaderelection.RunOrDie(ctx, le.config)
+			retryCount++
+		}
+	}
 }

Review thread on the added leaderelection.RunOrDie(ctx, le.config) line:

Is the idea that RunOrDie eventually exits if it loses connection to the API Server?

That's exactly what seems to end up happening. This Kong PR has more details: Kong/kubernetes-ingress-controller#578

I double confirmed this by checking the logs and seeing how often it had to retry (re-run RunOrDie) when the apiserver is down for ~2 minutes

samos123 marked this conversation as resolved.
Neat, I didn't know about this library.