Slow down consensus (increase timeouts) when vertex store is close to being full #859
base: feature/vertex-store-overflow-mitigations
// It should already be in the [0, 1] range, but we're nonetheless sanitizing the input
final var vertexStoreUtilizationRatioClamped =
    Math.max(0, Math.min(1, vertexStoreUtilizationRatio));
Guava has the clamp function, with a conveniently overcomplicated name: Doubles.constrainToRange(vertexStoreUtilizationRatio, 0, 1)
final var multiplier =
    Math.max(
        1, // Multiplier is 1 (i.e. no-op) if we're below the threshold
        lerp(
guess what... Guava has the lerp function, with a cool (if not too-fluent) style: LinearTransformation.mapping(0.66, 1.0).and(1.0, 10.0).transform(ratio);
guava has it all 😄
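For reference, the clamp-then-lerp logic discussed in this thread can be sketched in plain Java without Guava. This is only an illustration: the class and method names are hypothetical, not the PR's actual code, and the sketch assumes the threshold is strictly below 1.

```java
/** Hypothetical sketch of the vertex store multiplier; not the PR's actual implementation. */
final class VertexStoreMultiplierSketch {
  private VertexStoreMultiplierSketch() {}

  /** Clamps value into [min, max]; a plain-Java stand-in for a library clamp. */
  static double clamp(double value, double min, double max) {
    return Math.max(min, Math.min(max, value));
  }

  /** Linearly maps x from the range [x0, x1] onto [y0, y1]. Assumes x0 != x1. */
  static double lerp(double x, double x0, double x1, double y0, double y1) {
    return y0 + (x - x0) * (y1 - y0) / (x1 - x0);
  }

  /**
   * Returns 1 (a no-op) below the threshold, then grows linearly up to
   * maxMultiplier as the utilization ratio approaches 1. Assumes threshold < 1.
   */
  static double multiplier(double utilizationRatio, double threshold, double maxMultiplier) {
    final double clamped = clamp(utilizationRatio, 0, 1);
    return Math.max(1, lerp(clamped, threshold, 1, 1, maxMultiplier));
  }
}
```

With Guava on the classpath, the two helpers could presumably be replaced by the Doubles.constrainToRange and LinearTransformation calls quoted in the comments above.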
public @interface AdditionalRoundTimeIfProposalReceivedMs {}
public PacemakerTimeoutCalculatorConfig {
  Preconditions.checkArgument(
      baseTimeoutMs > 0, "timeoutMs must be > 0 but was " + baseTimeoutMs);
(nitpick) this supports formatting (%s).
…/slow-down-consensus-when-vertex-store-full
Quality Gate passed
Nice, I only have one non-critical remark 👍
new PacemakerTimeoutCalculatorConfig(baseTimeout, 2.0, 0, 0L, 0.6, 10));

// spotless:off
final Map<Double, Long> testCases =
(minor) There is a lightweight way to parameterize tests: @RunWith(JUnitParamsRunner.class) (do not confuse it with the lame @RunWith(Parameterized.class)!). See e.g. @Parameters(method = "unsolicitedSyncResponseExceptions")
Really nice, thanks
bind(PacemakerTimeoutCalculatorConfig.class)
    .toInstance(new PacemakerTimeoutCalculatorConfig(3000L, 1.2, 8, 30_000L, 0.66, 10));

// Delayed resolution is disabled for now.
Might it be worth explaining why in this comment? (i.e. because we cannot create QCs on fallback vertices, because we don't just sign the ledger header, but also the BFT header, which captures the previous certificate chain, and all the nodes have a different certificate chain for their fallback vertex)
(double) -1, baseTimeout,
0.5, baseTimeout,
0.6, baseTimeout,
0.61, 1225L,
It's slightly weird to me that half of these are in terms of baseTimeout and half of them are absolute numbers - can we standardize?
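One possible way to standardize, sketched below, is to express every expectation as a multiplier of the base timeout and derive the absolute value in one place. The class and method names are hypothetical, and the value 1.225 assumes a base timeout of 1000 ms, which is inferred from the 1225L entry rather than stated in the diff.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: test expectations expressed relative to the base timeout. */
final class TimeoutTestCasesSketch {
  private TimeoutTestCasesSketch() {}

  /** Assumption: the test's base timeout is 1000 ms (inferred from 0.61 -> 1225L). */
  static final long BASE_TIMEOUT_MS = 1000L;

  /** Maps vertex store utilization ratio to the expected timeout multiplier. */
  static Map<Double, Double> expectedMultipliers() {
    final Map<Double, Double> cases = new LinkedHashMap<>();
    cases.put(-1.0, 1.0);   // out-of-range input is clamped, so no slowdown
    cases.put(0.5, 1.0);    // below the 0.6 threshold
    cases.put(0.6, 1.0);    // exactly on the threshold
    cases.put(0.61, 1.225); // just above the threshold
    return cases;
  }

  /** Converts a multiplier into the absolute timeout a test would assert on. */
  static long expectedTimeoutMs(long baseTimeoutMs, double multiplier) {
    return Math.round(baseTimeoutMs * multiplier);
  }
}
```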
we start multiplying the timeout by a linearly increasing value, up to 10x.
So a maximum theoretical timeout is 130s. */
bind(PacemakerTimeoutCalculatorConfig.class)
    .toInstance(new PacemakerTimeoutCalculatorConfig(3000L, 1.2, 8, 30_000L, 0.66, 10));
I'm finding it a little hard to read this, because none of these values have names any more.
Perhaps we should have a PacemakerTimeoutCalculatorConfig::default() and a PacemakerTimeoutCalculatorConfig::testing()? Inside those methods, we can label the values with variable names and then pass the variables into the constructor.
For review purposes, these are:
long baseTimeoutMs: 3 000
double consecutiveTimeoutSlowdownRate: 1.2
int consecutiveTimeoutMaxExponent: 8 // 1.2^8 = 4.29981696
long additionalRoundTimeIfProposalReceivedMs: 30 000
double vertexStoreMultiplierThreshold: 0.66
double maxVertexStoreMultiplier: 10
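The named-factory suggestion could look roughly like the sketch below. The record name and factory method names are hypothetical; the field names and default values are the ones listed above, and the testing values mirror the constructor call in the test diff.

```java
/** Hypothetical sketch of named factory methods; not the PR's actual record. */
record TimeoutConfigSketch(
    long baseTimeoutMs,
    double consecutiveTimeoutSlowdownRate,
    int consecutiveTimeoutMaxExponent,
    long additionalRoundTimeIfProposalReceivedMs,
    double vertexStoreMultiplierThreshold,
    double maxVertexStoreMultiplier) {

  /** Production-like defaults, with every positional argument given a name. */
  static TimeoutConfigSketch defaults() {
    final long baseTimeoutMs = 3000L;
    final double consecutiveTimeoutSlowdownRate = 1.2;
    final int consecutiveTimeoutMaxExponent = 8;
    final long additionalRoundTimeIfProposalReceivedMs = 30_000L;
    final double vertexStoreMultiplierThreshold = 0.66;
    final double maxVertexStoreMultiplier = 10;
    return new TimeoutConfigSketch(
        baseTimeoutMs,
        consecutiveTimeoutSlowdownRate,
        consecutiveTimeoutMaxExponent,
        additionalRoundTimeIfProposalReceivedMs,
        vertexStoreMultiplierThreshold,
        maxVertexStoreMultiplier);
  }

  /** Test-oriented values mirroring the constructor call seen in the test diff. */
  static TimeoutConfigSketch testing(long baseTimeoutMs) {
    final double consecutiveTimeoutSlowdownRate = 2.0;
    final int consecutiveTimeoutMaxExponent = 0;
    final long additionalRoundTimeIfProposalReceivedMs = 0L;
    final double vertexStoreMultiplierThreshold = 0.6;
    final double maxVertexStoreMultiplier = 10;
    return new TimeoutConfigSketch(
        baseTimeoutMs,
        consecutiveTimeoutSlowdownRate,
        consecutiveTimeoutMaxExponent,
        additionalRoundTimeIfProposalReceivedMs,
        vertexStoreMultiplierThreshold,
        maxVertexStoreMultiplier);
  }
}
```

The factory is called defaults() here because default is a reserved word in Java and cannot be used as a method name.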
// We're only applying the multiplier if vertexStoreUtilizationRatio is
// on or above vertexStoreMultiplierThreshold: we're translating from
// range [vertexStoreMultiplierThreshold, 1] to [1, maxVertexStoreMultiplier].
I was partially expecting to see an exponential here (so roughly: linearly map [vertexStoreMultiplierThreshold, 1] to an exponent in [0, maxVertexStoreExponent], and then return Math.round(config.baseTimeoutMs() * timeoutExponential * vertexSizeExponential);).
But I think linear is fine too and does its job.
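For comparison, the exponential variant described in this comment might be sketched as follows. All names are hypothetical, and the base of the exponential (vertexStoreRate) is an assumption, since the comment does not say which base would be raised to the mapped exponent. The sketch also assumes the threshold is strictly below 1.

```java
/** Hypothetical sketch of the exponential alternative; not part of the PR. */
final class ExponentialSlowdownSketch {
  private ExponentialSlowdownSketch() {}

  /** Linearly maps [threshold, 1] to [0, maxExponent]; returns 0 at or below the threshold. */
  static double vertexStoreExponent(double ratio, double threshold, double maxExponent) {
    final double clamped = Math.max(0, Math.min(1, ratio));
    if (clamped <= threshold) {
      return 0;
    }
    return (clamped - threshold) / (1 - threshold) * maxExponent;
  }

  /**
   * Combines the consecutive-timeout factor with an exponential vertex store factor.
   * vertexStoreRate is an assumed parameter: the base raised to the mapped exponent.
   */
  static long timeoutMs(
      long baseTimeoutMs,
      double timeoutExponential,
      double vertexStoreRate,
      double ratio,
      double threshold,
      double maxExponent) {
    final double vertexSizeExponential =
        Math.pow(vertexStoreRate, vertexStoreExponent(ratio, threshold, maxExponent));
    return Math.round(baseTimeoutMs * timeoutExponential * vertexSizeExponential);
  }
}
```

Compared with the linear mapping, this grows slowly just above the threshold and steeply near a full vertex store.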
import javax.inject.Qualifier;
public record PacemakerTimeoutCalculatorConfig(
    long baseTimeoutMs,
    double rate,
IMO we should rename these to e.g. consecutiveTimeoutSlowdownRate, consecutiveTimeoutMaxExponent to make clear they just apply to that.
No description provided.