Differentiate between GCRA burst capacity and tolerance #251

jonasmalacofilho · 2024-12-08T02:41:14Z

In GCRA terms, the total capacity of the rate limiter is Τ + τ, where Τ is the nominal cell arrival interval and τ is (extra) tolerance allowed for bursts.

The current implementation has a subtle issue, caused by conflating capacity and tolerance, which results in one extra cell being allowed after arrival intervals equal or greater than "capacity + 1", as reported in #107 and #249. The problem was partially hidden in other cases by a compensating addition of Τ to initial TAT in the case of the first cell (i.e. when the rate limiter has no state recorded for it).

To fix the issue, correct the value tau used for making the decision, making it the tolerance and not the capacity, and remove the compensation for the first cell that's no longer necessary.

This PR also adds a new specific test for the "capacity + 1" interval scenario, as well as fixes an existing test that incorrectly expected one of these extra cells. Finally, some needless lifetimes reported by clippy are also removed since the project opts for #[deny(warnings)].

Fixes: #107
Fixes: #249

A rate limiter constructed with burst capacity 2 should never allow more than 2 cells in a short burst.

In GCRA terms, the total capacity of the rate limiter is `Τ + τ`, where `Τ` is the nominal cell arrival interval and `τ` is (extra) tolerance allowed for bursts. The current implementation has a subtle issue, caused by conflating capacity and tolerance, which results in one extra cell being allowed after arrival intervals equal or greater than "capacity + 1", as reported in boinkor-net#107 and boinkor-net#249. This problem was partially hidden by a compensating addition of Τ to initial TAT in the case of the first cell (i.e. when the rate limiter has no state recorded for it). To fix the issue, correct the value `tau` used for making the decision, making it the tolerance, and remove the adjustment for the first cell that's no longer necessary. Fixes: boinkor-net#107 Fixes: boinkor-net#249

jonasmalacofilho · 2024-12-08T17:05:40Z

The cargo deny failure is unrelated to these changes, but I can allow-list the "Unicode-3.0" license for unicode-ident, if acceptable to you.

Relevant except from the cargo deny run log:

error[rejected]: failed to satisfy license requirements
  ┌─ unicode-ident 1.0.14 (registry+https://github.com/rust-lang/crates.io-index):4:36
  │
4 │ license = "(MIT OR Apache-2.0) AND Unicode-3.0"
  │            ------------------------^^^^^^^^^^^
  │            │                       │
  │            │                       rejected: not explicitly allowed
  │            license expression retrieved via Cargo.toml `license`
  │
  = Unicode-3.0 - Unicode License v3:
  =   - OSI approved
  = unicode-ident v1.0.14
    ├── proc-macro2 v1.0.92
    │   ├── quote v1.0.37
    │   │   ├── serde_derive v1.0.215
    │   │   │   ├── criterion v0.5.1
    │   │   │   │   └── (dev) governor v0.7.0
    │   │   │   └── serde v1.0.215
    │   │   │       ├── ciborium v0.2.2
    │   │   │       │   └── criterion v0.5.1 (*)
    │   │   │       ├── criterion v0.5.1 (*)
    │   │   │       ├── serde_json v1.0.133
    │   │   │       │   ├── criterion v0.5.1 (*)
    │   │   │       │   └── tinytemplate v1.2.1
    │   │   │       │       └── criterion v0.5.1 (*)
    │   │   │       └── tinytemplate v1.2.1 (*)
    │   │   ├── syn v2.0.90
    │   │   │   ├── serde_derive v1.0.215 (*)
    │   │   │   ├── wasm-bindgen-backend v0.2.99
    │   │   │   │   └── wasm-bindgen-macro-support v0.2.99
    │   │   │   │       └── wasm-bindgen-macro v0.2.99
    │   │   │   │           └── wasm-bindgen v0.2.99
    │   │   │   │               ├── js-sys v0.3.[76](https://github.com/boinkor-net/governor/actions/runs/12218164850/job/34093914103#step:5:77)
    │   │   │   │               │   └── web-sys v0.3.76
    │   │   │   │               │       ├── plotters v0.3.7
    │   │   │   │               │       │   └── criterion v0.5.1 (*)
    │   │   │   │               │       └── quanta v0.12.3
    │   │   │   │               │           └── governor v0.7.0 (*)
    │   │   │   │               ├── plotters v0.3.7 (*)
    │   │   │   │               └── web-sys v0.3.76 (*)
    │   │   │   ├── wasm-bindgen-macro-support v0.2.99 (*)
    │   │   │   └── zerocopy-derive v0.7.35
    │   │   │       └── zerocopy v0.7.35
    │   │   │           └── ppv-lite[86](https://github.com/boinkor-net/governor/actions/runs/12218164850/job/34093914103#step:5:87) v0.2.20
    │   │   │               └── rand_chacha v0.3.1
    │   │   │                   ├── proptest v1.5.0
    │   │   │                   │   └── (dev) governor v0.7.0 (*)
    │   │   │                   └── rand v0.8.5
    │   │   │                       ├── governor v0.7.0 (*)
    │   │   │                       └── proptest v1.5.0 (*)
    │   │   ├── wasm-bindgen-backend v0.2.99 (*)
    │   │   ├── wasm-bindgen-macro v0.2.99 (*)
    │   │   ├── wasm-bindgen-macro-support v0.2.99 (*)
    │   │   └── zerocopy-derive v0.7.35 (*)
    │   ├── serde_derive v1.0.215 (*)
    │   ├── syn v2.0.[90](https://github.com/boinkor-net/governor/actions/runs/12218164850/job/34093914103#step:5:91) (*)
    │   ├── wasm-bindgen-backend v0.2.99 (*)
    │   ├── wasm-bindgen-macro-support v0.2.99 (*)
    │   └── zerocopy-derive v0.7.35 (*)
    └── syn v2.0.90 (*)

antifuchs · 2024-12-08T17:09:43Z

Thanks for the contribution - your explanation and the change seems really good on first glance (haven't read through all of it yet, but soon)!

I'm OK with allowlisting the unicode license - it's a bit annoying that ~every project needs to add that specific one combination of licenses to its list, but oh well (:

antifuchs

This looks great. I have one open/question that I'm sure you thought of already (and that my morning brain just can't reason through fully yet), but otherwise I'm super excited to include this and fix all those bugs (:

antifuchs · 2024-12-09T14:08:25Z

governor/src/gcra.rs

-        .into();
-        let t: Nanos = quota.replenish_1_per.into();
+        let t: Nanos = cmp::max(quota.replenish_1_per, Duration::from_nanos(1)).into();
+        let tau: Nanos = t * (quota.max_burst.get() - 1).into();


Hm, hm, hm. I think this might cause tau to become 0 if you pick a burst capacity of 1... I'm slightly convinced that won't cause problems in the tat calculation below, but not fully certain. a tau=0 case is fine, right?

Yes, it will cause tau to be zero, intentionally. If the burst capacity is one, we want to reject any cells that arrive before the TAT. Only with capacities greater than one that we want to allow for non-zero tolerance tau.

The TAT is (ignoring the max operation) the time the last cell arrived + the nominal interval t. So if tau was greater than zero for a burst capacity of one, it would cause us to allow a cell to be accepted before the nominal interval has passed since the last cell, effectively accepting a burst of two and violating the burst capacity.

Oh, I forgot to answer your actual question: yes, tau=0 should be fine in the decision calculations.

What I'm less sure about is if I adjusted all places where tau is used outside of the gcra module... in fact, let me take another look today and see if I didn't miss anything the tests also didn't catch.

I'm glad I did another pass, as I found an issue in gcra::test_n_all_and_update. Fixed it and adjusted the relevant test case to cover that case too.

The second GCRA parameter is the tolerance `tau`. The capacity of the bucket is `t + tau`. The `tau` parameter was accidentally renamed to `capacity` in 273f77b as left over changes from a previous experiment. Fixes: 273f77b ("Differentiate between GCRA burst capacity and tolerance")

Since 273f77b, both relate to the cells *in addition* to the first cell. Fixes: 273f77b ("Differentiate between GCRA burst capacity and tolerance")

antifuchs

That looks really good. I'm extra glad that this makes the additional_weight handling easier!

jonasmalacofilho · 2024-12-10T13:25:16Z

Thanks!

antifuchs · 2024-12-10T13:26:17Z

Thank you! I'll cut a release containing this soon & will credit you in the changelog (:

jonasmalacofilho added 4 commits December 7, 2024 20:54

Add failing test for boinkor-net#107 and boinkor-net#249

6a1bd13

Fix quota tracking test that allowed one more cell than capacity

e875cbf

A rate limiter constructed with burst capacity 2 should never allow more than 2 cells in a short burst.

clippy: remove some needless lifetimes

9075015

deny: allow-list Unicode-3.0 license used by unicode-ident

237a4d4

antifuchs approved these changes Dec 9, 2024

View reviewed changes

jonasmalacofilho force-pushed the dont-allow-extra-cell-after-longer-interval branch from fe0a3dc to a527973 Compare December 9, 2024 18:49

jonasmalacofilho added 2 commits December 9, 2024 15:53

Remove extra cell when comparing additional weight and tau

d4685a3

Since 273f77b, both relate to the cells *in addition* to the first cell. Fixes: 273f77b ("Differentiate between GCRA burst capacity and tolerance")

jonasmalacofilho force-pushed the dont-allow-extra-cell-after-longer-interval branch from a527973 to d4685a3 Compare December 9, 2024 18:56

antifuchs approved these changes Dec 10, 2024

View reviewed changes

antifuchs added this pull request to the merge queue Dec 10, 2024

Merged via the queue into boinkor-net:master with commit a028daf Dec 10, 2024
16 checks passed

antifuchs added a commit that referenced this pull request Dec 10, 2024

Changelog entry for #251

af2aab6

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Differentiate between GCRA burst capacity and tolerance #251

Differentiate between GCRA burst capacity and tolerance #251

jonasmalacofilho commented Dec 8, 2024

jonasmalacofilho commented Dec 8, 2024

antifuchs commented Dec 8, 2024

antifuchs left a comment

antifuchs Dec 9, 2024

jonasmalacofilho Dec 9, 2024 •

edited

Loading

jonasmalacofilho Dec 9, 2024 •

edited

Loading

jonasmalacofilho Dec 9, 2024

antifuchs left a comment

jonasmalacofilho commented Dec 10, 2024

antifuchs commented Dec 10, 2024

Differentiate between GCRA burst capacity and tolerance #251

Differentiate between GCRA burst capacity and tolerance #251

Conversation

jonasmalacofilho commented Dec 8, 2024

jonasmalacofilho commented Dec 8, 2024

antifuchs commented Dec 8, 2024

antifuchs left a comment

Choose a reason for hiding this comment

antifuchs Dec 9, 2024

Choose a reason for hiding this comment

jonasmalacofilho Dec 9, 2024 • edited Loading

Choose a reason for hiding this comment

jonasmalacofilho Dec 9, 2024 • edited Loading

Choose a reason for hiding this comment

jonasmalacofilho Dec 9, 2024

Choose a reason for hiding this comment

antifuchs left a comment

Choose a reason for hiding this comment

jonasmalacofilho commented Dec 10, 2024

antifuchs commented Dec 10, 2024

jonasmalacofilho Dec 9, 2024 •

edited

Loading

jonasmalacofilho Dec 9, 2024 •

edited

Loading