Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[improve] [bookie] Enable reorder read sequence for bk client by default #21652

Conversation

hangc0276
Copy link
Contributor

Fixes #xyz

Main Issue: #xyz

PIP: #xyz

Motivation

If one ledger's ensemble is [bk0, bk1] and bk0 is down, the bookie client may send a read request to bk0 first then fail with the following errors, and resend the read request to bk1 in the end.

2023-10-19T18:33:52,042 - ERROR - [BookKeeperClientWorker-OrderedExecutor-3-0:PerChannelBookieClient@563] - Cannot connect to 192.168.31.216:3181 as endpoint resolution failed (probably bookie is down) err org.apache.bookkeeper.proto.BookieAddressResolver$BookieIdNotResolvedException: Cannot resolve bookieId 192.168.31.216:3181, bookie does not exist or it is not running
2023-10-19T18:33:52,042 - INFO  - [BookKeeperClientWorker-OrderedExecutor-3-0:DefaultBookieAddressResolver@77] - Cannot resolve 192.168.31.216:3181, bookie is unknown org.apache.bookkeeper.client.BKException$BKBookieHandleNotAvailableException: Bookie handle is not available
2023-10-19T18:33:52,042 - INFO  - [BookKeeperClientWorker-OrderedExecutor-3-0:PendingReadOp$LedgerEntryRequest@223] - Error: Bookie handle is not available while reading L6 E40 from bookie: 192.168.31.216:3181

One of the related issues is in the auto-recovery decommission and there is one PR in the BookKeeper repo: apache/bookkeeper#4113

However, the bookie client already knows the bk0 is down and we should send the read request to bk1 first. So we can reorder the read request based on the known bookie list. If one bookie is lost, it will reorder the lost bookie to the end of the read list.

Modifications

Enable the reorderReadSequence by default for bookie client in broker and auto-recovery.

Verifying this change

  • Make sure that the change passes the CI checks.

(Please pick either of the following options)

This change is a trivial rework / code cleanup without any test coverage.

(or)

This change is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(example:)

  • Added integration tests for end-to-end deployment with large payloads (10MB)
  • Extended integration test for recovery after broker failure

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • The metrics
  • Anything that affects deployment

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository:

@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Dec 1, 2023
@hangc0276 hangc0276 self-assigned this Dec 1, 2023
@hangc0276 hangc0276 added area/config category/performance Performance issues fix or improvements and removed doc-not-needed Your PR changes do not impact docs labels Dec 1, 2023
@hangc0276 hangc0276 added this to the 3.2.0 milestone Dec 1, 2023
@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Dec 1, 2023
@lhotari
Copy link
Member

lhotari commented Dec 1, 2023

What are the possible tradeoffs in using reorderReadSequence?

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link

@joeCarf joeCarf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@hangc0276
Copy link
Contributor Author

hangc0276 commented Dec 4, 2023

What are the possible tradeoffs in using reorderReadSequence?

@lhotari It will add more load in reordering the read ensemble due to it is a high-frequency call for each entry reading. But it will be more effective when one bookie is down.

@codecov-commenter
Copy link

codecov-commenter commented Dec 4, 2023

Codecov Report

Merging #21652 (097a707) into master (575a484) will increase coverage by 0.18%.
Report is 7 commits behind head on master.
The diff coverage is 100.00%.

Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff              @@
##             master   #21652      +/-   ##
============================================
+ Coverage     73.20%   73.39%   +0.18%     
+ Complexity    32744    32652      -92     
============================================
  Files          1893     1893              
  Lines        140730   140716      -14     
  Branches      15500    15499       -1     
============================================
+ Hits         103023   103279     +256     
+ Misses        29624    29328     -296     
- Partials       8083     8109      +26     
Flag Coverage Δ
inttests 24.15% <100.00%> (?)
systests 24.78% <100.00%> (+0.20%) ⬆️
unittests 72.66% <100.00%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
...org/apache/pulsar/broker/ServiceConfiguration.java 99.38% <100.00%> (ø)

... and 78 files with indirect coverage changes

@Technoboy- Technoboy- merged commit 0300019 into apache:master Dec 4, 2023
49 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/config category/performance Performance issues fix or improvements doc-not-needed Your PR changes do not impact docs ready-to-test
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants