-
Notifications
You must be signed in to change notification settings - Fork 907
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable reorder read sequence for bk client by default #4139
Enable reorder read sequence for bk client by default #4139
Conversation
@@ -1057,6 +1057,9 @@ zkEnableSecurity=false | |||
# Enable/disable having read operations for a ledger to be sticky to a single bookie. | |||
stickyReadSEnabled=true | |||
|
|||
# Enable/disable reordering read sequence on reading entries. | |||
reorderReadSequenceEnabled=true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we change the default value in the clientconfiguration.java? I think it should be applied to all the clients, right? It will avoid reading a lost bookie as much as possible
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. We can change the default value in the next major release.
ac4777c
to
0668143
Compare
rerun failure checks |
1 similar comment
rerun failure checks |
### Motivation <!-- Explain here the context, and why you're making that change. What is the problem you're trying to solve. --> If one ledger's ensemble is [bk0, bk1] and bk0 is down, the bookie client may send a read request to bk0 first then fail with the following errors, and resend the read request to bk1 in the end. ``` 2023-10-19T18:33:52,042 - ERROR - [BookKeeperClientWorker-OrderedExecutor-3-0:PerChannelBookieClient@563] - Cannot connect to 192.168.31.216:3181 as endpoint resolution failed (probably bookie is down) err org.apache.bookkeeper.proto.BookieAddressResolver$BookieIdNotResolvedException: Cannot resolve bookieId 192.168.31.216:3181, bookie does not exist or it is not running 2023-10-19T18:33:52,042 - INFO - [BookKeeperClientWorker-OrderedExecutor-3-0:DefaultBookieAddressResolver@77] - Cannot resolve 192.168.31.216:3181, bookie is unknown org.apache.bookkeeper.client.BKException$BKBookieHandleNotAvailableException: Bookie handle is not available 2023-10-19T18:33:52,042 - INFO - [BookKeeperClientWorker-OrderedExecutor-3-0:PendingReadOp$LedgerEntryRequest@223] - Error: Bookie handle is not available while reading L6 E40 from bookie: 192.168.31.216:3181 ``` One of the related issues is in the auto-recovery decommission and there is one PR in the BookKeeper repo: apache#4113 However, the bookie client already knows the bk0 is down and we should send the read request to bk1 first. So we can reorder the read request based on the known bookie list. If one bookie is lost, it will reorder the lost bookie to the end of the read list. ### Modifications <!-- Describe the modifications you've done. --> Enable the `reorderReadSequence` by default for auto-recovery.
Motivation
If one ledger's ensemble is [bk0, bk1] and bk0 is down, the bookie client may send a read request to bk0 first then fail with the following errors, and resend the read request to bk1 in the end.
One of the related issues is in the auto-recovery decommission and there is one PR in the BookKeeper repo: #4113
However, the bookie client already knows the bk0 is down and we should send the read request to bk1 first. So we can reorder the read request based on the known bookie list. If one bookie is lost, it will reorder the lost bookie to the end of the read list.
Modifications
Enable the
reorderReadSequence
by default for auto-recovery.