Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HBASE-28513 The StochasticLoadBalancer should support discrete evaluations #137

Open
wants to merge 1 commit into
base: HBASE-29073
Choose a base branch
from

Conversation

rmdmattingly
Copy link

I'll rebase this on apache/master once apache#6598, #135, and #136 have shipped

Big PR here: it adds the balancer conditional framework and our first conditional implementation: replica distribution. This is an improvement on existing cost-based replica distribution for reasons that I'll dig into further. See my design doc here.

You can enable conditional replica distribution via hbase.master.balancer.stochastic.conditionals.distributeReplicas: set this to true to enable the feature.

Improvements on Replica Balancing

Primary replica balancing squashes all other considerations. The default weight for one of the several cost functions that factor into primary replica balancing is 100,000. Meanwhile the default read request cost is 5. The result is that the load balancer, OOTB, basically doesn't care about balancing actual load. To solve this, you can either set primary replica balancing costs to zero, which is fine if you don't use read replicas, or — if you do use read replicas — maybe you can produce a magic incantation of configurations that work just right, until your needs change. Conditionals provide an alternative which works much more cleanly in relation to all of the other considerations that you would like your balancer to have.

Replica cost functions don't balance secondary replicas effectively. While they'll calculate imbalance costs necessary to balance primary replicas away from secondary replicas, there is no sufficient mechanism in the existing cost functions to distribute secondary replicas appropriately. So using >2 replicas on a table has a pretty dubious value proposition. On the other hand, this conditional implementation will ensure that secondary replicas are distributed to the greatest extent that the cluster allows. Even on a relatively tiny minicluster test I was unable to demonstrate that cost-based replica balancing could distribute a 3 replica table perfectly:
cf1
cf2

….omitting the meaningless snapshots between 4 and 27…

cf28

Meanwhile, conditional based replica balancing solved this imbalance effectively:
bc1
bc2
bc3
bc4
bc5

Testing

I've written a minicluster test to validate that conditional replica balancing works on a small cluster locally, and I've written a larger scale test that mocks the StochasticLoadBalancer in hbase-balancer. This test validates that conditional balancing performance is acceptable; even at a huge scale, even with default balancer costs (which other large scale cost-based replica balancing tests have had to compromise), and even with strict consideration for secondary replicas

cc @ndimiduk

@rmdmattingly rmdmattingly requested a review from ndimiduk January 16, 2025 16:14
@@ -28,6 +28,7 @@ enum Type {
ASSIGN_REGION,
MOVE_REGION,
SWAP_REGIONS,
MOVE_BATCH,
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conditional candidate generators will look at the entire cluster state, and produce a series of moves that should move us towards compliance. To support this, I thought a "batch move" balance action made sense

@@ -711,6 +726,44 @@ enum LocalityType {
RACK
}

public List<RegionPlan> convertActionToPlans(BalanceAction action) {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RegionPlans are much simpler than balance actions because they can represent the consequences of every type of balance action. With that in mind, I chose to have balancer conditionals only be concerned with RegionPlans so that their implementations could be simpler — but it means that I need to convert BalanceActions to region plans, without applying the plans, for evaluation against conditionals. Thus, this method needed to be added

@@ -905,6 +975,52 @@ int[] addRegion(int[] regions, int regionIndex) {
return newRegions;
}

int[] removeRegions(int[] regions, Set<Integer> regionIndicesToRemove) {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method, and the below addRegions, are just a nicer way to add/remove regions from the BCS arrays in bulk

* from finding a solution.
*/
@InterfaceAudience.Private
final class BalancerConditionals implements Configurable {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This acts as a unified interface for interacting with whatever conditional(s) one might have enabled

Comment on lines +41 to +45
public enum ValidationLevel {
SERVER, // Just check server
HOST, // Check host and server
RACK // Check rack, host, and server
}
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All 3 checks will be necessary for replica balancing, but for most conditionals (I envision a system table isolation conditional, for example) my guess is that only server validation will be necessary. So I figured that this is a nice way to let implementations be as simple as possible

* least-loaded servers.
*/
@InterfaceAudience.Private
final class SlopFixingCandidateGenerator extends RegionPlanConditionalCandidateGenerator {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a very time-effective way to resolve the sloppy servers that may be produced by randomly distributing replicas across the cluster

@rmdmattingly rmdmattingly force-pushed the HBASE-28513-only-replicas branch 2 times, most recently from 475383b to ba9e1b5 Compare January 17, 2025 15:00
@rmdmattingly rmdmattingly force-pushed the HBASE-28513-only-replicas branch from ba9e1b5 to b6e3dfe Compare January 17, 2025 15:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant