Background
We had two stacks: one with 100% weight, and a broken one (failed deployment) with 0% weight.
After deploying a third stack, our CD system tried to switch traffic to it:
13:35:56.202 Running: /tools/run registry.opensource.zalan.do/stups/toolchain-stups:22 -- senza traffic purchase-orders-management.yaml 201904041320 100 --region eu-central-1
13:35:59.030 Calculating new weights.. OK
13:35:59.031 Stack Name │Version │Identifier │Old Weight%│Delta │Compensation│New Weight%│Current
13:35:59.031 purchase-orders-management purchase-orders-management-201904031151 0.0 0.0
13:35:59.031 purchase-orders-management 201903281417 purchase-orders-management-201903281417 100.0 -100.0 0.0
13:35:59.031 purchase-orders-management 201904041320 purchase-orders-management-201904041320 0.0 100.0 100.0 <
13:36:01.074 Setting weights for purchase-orders-management.goodbuy.zalan.do...Validation Error: Stack:arn:aws:cloudformation:eu-central-1:383379053614:stack/purchase-orders-management-201904031151/0ecefee0-56ca-11e9-99be-026d43bbed96 is in CREATE_FAILED state and can not be updated.
So the traffic switching failed because of the broken stack. So far, so good.
Problem
But when we looked at the weights later, they looked like this:
$ senza traffic purchase-orders-management
Stack Name │Version │Identifier │Weight%
purchase-orders-management purchase-orders-management-201904031151 0.0
purchase-orders-management 201903281417 purchase-orders-management-201903281417 0.0
purchase-orders-management 201904041320 purchase-orders-management-201904041320 0.0
So now all stacks (including the broken one) had a weight of 0.0. That is definitely not correct.
Guess at what happened
Looking into the code of senza traffic, it looks like the command computes the new percentages (and displays them, as we can see), and then goes through them one-by-one, issuing the API call to change the weights. As soon as one of them fails, the whole command stops.
The effect here seems to be that version 201903281417 is set to 0 first, then the update of the broken stack is attempted (which fails), and setting 201904041320 to 100 is never even attempted.
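To illustrate the suspected failure mode, here is a minimal simulation. It is not senza's actual code: `set_weight`, `ValidationError`, and the in-memory `weights` dict are hypothetical stand-ins for the real API calls, and the update order is assumed to match the one described above.

```python
class ValidationError(Exception):
    """Hypothetical stand-in for the error returned for a CREATE_FAILED stack."""


def set_weight(weights, broken, identifier, weight):
    # Hypothetical stand-in for the per-stack API call that changes a weight.
    if identifier in broken:
        raise ValidationError(f"{identifier} is in CREATE_FAILED state and can not be updated")
    weights[identifier] = weight


# State before the traffic switch, as described in the background.
weights = {
    "purchase-orders-management-201904031151": 0.0,    # broken stack
    "purchase-orders-management-201903281417": 100.0,  # old stack
    "purchase-orders-management-201904041320": 0.0,    # new stack
}
broken = {"purchase-orders-management-201904031151"}

# Assumed processing order: old stack, then broken stack, then new stack.
new_weights = {
    "purchase-orders-management-201903281417": 0.0,
    "purchase-orders-management-201904031151": 0.0,
    "purchase-orders-management-201904041320": 100.0,
}

try:
    for identifier, weight in new_weights.items():
        set_weight(weights, broken, identifier, weight)  # stops at the first failure
except ValidationError as error:
    print("Validation Error:", error)

print(weights)  # every stack is left at 0.0
```

Under these assumptions, the old stack is already at 0 when the broken stack raises, and the new stack is never reached.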
What should happen
When switching traffic, the weights that increase should be applied before the weights that decrease, so that a failure partway through cannot leave every stack at 0.
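A rough sketch of that ordering, again not senza's implementation: `plan_updates` is a hypothetical helper that sorts the pending changes so that increases come first. Skipping stacks whose weight does not change is an additional assumption of this sketch; it would also avoid touching the broken stack at all in the scenario above.

```python
def plan_updates(current, target):
    """Return (identifier, new_weight) pairs, largest increase first.

    Hypothetical helper: stacks whose weight stays the same are skipped
    (an assumption of this sketch), and the remaining changes are sorted
    by delta in descending order so increases precede decreases.
    """
    changes = [
        (identifier, weight)
        for identifier, weight in target.items()
        if weight != current.get(identifier, 0.0)
    ]
    changes.sort(key=lambda change: change[1] - current.get(change[0], 0.0), reverse=True)
    return changes


current = {
    "purchase-orders-management-201904031151": 0.0,    # broken stack
    "purchase-orders-management-201903281417": 100.0,  # old stack
    "purchase-orders-management-201904041320": 0.0,    # new stack
}
target = {
    "purchase-orders-management-201904031151": 0.0,
    "purchase-orders-management-201903281417": 0.0,
    "purchase-orders-management-201904041320": 100.0,
}

plan = plan_updates(current, target)
for identifier, weight in plan:
    print(identifier, weight)
```

With this plan the new stack is raised to 100 before the old one is lowered, so an error in between would leave traffic split across both stacks rather than routed nowhere.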