(CDK Garbage Collection): stack-scoped garbage collection #32799
Labels
effort/medium
Medium work item – several days of effort
feature-request
A feature should be added or improved.
p2
package/tools
Related to AWS CDK Tools or CLI
Describe the feature
We recently launched CDK Garbage Collection in CDK v2.165.0. This version of garbage collection is scoped to an individual environment (account + region) due to legacy constraints of the CDK Assets mechanism. With a modernized CDK Assets mechanism, we can scope CDK Garbage Collection to each individual stack, which will better fit the mental model of CDK customers. It will also fix a theoretical race condition that exists in CDK Garbage Collection today. See: https://github.com/aws/aws-cdk/tree/main/packages/aws-cdk#theoretical-race-condition-with-review_in_progress-stacks
Use Case
Customers who want to garbage collect only the assets managed by their own CDK app, without having to account for other stacks in the same account/region.
Proposed Solution
Background:
Garbage Collection was completed on 10/25/2024, following the design in the CDK Garbage Collection Design Doc. The main requirement for that design was that garbage collection would fit with the existing asset mechanism so that customers would be able to retroactively clean up their bootstrapped resources. While the initial Garbage Collection achieves exactly that, it comes with the following caveats:
Goal:
A better version of Garbage Collection would be one that can operate on a per-stack basis. This would have the benefit of being a much more contained scope for a delete operation.
Design:
We cannot achieve this with the current version of the asset mechanism because all assets are named via their content-based hash. This means that different stacks in the same environment can share the same asset. The fact that one stack no longer uses a particular asset is not enough to conclude that the asset is unused, because other stacks could still be referencing it.
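To make the sharing problem concrete, here is a minimal sketch. The types and function names (`StackAssets`, `unusedByStack`, `unusedInEnvironment`) are hypothetical and are not part of the CDK CLI; they only model "which hashes does each stack reference".

```typescript
// Hypothetical model: stack name -> content hashes referenced by its template.
type StackAssets = { [stackName: string]: string[] };

// Naive per-stack view: every asset this one stack does not reference.
// With content-hash naming this is NOT safe to delete.
function unusedByStack(bucket: string[], stacks: StackAssets, stackName: string): string[] {
  const mine = stacks[stackName] || [];
  return bucket.filter((hash) => mine.indexOf(hash) === -1);
}

// Environment-wide view: an asset is unused only if no stack references it.
function unusedInEnvironment(bucket: string[], stacks: StackAssets): string[] {
  const names = Object.keys(stacks);
  return bucket.filter((hash) =>
    names.every((name) => stacks[name].indexOf(hash) === -1)
  );
}

// StackA stopped using "abc", but StackB still references the same object.
const exampleBucket = ["abc", "def", "xyz"];
const exampleStacks: StackAssets = { StackA: ["def"], StackB: ["abc"] };

console.log(unusedByStack(exampleBucket, exampleStacks, "StackA"));
console.log(unusedInEnvironment(exampleBucket, exampleStacks));
```

The per-stack view flags `abc` as unused even though StackB still depends on it, which is why garbage collection currently has to consider every stack in the environment.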
A new asset upload mechanism would need to ensure each asset is uploaded with an identifier tied to its stack. That can look something like this:
/assets/MyStack/.zip
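As a rough sketch of how such a layout could narrow the scope of a delete, assume each object key is built from a stack identifier plus the asset's content hash. The helper names and the exact key shape below are illustrative assumptions, not an existing CDK API.

```typescript
// Hypothetical key layout: /assets/<stackId>/<assetHash>.<extension>
function stackScopedKey(stackId: string, assetHash: string, extension: string): string {
  return `/assets/${stackId}/${assetHash}.${extension}`;
}

// Returns the keys under this stack's prefix whose hash the stack no longer
// references. Keys under any other stack's prefix are never considered.
function collectableForStack(
  bucketKeys: string[],
  stackId: string,
  referencedHashes: string[]
): string[] {
  const prefix = `/assets/${stackId}/`;
  return bucketKeys.filter((key) => {
    if (key.indexOf(prefix) !== 0) {
      return false; // belongs to another stack
    }
    const fileName = key.slice(prefix.length);
    const hash = fileName.split(".")[0];
    return referencedHashes.indexOf(hash) === -1;
  });
}

const keys = [
  stackScopedKey("StackA", "abc", "zip"),
  stackScopedKey("StackA", "def", "zip"),
  stackScopedKey("StackB", "abc", "zip"),
];

// StackA currently references only "def".
console.log(collectableForStack(keys, "StackA", ["def"]));
```

Because the same content may be uploaded once per stack, this trades some duplicate storage for the ability to decide deletions by looking at a single stack.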
The complexity here would be that a) stacks can be renamed at deploy time, and b) nested stacks would need to be handled correctly.
For a), we would need to make sure that the stack identifier is unique to the stack and traceable back to the stack even if the stack name changes. For this, we can likely reuse template metadata to trace the uploaded identifier back to the actual stack it represents.
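One way to picture the metadata idea: a stable identifier is written into the template's top-level `Metadata` section at synth time, and garbage collection resolves an asset prefix to a deployed stack by that identifier rather than by stack name. The metadata key `ExampleAssetScopeId` and the types below are made up for illustration; no such key exists in CDK today.

```typescript
// Hypothetical metadata key holding the stable, rename-proof identifier.
const SCOPE_ID_KEY = "ExampleAssetScopeId";

// Simplified view of a deployed stack and the part of its template we need.
interface DeployedStack {
  stackName: string;
  template: { Metadata?: { [key: string]: string } };
}

// Find the deployed stack that owns a given asset scope, regardless of
// what the stack is currently named.
function findStackByScopeId(
  stacks: DeployedStack[],
  scopeId: string
): DeployedStack | undefined {
  for (const stack of stacks) {
    const metadata = stack.template.Metadata;
    if (metadata !== undefined && metadata[SCOPE_ID_KEY] === scopeId) {
      return stack;
    }
  }
  return undefined;
}

// The stack was renamed at deploy time, but its scope identifier is unchanged.
const deployed: DeployedStack[] = [
  { stackName: "RenamedAtDeployTime", template: { Metadata: { [SCOPE_ID_KEY]: "scope-1" } } },
  { stackName: "LegacyStack", template: {} },
];

const owner = findStackByScopeId(deployed, "scope-1");
console.log(owner ? owner.stackName : "no owner found");
```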
For b), TBD
Migrating from old to new:
Customers migrating from the old asset mechanism would see all their assets reuploaded the first time, but there should be no problem beyond that.
Why should we do this?
This will result in a cleaner experience overall for both cdk gc and assets. In the past, CDK has treated the asset mechanism as an implementation detail, but in practice customers are confused or concerned that assets are not separated out per-stack. This will align better with our customers’ understanding that CDK stacks are independent of each other. For cdk gc, the operation would take a trivial amount of time.
Why should we not do this?
We already have a system that improves on our bootstrap system, called the App Staging Synthesizer. The idea is to bootstrap resources per-stack to separate out bootstrap entirely. We could instead invest more in migrating customers to that system, which negates the need for garbage collection entirely.
Other Information
This may eventually become an RFC when we decide to pick this up. For now, if this is something you are interested in, please 👍 this issue.
Acknowledgements
CDK version used
2.165.0
Environment details (OS name and version, etc.)
Mac