[Task]: Build a cleaner for assets in the GCP test environment #33644

Open
1 of 17 tasks
pabloem opened this issue Jan 17, 2025 · 3 comments

Comments

@pabloem
Member

pabloem commented Jan 17, 2025

What needs to happen?

Many tests and examples in the Beam codebase create GCP assets that are not cleaned up. Some examples:

  • BQ datasets
  • Pub/Sub topics and subscriptions
  • Bigtable instances?
  • Others?

The task is to build automation that can safely disable and then clean up leftover assets in the GCP environment. The tool would:

  • List existing inventory of various kinds of assets
  • Figure out which ones are meant to be cleaned up (i.e. short-lived test assets, as opposed to long-lived project assets such as base sources and base buckets).
  • Wait a week or so
  • Disable and (eventually?) delete these assets
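
To make the steps above concrete, here is a rough sketch of the decision logic only, with no GCP calls. Everything in it is an assumption for illustration: the `Asset` record, the `is_test_asset` flag (standing in for whatever classification we end up with), the 7-day grace period (from "a week or so"), and the 14-day delete threshold (the issue does not say when deletion should happen).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from enum import Enum


class Action(Enum):
    KEEP = "keep"        # long-lived project asset, never touched
    WAIT = "wait"        # test asset, but not old enough for the next step
    DISABLE = "disable"  # past the grace period, not yet disabled
    DELETE = "delete"    # disabled and past the delete threshold


@dataclass(frozen=True)
class Asset:
    # Hypothetical inventory record; field names are illustrative only.
    kind: str              # e.g. "bigquery_dataset", "pubsub_topic"
    name: str
    created: datetime
    is_test_asset: bool    # outcome of the classification step
    disabled: bool = False


# Assumed thresholds; the issue only says "a week or so".
GRACE_PERIOD = timedelta(days=7)
DELETE_AFTER = timedelta(days=14)


def plan_action(asset: Asset, now: datetime) -> Action:
    """Decide what the cleaner should do with one asset at time `now`."""
    if not asset.is_test_asset:
        return Action.KEEP
    age = now - asset.created
    if age < GRACE_PERIOD:
        return Action.WAIT
    if not asset.disabled:
        return Action.DISABLE
    if age >= DELETE_AFTER:
        return Action.DELETE
    return Action.WAIT


if __name__ == "__main__":
    now = datetime(2025, 1, 31, tzinfo=timezone.utc)
    stale = Asset("bigquery_dataset", "beam_test_1", now - timedelta(days=8), True)
    print(plan_action(stale, now).value)
```

Keeping the decision as a pure function of the inventory and the current time would let a dry-run mode print the plan before anything is disabled or deleted.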

Why?

  • A lot of these assets have billing implications for the project
  • A lot of these assets make navigating the project more difficult (e.g. reviewing BigQuery datasets)

Risks:

  • We need to roll this out carefully to make sure we don't delete important assets. We'd probably start with strict lists of inclusions (i.e. everything that's not included will be excluded).
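
One way the strict inclusion list could look, as a minimal sketch: an asset is a cleanup candidate only if its kind is listed and its name matches one of the listed patterns, so anything unlisted is excluded by default. The asset kinds and the name patterns below are made up for illustration and are not actual Beam naming conventions.

```python
from fnmatch import fnmatchcase

# Hypothetical inclusion list: asset kind -> name patterns that mark test assets.
# Anything not matched here is left alone.
INCLUDE_PATTERNS = {
    "bigquery_dataset": ["beam_test_*", "tmp_it_*"],
    "pubsub_topic": ["beam_test_*"],
}


def is_cleanup_candidate(kind: str, name: str, include=INCLUDE_PATTERNS) -> bool:
    """Return True only when `name` matches an inclusion pattern for `kind`."""
    return any(fnmatchcase(name, pattern) for pattern in include.get(kind, ()))


if __name__ == "__main__":
    for kind, name in [
        ("bigquery_dataset", "beam_test_1234"),
        ("bigquery_dataset", "base_sources"),
        ("bigtable_instance", "beam_test_1234"),
    ]:
        print(kind, name, is_cleanup_candidate(kind, name))
```

Because unknown kinds fall through to an empty pattern list, adding a new asset type to the inventory does not make it deletable until someone explicitly lists it.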

Issue Priority

Priority: 2 (default / most normal work should be filed as P2)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam YAML
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Infrastructure
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner
@pabloem
Member Author

pabloem commented Jan 17, 2025

Sample of this issue: #19760

@Abacn
Contributor

Abacn commented Jan 23, 2025

The Beam repo's CI/CD already has a workflow that runs daily to clean up these resources:

https://github.com/apache/beam/blob/master/.github/workflows/beam_CleanUpGCPResources.yml

However, it does not cover tests that run in a fork.

@liferoad
Contributor

cc @Amar3tto
