📖 Airflow 3.0 - Initial Infrastructure Buildout for Airflow-Dev #6542

Open
4 tasks
Tracked by #6543
jhpyke opened this issue Jan 13, 2025 · 1 comment

jhpyke commented Jan 13, 2025

User Story

As a maintainer of Airflow
I need a modern platform that can be easily maintained and updated
So that I can continue providing a high-quality service.

Value / Purpose

Following on from #6096, we should start building out the infrastructure for our new Airflow environments. I believe the minimum infrastructure required is:

  • An S3 bucket for holding DAGs and Config files (in Analytical-Platform-Compute)
  • A requirements file for Airflow 2.10.3
  • A VPC (may be shared/existing)
  • A Security Group with Inbound/Outbound rules (may be shared/existing)
  • An MWAA environment (mw1.medium)
  • Five CloudWatch log groups (Task, Web Server, Scheduler, Worker, DAG Processing)
  • A Node Execution Role with sufficient permissions to pull images from the required image repositories.
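
As a rough illustration of the requirements file item in the list above, the sketch below renders a requirements.txt pinned to the upstream constraints file for Airflow 2.10.3. The constraints URL follows the pattern published by the Apache Airflow project. The Python version (3.11), the helper names, and the example provider package are assumptions for illustration and are not specified in this ticket.

```python
# Sketch only: helper names, the Python version default, and the example
# package list are assumptions, not part of the agreed architecture.
AIRFLOW_VERSION = "2.10.3"  # version named in this issue


def constraints_url(airflow_version: str, python_version: str = "3.11") -> str:
    """Build the upstream Airflow constraints URL for a given release."""
    return (
        "https://raw.githubusercontent.com/apache/airflow/"
        f"constraints-{airflow_version}/constraints-{python_version}.txt"
    )


def render_requirements(airflow_version: str, packages: list[str]) -> str:
    """Return requirements.txt content: a constraint line, then the packages."""
    lines = [f'--constraint "{constraints_url(airflow_version)}"']
    lines.extend(packages)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # The Kubernetes provider is listed because the Definition of Done
    # mentions scheduling a pod; treat it as an assumed example package.
    print(render_requirements(
        AIRFLOW_VERSION,
        ["apache-airflow-providers-cncf-kubernetes"],
    ))
```

Pinning to the constraints file keeps provider versions compatible with the chosen Airflow release, which should make later upgrades a matter of changing one version string.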

Once the MWAA environment is built, it unlocks several strands of work (making images accessible, designing the DAG Factory/IAM Builder setup, image scanning and monitoring, etc.), because the core infrastructure will be in place for other areas to reference.

Useful Contacts

@jhpyke, @jacobwoffenden

User Types

Airflow Users

Hypothesis

If we build a bare-minimum MWAA environment targeting the current Airflow version
Then we will be able to build out the rest of the required infra.

Proposal

We should provision the resources laid out in the agreed architecture diagram as Terraform in this repo. Once we have an environment we can upload DAGs to, we can work on and test many of the subsequent tickets effectively.

Additional Information

No response

Definition of Done

  • We have an MWAA environment running in Analytical-Platform-Compute
  • We are able to execute a DAG within that environment
  • A Kubernetes pod can be scheduled from that environment
  • We are able to retrieve all relevant logs from CloudWatch
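
To check the last point, it helps to know which log groups to look in. The sketch below derives the five log group names using the airflow-<environment name>-<log type> naming convention described in the MWAA documentation. The environment name airflow-dev is a placeholder and the helper name is hypothetical, so verify the names against the deployed environment.

```python
# Sketch only: "airflow-dev" is a placeholder environment name, and the
# suffixes assume MWAA's documented airflow-<env>-<LogType> convention.
LOG_TYPE_SUFFIXES = {
    "Task": "Task",
    "Web Server": "WebServer",
    "Scheduler": "Scheduler",
    "Worker": "Worker",
    "DAG Processing": "DAGProcessing",
}


def mwaa_log_groups(environment_name: str) -> list[str]:
    """Return the expected CloudWatch log group name for each log type."""
    return [
        f"airflow-{environment_name}-{suffix}"
        for suffix in LOG_TYPE_SUFFIXES.values()
    ]


if __name__ == "__main__":
    for group in mwaa_log_groups("airflow-dev"):
        print(group)
```

Each name can then be checked in the CloudWatch console or passed to whatever log tooling the team already uses.
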
jhpyke added the story label Jan 13, 2025
jhpyke moved this from 👀 TODO to Backlog in Analytical Platform Jan 13, 2025
jhpyke changed the title from "📖 Airflow 3.0 - Initial Infrastructure Buildout" to "📖 Airflow 3.0 - Initial Infrastructure Buildout for Airflow-Dev" Jan 13, 2025

jhpyke commented Jan 13, 2025

In practice, much of the work captured in this ticket is already complete in the form of this PoC (Zephyr). It only needs mild cleanup to ensure the functionality is fully in line with the architecture diagram.
