User Story
As a maintainer of the Airflow Platform
I need a Terraform-driven repo, designed around our new DAGFactory functionality, from which users can sync their DAGs and roles
So that users can create new DAGs to be scheduled in the new MWAA environment in the easiest and most effective way possible.
Value / Purpose
To modernise how we manage Airflow, we need a repo that helps users understand how to create new DAGs using DAGFactory, provides them with a role built with IAM_builder that has sufficient permissions for the tasks they require, and syncs both to our MWAA environments. To do this, the repo will need to use Terraform to create the required roles in analytical-platform-data-production, with a role policy that allows assumption from the relevant compute environment. Our existing Airflow repo already produces the desired output, but it relies on Pulumi to manage and provision these roles, which is not a technology our team wishes to support.
Hypothesis
If we build a repo that simplifies the process of creating a new DAG
Then Airflow as a platform will be more approachable for users.
Proposal
The repo should present users with a template DAGFactory DAG to fill out with their specific needs. These DAGs should then be synced to the S3 bucket of the relevant MWAA environment by Terraform or awscli. Terraform should be used to provision the roles. Where roles already exist, we should transition them to Terraform management when we are ready to migrate users from the old repo. Users with a complex pre-existing DAG that they do not wish to rewrite in DAGFactory format should be able to provide a full Python DAG file instead, but we should prompt users to use DAGFactory unless they have a strong reason not to.
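As a rough illustration of the sync step, the sketch below maps DAG definitions in the repo to the S3 keys they might be uploaded to. It is only a planning helper under stated assumptions: the `dags` key prefix, the accepted file suffixes, and the function name `plan_dag_uploads` are all hypothetical and not taken from the existing repo. The upload itself (by Terraform or awscli) is deliberately left out.

```python
from pathlib import PurePosixPath
from typing import Iterable

# Assumption: DAGFactory configs are YAML files and hand-written DAGs are Python files.
DAG_SUFFIXES = {".yaml", ".yml", ".py"}


def plan_dag_uploads(relative_paths: Iterable[str], key_prefix: str = "dags") -> dict[str, str]:
    """Map repo-relative DAG paths to the S3 keys they would be synced to.

    key_prefix is a hypothetical default; use whatever DAG path the target
    MWAA environment is actually configured with.
    """
    plan = {}
    for rel in sorted(relative_paths):
        path = PurePosixPath(rel)
        # Skip anything that is not a DAG definition (READMEs, tests, and so on).
        if path.suffix in DAG_SUFFIXES:
            plan[rel] = str(PurePosixPath(key_prefix) / path)
    return plan
```

Keeping the planning step free of AWS calls means it can be checked in CI before anything is written to the bucket.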
For the roles, note that you will need a mechanism to provision a ServiceAccount in the relevant EKS cluster that the running pod can use for IRSA. This should be a solved problem, as we have already implemented it for the existing Airflow repo.
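To make the IRSA requirement concrete, here is a minimal sketch of the two pieces involved: the role trust policy that lets a pod's ServiceAccount assume the role, and the ServiceAccount annotated with the role ARN. The function names and all argument values are hypothetical, and the existing Airflow repo may structure this differently. In practice the policy would be rendered by Terraform rather than Python.

```python
def irsa_trust_policy(account_id: str, oidc_provider: str,
                      namespace: str, service_account: str) -> dict:
    """Build a trust policy allowing one Kubernetes ServiceAccount to assume a role.

    account_id is the account in which the cluster's OIDC identity provider is
    registered; oidc_provider is the issuer host and path without "https://".
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{oidc_provider}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # Restrict assumption to a single namespace and ServiceAccount.
                        f"{oidc_provider}:sub": f"system:serviceaccount:{namespace}:{service_account}",
                        f"{oidc_provider}:aud": "sts.amazonaws.com",
                    }
                },
            }
        ],
    }


def service_account_manifest(name: str, namespace: str, role_arn: str) -> dict:
    """Build a ServiceAccount manifest carrying the IRSA role annotation."""
    return {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {
            "name": name,
            "namespace": namespace,
            "annotations": {"eks.amazonaws.com/role-arn": role_arn},
        },
    }
```

Both functions return plain dictionaries, so the output can be serialised with `json.dumps` and compared against what Terraform produces.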
Additional Information
No response
Definition of Done
Repo is created
Repo can store and upload DAGs to S3
Repo can create roles in Data-Production that can be assumed by existing Airflow EKS Clusters
Repo can create service accounts for IRSA
Repo can run validation on roles/DAGs to ensure basic standards are met
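The validation item could start as something like the sketch below. The "basic standards" are not defined in this issue, so every rule here is a placeholder: the required keys, the role naming pattern, and the function name `validate_dag_entry` are hypothetical and do not reflect the real DAGFactory schema or our actual conventions.

```python
import re

# Hypothetical standards, for illustration only; replace with the agreed rules.
REQUIRED_KEYS = {"owner", "schedule", "tasks"}
ROLE_NAME_PATTERN = re.compile(r"^airflow[-_][a-z0-9_-]+$")


def validate_dag_entry(dag_id: str, config: dict, role_name: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means the entry passes."""
    errors = []
    missing = sorted(REQUIRED_KEYS - config.keys())
    if missing:
        errors.append(f"{dag_id}: missing required keys: {', '.join(missing)}")
    # Only report an empty task list when the key is present, to avoid double reporting.
    if "tasks" in config and not config["tasks"]:
        errors.append(f"{dag_id}: 'tasks' must not be empty")
    if not ROLE_NAME_PATTERN.match(role_name):
        errors.append(f"{dag_id}: role name '{role_name}' does not match the naming convention")
    return errors
```

Returning all problems at once, rather than failing on the first, lets a CI check show users everything they need to fix in a single run.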
Useful Contacts
@jhpyke
User Types
No response