[SNOW-90] Introduce CI job to test schemachange updates against a clone #78
base: dev
@philerooski / @jaymedina , I think there should be another ticket to fully determine whether the DATA_ENGINEER role should just be the OWNER of Similarly, since the ^ That said, I'm not tied to that, and I'll let Phil decide when we get to that design ticket. |
snow sql -q "GRANT OWNERSHIP ON ALL TABLES IN SCHEMA ${SNOWFLAKE_SYNAPSE_DATA_WAREHOUSE_DATABASE}.SYNAPSE TO ROLE DATA_ENGINEER REVOKE CURRENT GRANTS;"

# Transfer ownership of: dynamic tables
snow sql -q "GRANT OWNERSHIP ON ALL DYNAMIC TABLES IN SCHEMA ${SNOWFLAKE_SYNAPSE_DATA_WAREHOUSE_DATABASE}.SYNAPSE TO ROLE DATA_ENGINEER REVOKE CURRENT GRANTS;"
Do we want to COPY CURRENT GRANTS rather than REVOKE so that we can imitate the source object as closely as possible?
Initially, I would say no, mainly because the only role that should be interacting with the clone and its objects is `DATA_ENGINEER`, for the purpose of testing and validation anyway. Also because `COPY CURRENT GRANTS` would copy over `DATA_ENGINEER`'s previous grants before it became owner, so it would be redundant.
I'm wondering how we might test changes to privileges. If we've already revoked current grants on an object, then how can we check that whatever changes we've made to privileges granted on an object reflect what we are expecting?
For example:
`object_dev`
- (ownership privilege)
- privilege a
- privilege b
- privilege c

`object_clone` (before schemachange)
- (ownership privilege)

`object_clone` (after schemachange)
- (ownership privilege)
- privilege d

Assuming that schemachange applied some arbitrary change set to the privileges (like adding privilege d, revoking some other privileges), how can we test `grants on object_dev + our change set = grants we expect on this object`?
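One way to frame this check is as set arithmetic: start from the grants on the source object, apply the change set, and compare the result with what the clone actually has. The following is a minimal sketch and not part of this PR; `expected_grants` is a hypothetical helper, the privilege names are placeholders, and it assumes single-word privilege names. In practice the actual list would have to be read from the clone itself (for example from `SHOW GRANTS ON ...` output), which is not shown here.

```shell
# Hypothetical helper (not part of this PR): grants we expect on the clone.
#   $1 = grants on the source object (space-separated)
#   $2 = privileges added by the change set
#   $3 = privileges revoked by the change set
# Assumes single-word privilege names such as SELECT or INSERT.
expected_grants() {
  for p in $1 $2; do
    case " $3 " in
      *" $p "*) ;;        # revoked by the change set: leave it out
      *) echo "$p" ;;     # kept from the source or newly added
    esac
  done | sort -u
}

# Placeholder data standing in for real SHOW GRANTS output.
expected=$(expected_grants "OWNERSHIP SELECT INSERT" "UPDATE" "INSERT")
actual=$(printf '%s\n' OWNERSHIP SELECT UPDATE | sort -u)

if [ "$expected" = "$actual" ]; then
  echo "grants match"
else
  echo "grants differ"
fi
```

Note that a comparison like this only makes sense if the clone starts with the same grants as the source object, which is the case being argued for here.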
This is a good point - If we plan to add tests for updates in object privileges, then I agree we should not revoke any grants. I'll do a test to make sure `COPY CURRENT GRANTS` doesn't override `DATA_ENGINEER`'s new ownership, and make a commit for this. Thanks!
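For reference, the change discussed here would amount to swapping `REVOKE CURRENT GRANTS` for `COPY CURRENT GRANTS` in the ownership-transfer statements. This is a minimal sketch, assuming the same `snow sql -q` invocation used in the diff; `ownership_sql` and the `MY_CLONE` database name are hypothetical, and the function only prints the statement rather than running it.

```shell
# Hypothetical helper: build the ownership-transfer statement.
#   $1 = plural object type (e.g. TABLES or "DYNAMIC TABLES")
#   $2 = fully qualified schema
#   $3 = role that should own the objects
ownership_sql() {
  printf 'GRANT OWNERSHIP ON ALL %s IN SCHEMA %s TO ROLE %s COPY CURRENT GRANTS;\n' "$1" "$2" "$3"
}

# Print the statements for a placeholder clone named MY_CLONE.
ownership_sql "TABLES" "MY_CLONE.SYNAPSE" "DATA_ENGINEER"
ownership_sql "DYNAMIC TABLES" "MY_CLONE.SYNAPSE" "DATA_ENGINEER"

# To run one for real, it could be passed to the CLI as in the diff:
#   snow sql -q "$(ownership_sql "TABLES" "MY_CLONE.SYNAPSE" "DATA_ENGINEER")"
```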
Something to note: The clone is built off the latest |
Just to add the |
From the conversation here, we can clone off of the |
@jaymedina let's hold off on making changes until we can all get our bearings on this. Ideally the The point of cloning DEV is that queries run faster, since it has less data than prod but a similar structure
Quality Gate passed
drop_clone:
  runs-on: ubuntu-latest
  if: github.event.pull_request.merged == true || github.event.action == 'closed'
  environment: dev
  env:
    SNOWFLAKE_PASSWORD: ${{ secrets.SNOWSQL_PWD }}
    SNOWFLAKE_ACCOUNT: ${{ secrets.SNOWSQL_ACCOUNT }}
    SNOWFLAKE_USER: ${{ secrets.SNOWSQL_USER }}
    SNOWFLAKE_WAREHOUSE: ${{ secrets.SNOWSQL_WAREHOUSE }}
    SNOWFLAKE_CLONE_ROLE: DATA_ENGINEER
    CLONE_NAME: "${{ vars.SNOWFLAKE_SYNAPSE_DATA_WAREHOUSE_DATABASE }}_${{ github.head_ref }}"
If I am understanding correctly, this will only drop the clone when the PR is merged or closed. However, a cloned DB is created on any push when the PR is labeled and open.
Would this mean that only the last `github.head_ref` is cleaned up? In the end we would want to clean up both the `head_ref` when things are merged, AND, when a new push is added (and `head_ref` is updated), we'd also want to drop the previous cloned DB.
> Would this mean that only the last github.head_ref is cleaned up?

Good question - Yes, this means that the last `github.head_ref` is cleaned up. However, this is not a concern because it's the only cloned DB on Snowflake for a given feature branch. All predecessor clones just get replaced by a new clone.
So for example if `head_ref` is `snow-90-my-branch`, the procedure goes like this:
- an initial commit is made + job is triggered
- new clone is made named: `synapse_data_warehouse_dev_snow_90_my_branch`
- schemachange runs
- another commit is made
- old clone is replaced by the new clone of `synapse_data_warehouse_dev` with a `CREATE OR REPLACE`. the new clone has the same name as the last clone, so it does a `REPLACE`
- schemachange runs
- rinse and repeat
- when it's time to merge, the clone is dropped (if it still exists). the drop clone job assumes the name hasn't been changed since its initial create, so this is a potential edge case actually, but it's probably enough to just say "don't change the clone name" in the CONTRIBUTIONS doc
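The lifecycle above can be sketched as follows. This is an illustration under stated assumptions, not the workflow's actual script: `clone_name` is a hypothetical helper, and the hyphen-to-underscore substitution is inferred from the example name above. The statements are only printed, not executed.

```shell
# Hypothetical helper: derive the clone name from the source database and branch.
# Assumption: hyphens in the branch name become underscores, as in the example above.
clone_name() {
  printf '%s_%s\n' "$1" "$2" | sed 's/-/_/g'
}

source_db="synapse_data_warehouse_dev"
clone=$(clone_name "$source_db" "snow-90-my-branch")

# On every push: the same name is reused, so the previous clone is replaced.
echo "CREATE OR REPLACE DATABASE ${clone} CLONE ${source_db};"

# On merge or close: drop the clone if it still exists.
echo "DROP DATABASE IF EXISTS ${clone};"
```

Because the name depends only on the source database and the branch, each feature branch maps to exactly one clone, which is why a single drop at merge time is enough as long as the name is not changed.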
problem
Automating the testing of `schemachange` updates in our branch before merging it into `dev`, rather than doing it manually, will speed up the development life cycle. A design document was made to tackle this problem and is available here.
solution
`test_with_clone`, to trigger the automated process of zero-copy cloning `synapse_data_warehouse_dev` and testing new/modified schemachange scripts against it
testing
`create_clone_and_run_schemachange` triggers the CI job to run
`dev` db and not the `prod` db
`drop_clone` to run
`create_clone_and_run_schemachange` to run
`create_clone_and_run_schemachange` does not trigger the CI job to run