create-a-derived-table update incremental models guidance #357

Open
tamsinforbes opened this issue Jul 17, 2023 · 0 comments
Labels
Data Platform Core Infrastructure This issue is owned by Data Platform Core Infrastructure

Comments

@tamsinforbes

To ensure tables are properly cleaned up when either the insert_overwrite or append incremental strategy is used, we first need to understand the expected behaviour of each strategy.

E.g., should a partition always be defined, should there be logic to only clean up tables when the --full-refresh option is used, etc.?

Investigate other related behaviour, such as:

  • + prefix
    • If models m1, m2, m3 have the same ancestors, does dbt run --select +m1 +m2 +m3 rerun those ancestor models 3 times?
      • No, it doesn’t - if the ancestor lists overlap, it only runs the unique set of models; dbt run --select +m1 +m1 likewise only runs a unique set
    • Does dbt run --select +m1 run the seeds that m1 depends on as well as its model ancestors?
      • No - seeds must be deployed with the dbt seed command before the models that depend on them are run with dbt run
  • seeds: how to have versioned seeds, e.g. lookup_offence
    • just have a column for version?
      • yes - the seed will grow linearly, but this keeps previous versions easy for users to access
    • or a partition?
      • yes - also partition by release so that queries are performant when users filter by release
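
The two answers about the + prefix above can be pictured with a toy graph. This is only a sketch of the observed behaviour, not dbt's implementation: the graph, the stg_a and stg_b model names, and the helper functions are all hypothetical, and lookup_offence is assumed here to be a seed that m1 depends on.

```python
# Toy model of "+model" selection. NOT dbt's implementation;
# the graph and every name below are hypothetical.
from typing import Dict, List, Set

# node -> direct parents
PARENTS: Dict[str, List[str]] = {
    "m1": ["stg_a", "lookup_offence"],
    "m2": ["stg_a"],
    "m3": ["stg_a", "stg_b"],
    "stg_a": [],
    "stg_b": [],
    "lookup_offence": [],
}
SEEDS: Set[str] = {"lookup_offence"}


def ancestors(node: str) -> Set[str]:
    """Return every upstream node of `node` (not including `node` itself)."""
    found: Set[str] = set()
    stack = list(PARENTS[node])
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(PARENTS[current])
    return found


def select(selectors: List[str]) -> Set[str]:
    """Union the nodes matched by each selector; a leading '+' adds ancestors."""
    selected: Set[str] = set()
    for selector in selectors:
        if selector.startswith("+"):
            name = selector[1:]
            selected |= {name} | ancestors(name)
        else:
            selected.add(selector)
    return selected


def models_to_run(selectors: List[str]) -> List[str]:
    """Mimic a model-only run: seeds are dropped from the selection."""
    return sorted(select(selectors) - SEEDS)


print(models_to_run(["+m1", "+m2", "+m3"]))  # ['m1', 'm2', 'm3', 'stg_a', 'stg_b']
print(models_to_run(["+m1", "+m1"]))         # ['m1', 'stg_a']
```

Because the selection is a set union, the shared ancestor stg_a appears once however many selectors reach it, and the seed never appears in the run list, which is why it has to be deployed separately beforehand.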

Done when

Further training/guidance on incremental models is added to the user guidance.

@tamsinforbes tamsinforbes added the Data Platform Core Infrastructure This issue is owned by Data Platform Core Infrastructure label Jul 17, 2023