create-a-derived-table update incremental models guidance #357

tamsinforbes · 2023-07-17T14:14:31Z

To ensure tables are properly cleaned up when either the insert_overwrite or the append incremental strategy is defined, we need to better understand the expected behaviour of these strategies.

For example, should a partition always be defined? Should there be logic to clean up tables only when the --full-refresh option is used?

Investigate other related behaviour such as

+ prefix
- If models m1, m2, m3 have the same ancestors, does dbt run --select +m1 +m2 +m3 rerun those ancestor models 3 times?
  - No, it doesn't. If the dependency lists overlap, dbt runs only the unique set of models. Even dbt run --select +m1 +m1 still runs only a unique set.
- Does dbt run --select +m1 run the seeds that m1 depends on, as well as its model ancestors?
  - No. Seeds must be deployed with the dbt seed command before the models that depend on them are run with dbt run.
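
The de-duplication described above can be pictured as a set union over each selector's ancestors. The following is a minimal sketch of that set semantics only, not dbt's actual implementation; the `PARENTS` graph and the `stg_a`, `stg_b`, `raw` model names are hypothetical.

```python
# Hypothetical DAG: each model maps to its direct parents.
PARENTS = {
    "m1": ["stg_a", "stg_b"],
    "m2": ["stg_a", "stg_b"],
    "m3": ["stg_b"],
    "stg_a": ["raw"],
    "stg_b": ["raw"],
    "raw": [],
}


def with_ancestors(model, parents):
    """Return the model plus all of its ancestors (what `+model` selects)."""
    selected = set()
    stack = [model]
    while stack:
        node = stack.pop()
        if node not in selected:
            selected.add(node)
            stack.extend(parents[node])
    return selected


def select(selectors, parents):
    """Union the node sets of several `+model` selectors."""
    selected = set()
    for selector in selectors:
        selected |= with_ancestors(selector.lstrip("+"), parents)
    return selected
```

Because the result is a set, `select(["+m1", "+m2", "+m3"], PARENTS)` contains each shared ancestor once, and `select(["+m1", "+m1"], PARENTS)` equals `select(["+m1"], PARENTS)`.
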
seeds: how to have versioned seeds, e.g. lookup_offence?
- Just have a column for version?
  - Yes. The seed will grow linearly, but this preserves easy access to previous versions for users.
- Or a partition?
  - Yes. Also partition by release so that queries are performant when users filter by release.
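
As an illustration of the version-column approach, here is a minimal sketch of how a user might pick out one release from a versioned seed. The column names (`release`, `offence_code`, `offence_description`) and the rows are hypothetical and are not taken from the real lookup_offence seed.

```python
import csv
import io

# Hypothetical contents of a versioned seed; column names and rows are assumptions.
SEED_CSV = """release,offence_code,offence_description
v1,001,Theft
v2,001,Theft from a person
v2,002,Burglary
"""


def rows_for_release(seed_text, release):
    """Return the seed rows that belong to a single release."""
    reader = csv.DictReader(io.StringIO(seed_text))
    return [row for row in reader if row["release"] == release]
```

Every release stays in the same file, so earlier versions remain queryable with the same filter; partitioning the deployed table by the same column would let the query engine skip the other releases.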

Done when

Further training/guidance on incremental models is added to the user guidance.

tamsinforbes added the Data Platform Core Infrastructure label Jul 17, 2023