PR #53 and PR #55 include the following updates:
- This release supports running the package on multiple Mixpanel sources at once! See the README for details on how to leverage this feature.
- This was achieved through the introduction of new unioning macros.
Please note: This is a Breaking Change in that we have added a new field, `source_relation`, that points to the source connection from which the record originated. This `source_relation` field is now part of all generated unique keys. This will require running a full refresh.
- Provided missing column yml documentation.
- Added Quickstart model counts to README. (#56)
- Corrected references to connectors and connections in the README. (#56)
PR #49 includes the following updates:
⚠️ Since the following changes result in the table format changing, we recommend running a `--full-refresh` after upgrading to this version to avoid possible incremental failures.
- For Databricks All-Purpose clusters, incremental models will now be materialized using the delta table format (previously parquet).
  - Delta tables are generally more performant than parquet and are also more widely available for Databricks users. This will also prevent compilation issues on customers' managed tables.
- For Databricks SQL Warehouses, incremental materialization will not be used due to the incompatibility of the `insert_overwrite` strategy.
- The `is_incremental_compatible` macro has been added and will return `true` if the target warehouse supports our chosen incremental strategy.
  - This update was applied because other Databricks runtimes have been discovered (i.e. an endpoint and external runtime) that do not support the `insert_overwrite` incremental strategy used.
- Added integration testing for Databricks SQL Warehouse.
- Added consistency tests for models:
  - `mixpanel__daily_events`
  - `mixpanel__event`
  - `mixpanel__monthly_events`
  - `mixpanel__sessions`
- Updated logic for macro `mixpanel_lookback` to align with logic used in similar macros in other packages.
PR #41 includes the following updates:
⚠️ Since the following changes are breaking, a `--full-refresh` after upgrading will be required.
- Added a default 7-day look-back to incremental models to accommodate late arriving events. The number of days can be changed by setting the var `lookback_window` in your `dbt_project.yml`. See the Lookback Window section of the README for more details.
  - Note: This replaces the variable `sessionization_trailing_window`, which was previously used in the `mixpanel__sessions` model. This variable was replaced due to the change in the incremental and lookback strategy.
- Performance improvements:
  - Updated the incremental strategy for the following models to `insert_overwrite` for BigQuery and Databricks and `delete+insert` for all other supported warehouses.
    - `stg_mixpanel__user_event_date_spine`
    - `mixpanel__event`
    - `mixpanel__daily_events`
    - `mixpanel__monthly_events`
    - `mixpanel__sessions`
  - Removed `stg_mixpanel__event_tmp` in favor of ephemeral model `stg_mixpanel__event`. This is to reduce redundancy of models created and reduce the number of full scans.
  - Updated the materialization of `stg_mixpanel__user_first_event` from a table to a view. This model is used in one downstream model, so a view will reduce storage requirements while not significantly hindering performance.
  - For Snowflake and BigQuery destinations, added `cluster_by` columns to the configs for incremental models.
  - For Databricks destinations, updated incremental model file formats to `parquet` for compatibility with the `insert_overwrite` strategy.
- Added column `dbt_run_date` to incremental end models to capture the date a record was added or updated by this package.
- Added `_fivetran_id` to the `mixpanel__event` model, since this is the source `event` table's primary key as of the March 2023 connector release notes.
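
The `lookback_window` variable mentioned above could be set along these lines. This is a minimal sketch, assuming the variable is declared under the global `vars` block of your root `dbt_project.yml`; the value `3` is an arbitrary illustration, and the README's Lookback Window section remains the authoritative reference for scoping.

```yml
# dbt_project.yml (root project)
vars:
  # Illustrative value only; the package default is 7 days.
  lookback_window: 3
```

A larger window reprocesses more days on each incremental run, so it trades run time for tolerance of late arriving events.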
Note: If you run into issues with this update, we suggest trying a full refresh.
- Databricks and Postgres compatibility! (PR #33)
- Updated incremental strategy for the following incremental models (PR #33):
- mixpanel__daily_events
- mixpanel__event
- mixpanel__monthly_events
- mixpanel__sessions
- stg_mixpanel__user_event_date_spine
- Incorporated the new `fivetran_utils.drop_schemas_automation` macro into the end of each Buildkite integration test job. (PR #32)
- Updated the pull request templates. (PR #32)
PR #28 includes the following breaking changes:
- Dispatch update for dbt-utils to dbt-core cross-db macros migration. Specifically, `{{ dbt_utils.<macro> }}` calls have been updated to `{{ dbt.<macro> }}` for the below macros:
  - `any_value`
  - `bool_or`
  - `cast_bool_to_text`
  - `concat`
  - `date_trunc`
  - `dateadd`
  - `datediff`
  - `escape_single_quotes`
  - `except`
  - `hash`
  - `intersect`
  - `last_day`
  - `length`
  - `listagg`
  - `position`
  - `replace`
  - `right`
  - `safe_cast`
  - `split_part`
  - `string_literal`
  - `type_bigint`
  - `type_float`
  - `type_int`
  - `type_numeric`
  - `type_string`
  - `type_timestamp`
  - `array_append`
  - `array_concat`
  - `array_construct`
- For the `current_timestamp` and `current_timestamp_in_utc` macros, the dispatch AND the macro names have been updated to the below, respectively:
  - `dbt.current_timestamp_backcompat`
  - `dbt.current_timestamp_in_utc_backcompat`
- `dbt_utils.surrogate_key` has also been updated to `dbt_utils.generate_surrogate_key`. Since the methods for creating surrogate keys differ, we suggest all users do a `full-refresh` for the most accurate data. For more information, please refer to dbt-utils release notes for this update.
- Dependencies on `fivetran/fivetran_utils` have been upgraded, previously `[">=0.3.0", "<0.4.0"]` now `[">=0.4.0", "<0.5.0"]`.
- Updated README documentation for easier navigation and dbt package setup.
- Included the `mixpanel_[source_table_name]_identifier` variables so that the package models can refer to differently named source tables.
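
As a rough illustration of the identifier variables, the snippet below overrides the name of the source `event` table. It is a sketch under assumptions: `mixpanel_event_identifier` is simply the pattern above with `event` substituted for `[source_table_name]`, and `my_custom_event_table` is a hypothetical table name. Check the README for the exact variables your version supports.

```yml
# dbt_project.yml (root project)
vars:
  # Assumed variable name, derived from mixpanel_[source_table_name]_identifier.
  # "my_custom_event_table" is a hypothetical table name in your schema.
  mixpanel_event_identifier: "my_custom_event_table"
```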
🎉 LISTAGG fix 🎉
- Redshift and Postgres warehouses have a limit to the amount of aggregation that may take place within certain functions. The `mixpanel__sessions` model currently performs a LISTAGG and customers have identified the aggregation sometimes exceeds the limit of the function. Therefore, a conditional was added to check if the target type is Redshift or Postgres. If it is either, it will only perform the aggregation if the count is less than the amount defined by the `mixpanel__event_frequency_limit` (default 1000) variable. Otherwise, it will return 'Too many event types to render'. (#27)
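
If the default threshold does not suit your data, the limit could be overridden roughly as follows. This is a sketch that assumes the variable is set in the global `vars` block of your root `dbt_project.yml`; the value `500` is arbitrary.

```yml
# dbt_project.yml (root project)
vars:
  # Illustrative value only; the package default is 1000.
  # Applies when the target type is Redshift or Postgres.
  mixpanel__event_frequency_limit: 500
```

Sessions whose event count reaches the limit return 'Too many event types to render' instead of the aggregated list.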
🎉 dbt v1.0.0 Compatibility 🎉
- Adjusts the `require-dbt-version` to now be within the range `[">=1.0.0", "<2.0.0"]`. Additionally, the package has been updated for dbt v1.0.0 compatibility. If you are using a dbt version <1.0.0, you will need to upgrade in order to leverage the latest version of the package.
  - For help upgrading your package, I recommend reviewing this GitHub repo's Release Notes on what changes have been implemented since your last upgrade.
  - For help upgrading your dbt project to dbt v1.0.0, I recommend reviewing dbt-labs upgrading to 1.0.0 docs for more details on what changes must be made.
- Upgrades the package dependency to refer to the latest `dbt_fivetran_utils`. The latest `dbt_fivetran_utils` package also has a dependency on `dbt_utils` `[">=0.8.0", "<0.9.0"]`.
  - Please note, if you are installing a version of `dbt_utils` in your `packages.yml` that is not in the range above then you will encounter a package dependency error.
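
For projects that pin `dbt_utils` directly, a compatible entry might look like the sketch below. It assumes you install `dbt_utils` from the dbt package hub under `dbt-labs/dbt_utils`; if you do not list `dbt_utils` yourself, no change should be needed because it is resolved as a transitive dependency.

```yml
# packages.yml (root project)
packages:
  - package: dbt-labs/dbt_utils
    # Matches the range required by the dbt_fivetran_utils dependency noted above.
    version: [">=0.8.0", "<0.9.0"]
```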
Refer to the relevant release notes on the GitHub repository for specific details for the previous releases. Thank you!