Get started with Secoda
See why hundreds of industry leaders trust Secoda to unlock their data's full potential.
See why hundreds of industry leaders trust Secoda to unlock their data's full potential.
The freshness of our data gives us an edge in making informed decisions. And since every second counts in the world of data, it's crucial to keep our data up-to-date. In this tutorial, we'll be learning how to use the dbt source freshness command to ensure our source tables are updated within a specified time frame.
dbt source freshness is a command in the data build tool (dbt) that checks if source tables have been updated within a specified time frame. It's a subcommand of the dbt source command, which provides subcommands for working with source data. The command compares the timestamp column in your source to a time cadence you set, such as a minute, hour, or day. If the data is older than the time the test is run minus the cadence, the test will either fail or warn you.
To use the dbt source freshness command effectively, follow these steps:
First, you need to define your sources in a YAML file. This file should include the source tables you want to check for freshness. For example:
sources:
- name: my_source
tables:
- name: my_table
loaded_at_field: _etl_loaded_at
In this example, we've defined a source named "my_source" with a table named "my_table" and specified the loaded_at_field as "_etl_loaded_at".
Next, you need to set the freshness criteria for your source tables. This is done by defining a "freshness" block in your sources YAML file. For example:
sources:
- name: my_source
tables:
- name: my_table
loaded_at_field: _etl_loaded_at
freshness:
warn_after: {count: 3, period: day}
error_after: {count: 5, period: day}
In this example, we've set a warning threshold of 3 days and an error threshold of 5 days. If the data is older than 3 days, a warning will be issued, and if it's older than 5 days, an error will be raised.
Once you've defined your sources and set the freshness criteria, you can run the dbt source freshness command:
dbt source freshness
This command will compare the timestamp column in your source to the time cadence you set and either pass, warn, or fail based on the freshness criteria.
While using dbt source freshness, you may encounter some common challenges:
To make the most of dbt source freshness, follow these best practices: