September 16, 2024

Mastering dbt Model Contracts: Definition, Types, & Enforcement

Dexter Chu
Head of Marketing

What Are dbt Model Contracts?

A dbt model contract is a set of model config properties that guarantees the shape of a model: its column names, its data types, and, where the data platform supports them, its constraints. The contract plays the same role for a model that a response schema plays for an API: it defines what downstream consumers can rely on, so that changes to the model's SQL do not silently break their data products.

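models:
  - name: my_model
    config:
      contract:
        enforced: true
    columns:
      - name: column1
        data_type: integer
        constraints:
          - type: not_null
          - type: unique

In the above code, a contract is defined and enforced for a model named 'my_model'. The 'contract: enforced: true' setting turns the contract on, 'data_type' declares that 'column1' must be an integer, and the constraints declare that 'column1' should not contain null values and should be unique. Whether each constraint is actually enforced depends on the data platform.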

  • Keyword: dbt Model Contracts - These are model config properties that enforce data types, columns, and constraints in a model.
  • Keyword: API Response - An analogy for a model's output: the column names and data types that consumers depend on, much as clients depend on the response schema of an API.
  • Keyword: Downstream Consumers - Entities or processes that use the output of a model as input for their operations.

What Types of Contracts Does dbt Support?

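dbt has a single kind of contract; what varies is where it can be applied. According to the dbt documentation, contracts are supported for SQL models (not Python models) that are materialized as a table, a view, or an incremental model. Views support column names and data types but not constraints, incremental models require 'on_schema_change' to be set to 'append_new_columns' or 'fail', and ephemeral models are not supported. Which constraints are supported and enforced also varies by data platform.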

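models:
  - name: my_model
    config:
      materialized: incremental
      on_schema_change: append_new_columns
      contract:
        enforced: true
    columns:
      - name: column1
        data_type: integer
        constraints:
          - type: not_null
          - type: unique

The code snippet above illustrates a contract on an incremental model. The 'materialized: incremental' line indicates that dbt will process only new or changed rows on each run, rather than rebuilding the entire table each time. The 'on_schema_change' setting is included because contracts on incremental models require it to be 'append_new_columns' or 'fail'.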

  • Keyword: SQL Models - These are dbt models that transform data using a SQL SELECT statement. Contracts apply to these models, not to Python models.
  • Keyword: Materializations - These are the strategies dbt uses to persist a model's results in the data platform, such as table, view, or incremental.
  • Keyword: Incremental Models - These are dbt models that process only new or changed rows on each run, rather than rebuilding the entire table each time.
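
As a minimal sketch of how a contract might look on a view, the example below declares column names and data types only, since the dbt documentation describes views as supporting those but not constraints. The model name 'my_view_model' and the column 'column2' are hypothetical and are not taken from the examples above.

```yaml
models:
  - name: my_view_model  # hypothetical model name
    config:
      materialized: view
      contract:
        enforced: true
    columns:
      # Every column the view returns is listed with its data type.
      - name: column1
        data_type: integer
      - name: column2  # hypothetical second column
        data_type: string
```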

How Does dbt Enforce Model Contracts?

When a model with an enforced contract is built, dbt first runs a "preflight" check to ensure that the model's query returns exactly the column names and data types declared in the contract. It then includes the column names, data types, and constraints in the DDL statements sent to the data platform, which enforces the constraints while the model's table is built or updated.

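models:
  - name: my_model
    config:
      contract:
        enforced: true
    columns:
      - name: column1
        data_type: integer
        constraints:
          - type: not_null
          - type: unique

The code above shows a dbt model contract. When this model is built, dbt first checks that the query returns 'column1' as an integer. The 'not_null' and 'unique' constraints are then included in the table's DDL, and on a platform that enforces them, the build will fail if 'column1' contains null or duplicate values.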

  • Keyword: Preflight Check - A preliminary check performed by dbt to ensure the model's query returns the correct column names and data types.
  • Keyword: DDL Statements - Data Definition Language statements are used to define or alter database structures.
  • Keyword: Constraints - Rules such as not_null, unique, primary_key, or check that are declared on a model's columns and applied by the data platform.
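
The difference between a constraint and a test can be sketched as follows. This example assumes a platform that enforces 'not_null' at build time but treats 'unique' as informational only; in that case a 'unique' test, which runs after the model is built, is one way to still catch duplicates. Which constraints a given platform enforces is an assumption here and should be checked against the adapter's documentation.

```yaml
models:
  - name: my_model
    config:
      contract:
        enforced: true
    columns:
      - name: column1
        data_type: integer
        # Constraint: included in the DDL and checked by the platform
        # while the table is being built.
        constraints:
          - type: not_null
        # Test: a query that dbt runs after the model has been built.
        tests:
          - unique
```

With this split, a null value would stop the build itself, while a duplicate value would surface afterwards as a failing test.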

What Happens When a Contract Does Not Match the Model?

If the column list or data types in a model do not match the contract, the build will fail and highlight the discrepancy. This ensures that the integrity of the data is maintained and any issues are identified early in the process.

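models:
  - name: my_model
    config:
      contract:
        enforced: true
    columns:
      - name: column1
        data_type: integer

In the code above, the contract states that 'my_model' returns a single column, 'column1', of type integer. If the model's SQL returns 'column1' as a string, omits it, or returns an additional column that is not listed, the build will fail, and dbt will report which columns or data types do not match.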

  • Keyword: Column List - The list of columns specified in a dbt model contract.
  • Keyword: Data Types - The type of data, such as integer, string, boolean, etc., that a column in a model can hold.
  • Keyword: Build Failure - An event where the process of building a model is unsuccessful due to discrepancies between the model and its contract.
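
To make the kinds of mismatch concrete, here is a small sketch of a two-column contract with comments describing queries that would be expected to fail the preflight check. The model name 'orders' and its columns are hypothetical, chosen only for illustration.

```yaml
models:
  - name: orders  # hypothetical model name
    config:
      contract:
        enforced: true
    columns:
      - name: order_id
        data_type: integer
      - name: amount
        data_type: numeric

# The build would be expected to fail if the model's SQL, for example:
#   - returned amount as a string instead of a numeric type
#   - omitted order_id
#   - returned an extra column, such as status, that is not listed above
```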

What Does a Model Contract Include?

A model contract includes, for every column in the model, the column name and the column data type, and optionally constraints such as a rule for no null values or other validation checks. These contracts are defined in YAML and are available in dbt metadata. They can be defined by the same people who write the model's SQL or by different people.

models:
  - name: my_model
    config:
      contract:
        enforced: true
    columns:
      - name: column1
        data_type: integer
        constraints:
          - type: not_null
          - type: unique

The code snippet above defines a model contract for 'my_model'. It specifies that 'column1' should be of integer data type, should not contain any null values, and should be unique.

  • Keyword: Column Name - The name of a column in a dbt model.
  • Keyword: No Null Values - A rule in a dbt model contract that specifies a column should not contain any null values.
  • Keyword: Additional Constraints - Additional rules or checks that can be included in a dbt model contract to ensure data integrity.
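
Beyond 'not_null', a contract can carry other constraint types. The sketch below shows a model-level primary key and a column-level check constraint. The model name 'dim_customers' and its columns are hypothetical, and whether the platform enforces each constraint or merely records it is an assumption to verify for your adapter.

```yaml
models:
  - name: dim_customers  # hypothetical model name
    config:
      contract:
        enforced: true
    # Model-level constraint that names the column(s) it applies to.
    constraints:
      - type: primary_key
        columns: [customer_id]
    columns:
      - name: customer_id
        data_type: integer
        constraints:
          - type: not_null
      - name: lifetime_orders  # hypothetical column
        data_type: integer
        constraints:
          # The expression is passed through to the platform's DDL.
          - type: check
            expression: "lifetime_orders >= 0"
```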
