September 16, 2024

What Is Snowflake Flatten?

A guide on using the FLATTEN function in Snowflake to handle nested data structures like JSON.
Dexter Chu
Head of Marketing

The FLATTEN function in Snowflake is designed to transform semi-structured data into a relational format. It is particularly useful when dealing with VARIANT, OBJECT, or ARRAY columns, converting hierarchical data structures into multiple rows for easier querying and analysis.

FLATTEN (
INPUT => <expr> [ , PATH => <constant_expr> ] [ , OUTER => TRUE | FALSE ] [ , RECURSIVE => TRUE | FALSE ] [ , MODE => 'OBJECT' | 'ARRAY' | 'BOTH' ]
)

This code snippet shows the syntax for the FLATTEN function. The INPUT parameter is required and must be of type VARIANT, OBJECT, or ARRAY. Optional parameters include PATH, OUTER, RECURSIVE, and MODE, which provide additional control over the flattening process.

How to Use the FLATTEN Function?

The FLATTEN function can be used in various scenarios depending on the structure of the semi-structured data and the desired output. Below are some examples demonstrating its use.

  • Simple Flattening: This example demonstrates a basic use of the FLATTEN function to transform an array into multiple rows. SELECT * FROM TABLE (FLATTEN(input => parse_json('[1, 2, 77]'))) f; The output will show each element of the array as a separate row.
  • Using the PATH Parameter: This example shows how to use the PATH parameter to specify a path within the VARIANT data to be flattened. SELECT * FROM TABLE (FLATTEN(input => parse_json('{"a":1, "b":[77,88]}'), path => 'b')) f; The output will show the elements of the array located at the specified path.
  • Using the OUTER Parameter: This example demonstrates the use of the OUTER parameter to include rows that cannot be expanded. SELECT * FROM TABLE (FLATTEN(input => parse_json('[]'), outer => true)) f; The output will generate one row with NULL values for zero-row expansions.
  • Using the RECURSIVE Parameter: This example shows how to use the RECURSIVE parameter to expand all sub-elements recursively. SELECT * FROM TABLE (FLATTEN(input => parse_json('{"a":1, "b":[77,88], "c": {"d":"X"}}'))) f; The output will show all sub-elements of the hierarchical structure.

Tutorial: How to Use the FLATTEN Function in Snowflake

1. Simple Flattening

This step demonstrates a basic use of the FLATTEN function to transform an array into multiple rows.

SELECT * FROM TABLE (FLATTEN(input => parse_json('[1, 2, 77]'))) f;

The output will show each element of the array as a separate row, making it easier to query and analyze.

2. Using the PATH Parameter

This step shows how to use the PATH parameter to specify a path within the VARIANT data to be flattened.

SELECT * FROM TABLE (FLATTEN(input => parse_json('{"a":1, "b":[77,88]}'), path => 'b')) f;

The output will show the elements of the array located at the specified path, allowing for more targeted data extraction.

3. Using the OUTER Parameter

This step demonstrates the use of the OUTER parameter to include rows that cannot be expanded.

SELECT * FROM TABLE (FLATTEN(input => parse_json('[]'), outer => true)) f;

The output will generate one row with NULL values for zero-row expansions, ensuring that no data is lost.

4. Using the RECURSIVE Parameter

This step shows how to use the RECURSIVE parameter to expand all sub-elements recursively.

SELECT * FROM TABLE (FLATTEN(input => parse_json('{"a":1, "b":[77,88], "c": {"d":"X"}}'))) f;

The output will show all sub-elements of the hierarchical structure, providing a comprehensive view of the data.

Common Challenges and Solutions

Some common challenges or errors you might encounter using FLATTEN include:

  • Handling NULL Values: When using the OUTER parameter, ensure that your queries can handle NULL values appropriately to avoid unexpected results.
  • Performance Issues: For large datasets, consider indexing and optimizing your queries to improve performance when using the FLATTEN function.
  • Complex Data Structures: When dealing with deeply nested structures, use the RECURSIVE parameter to ensure all sub-elements are expanded.

Recapping Snowflake FLATTEN Function

  • The FLATTEN function is essential for transforming semi-structured data into a relational format, making it easier to query and analyze.
  • Various parameters like PATH, OUTER, and RECURSIVE provide flexibility in how data is flattened, allowing for targeted and comprehensive data extraction.
  • Understanding and using the FLATTEN function can significantly enhance data analysis capabilities, especially when dealing with complex data structures.

Keep reading

View all