What is a Reverse ETL?
A reverse ETL is the process of extracting a set of data from one system, typically a data warehouse, and populating it in another, such as a CRM or another business tool. While this may sound intimidating, a reverse ETL is actually very straightforward.
To many outside of a data organization, a reverse ETL simply looks like taking data from a source and pushing it out, through exports or other delivery methods, in a form the end user can read.
A reverse ETL is the mirror image of an ETL. Instead of taking a dataset from a source system and loading it into a data warehouse, the goal is to take data that has already been cleaned and joined together in the warehouse and push it back out to the tools where teams do their work. This is useful when disparate pieces of information have been consolidated into a single table format and now need to reach the systems that act on them.
ETL vs. Reverse ETL
Understanding reverse ETL begins with understanding what ETL is and why it occurs. ETL, short for Extract-Transform-Load, is a design pattern used in building data warehouses and information processing systems. An ETL starts where the information is first recorded, whether it's an action that occurred or a field that has been filled out. That information is extracted, then transformed into a format that the data warehouse understands, and finally stored there.
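The three ETL steps can be sketched in a few lines of Python. This is a minimal illustration, not a production pipeline: the sample events, the `events` table, and the function names are all hypothetical, and an in-memory SQLite database stands in for a real data warehouse.

```python
import sqlite3

# Hypothetical raw events, as a source application might record them.
RAW_EVENTS = [
    {"email": " Ada@Example.com ", "action": "signup", "amount": "0"},
    {"email": "ada@example.com", "action": "purchase", "amount": "19.99"},
    {"email": "bob@example.com", "action": "purchase", "amount": "5.00"},
]


def extract():
    """Extract: read the records from the source (here, a list in memory)."""
    return list(RAW_EVENTS)


def transform(events):
    """Transform: normalize emails and convert amounts from text to numbers."""
    return [
        (e["email"].strip().lower(), e["action"], float(e["amount"]))
        for e in events
    ]


def load(conn, rows):
    """Load: store the cleaned rows in the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events (email TEXT, action TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(conn):
    """Run the three steps in order: extract, transform, load."""
    load(conn, transform(extract()))


if __name__ == "__main__":
    warehouse = sqlite3.connect(":memory:")  # stand-in for a real warehouse
    run_etl(warehouse)
    query = "SELECT email, COUNT(*) FROM events GROUP BY email ORDER BY email"
    for row in warehouse.execute(query):
        print(row)
```

Note that the transform step is what makes the two differently formatted `ada@example.com` records line up as one customer once they reach the warehouse.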
Reverse ETL occurs when information is taken from the data warehouse, translated into a language or format that the receiving system understands (like a CRM or database), and then loaded back into that system, ultimately updating or changing it.
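The same flow in the opposite direction might look like the sketch below. Everything here is an assumption made for illustration: `FakeCRM` and its `upsert_contact` method are hypothetical stand-ins for a real CRM's API, the `customer_metrics` table and the CRM field names are invented, and SQLite again plays the role of the warehouse.

```python
import sqlite3


class FakeCRM:
    """Hypothetical stand-in for a CRM; a real sync would call the vendor's API."""

    def __init__(self):
        self.contacts = {}

    def upsert_contact(self, email, fields):
        # Create the contact if it is missing, otherwise update its fields.
        self.contacts.setdefault(email, {}).update(fields)


def build_warehouse():
    """Create an in-memory warehouse holding already-modeled customer metrics."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer_metrics "
        "(email TEXT, lifetime_value REAL, orders INTEGER)"
    )
    conn.executemany(
        "INSERT INTO customer_metrics VALUES (?, ?, ?)",
        [("ada@example.com", 250.0, 5), ("bob@example.com", 40.0, 1)],
    )
    return conn


def extract(conn):
    """Extract: read the modeled rows out of the warehouse."""
    return conn.execute(
        "SELECT email, lifetime_value, orders FROM customer_metrics"
    ).fetchall()


def transform(rows):
    """Transform: map warehouse columns to the field names the CRM expects."""
    return [
        (
            email,
            {
                "LifetimeValue": ltv,
                "OrderCount": orders,
                "Segment": "vip" if ltv >= 100 else "standard",
            },
        )
        for email, ltv, orders in rows
    ]


def load(crm, records):
    """Load: write each record into the operational system."""
    for email, fields in records:
        crm.upsert_contact(email, fields)


def run_reverse_etl(conn, crm):
    load(crm, transform(extract(conn)))


if __name__ == "__main__":
    crm = FakeCRM()
    run_reverse_etl(build_warehouse(), crm)
    print(crm.contacts)
```

The load step uses an upsert rather than a plain insert because the contacts usually exist in the destination already, so a sync needs to update them instead of creating duplicates.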
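A reverse ETL is the opposite of an ETL: it moves data out of the warehouse rather than into it. Because that data goes on to drive day-to-day operations, a reverse ETL depends on the fitness, quality and readiness of the data in the warehouse.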
Why would you use Reverse ETL?
From Hightouch, a Reverse ETL tool:
"There are 3 primary use cases for Reverse ETL
- Operational analytics — feeding insights from analytics tools to business teams in their usual workflow so they can make more data-informed decisions
- Data automation — not all data problems are so glamorous. “Can I get a CSV to issue some invoices?”, your finance team asks. Reverse ETL poses a simple solution.
- Data infrastructure — with a growing number of source systems, Reverse ETL is emerging as a general-purpose pattern in software engineering."
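The finance team's CSV request in the second use case could be served by something as small as the following sketch. The `invoices` table, its columns, and the function name are hypothetical, and an in-memory SQLite database stands in for the warehouse. A dedicated reverse ETL tool would schedule and monitor this kind of export rather than run it by hand.

```python
import csv
import io
import sqlite3


def invoices_to_csv(conn):
    """Query the warehouse and return the unpaid invoices as CSV text."""
    cursor = conn.execute(
        "SELECT customer, amount_due_cents FROM invoices "
        "WHERE paid = 0 ORDER BY customer"
    )
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # The first item of each cursor.description entry is the column name.
    writer.writerow([column[0] for column in cursor.description])
    writer.writerows(cursor)
    return buffer.getvalue()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # stand-in for a real warehouse
    conn.execute(
        "CREATE TABLE invoices "
        "(customer TEXT, amount_due_cents INTEGER, paid INTEGER)"
    )
    conn.executemany(
        "INSERT INTO invoices VALUES (?, ?, ?)",
        [("Acme", 12000, 0), ("Zeta", 500, 1), ("Birch", 7550, 0)],
    )
    print(invoices_to_csv(conn))
```

Amounts are kept as integer cents here so the exported values do not pick up floating-point rounding on the way out.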
What's an example of Reverse ETL?
For example, if a business stores its customer table in a data warehouse such as BigQuery and then needs that data in an operational tool, like the CRM its sales team works in, it will need to perform a process where the data is first transformed from the warehouse's format into one the destination accepts. This is where reverse ETL comes in: it is the process that extracts the data from the warehouse, transforms it into the format supported by the destination, and loads it there.
Sometimes you want to use your data in different ways than your data warehouse was built for. You might want to take existing data out of the warehouse and move it into another system. This is where reverse ETL tools come in handy. They still extract, transform and load, only in the opposite direction, and they offer reusable data transformations that make your data more flexible and adaptable.