Make or Buy Decision: Data Integration for Data Warehouse
In smaller BI teams it’s really important to take care of your core competencies. 💪 So is it really worth the time to code your own data integration for the data warehouse or will you pay for external tools? A classic make or buy decision, which is always custom and sometimes hard to answer.
Major Advantages of data integration tools:
In general I prefer ready-to-execute tools and services as a foundation of a modern BI technology stack, because this vendors are often
- highly specialized in data integration
- „close“ to the data sources (with partnerships etc.)
- so you have reliable data in your data warehouse
- from several sources
- after a few hours or days
- with tiny maintenance effort in the future.
So you will never need any custom code?
In real life, you often have a mix of data integration tools and own coded scripts because you will have a variety of data sources including smaller platforms and partners. Therefore I love Fivetran’s approach of integrating own coded scripts via Google Cloud Functions. In other words they combine the data integration service with the possibility to add custom scripts, that’s a perfect combination from my point of view 😍 (you can find more information in section „Additional Ressources“)
Example for custom Data Integration with Python Code
To give you an detailed impression of one own coded data integration, I’ve published one of my scripts for importing several MongoDB and MySQL databases to Google BigQuery. One cool thing of customization is, that you can sync so many fields, tables and databases in an automated way (with a configuration file) without 1000 tool clicks. I guess programmers love this „copy&paste“ opportunities instead of fancy UIs.
Example Online Tools for Data Integration
The market of tools to support ELT processes is developing and changing fast. There are some more, but find my list of example tools, where I’ve made positive experiences while working with this vendors:
- Build vs. Buy (Fivetran Blogpost):
- Introduction to Google Cloud Functions https://cloud.google.com/functions/
- Google Cloud Functions with Fivetran Integration:
- Erfahrungsbericht ETL vs. ELT (German):