Make or Buy Decision: Data Integration for Data Warehouse

In smaller BI teams it’s really important to take care of your core competencies. ūüí™ So is it really worth the time to code your own data integration for the data warehouse or will you pay for external tools? A classic make or buy decision, which is always custom and sometimes hard to answer.

Make or Buy Decision: Data Integration for Data Warehouse

Major Advantages of data integration tools:

In general I prefer ready-to-execute tools and services as a foundation of a modern BI technology stack, because this vendors are often

  • highly specialized in data integration
  • „close“ to the data sources (with partnerships etc.)
  • so you have reliable data in your data warehouse
  • from several sources
  • after a few hours or days
  • with tiny maintenance effort in the future.

So you will never need any custom code?

In real life, you often have a mix of data integration tools and own coded scripts because you will have a variety of data sources including smaller platforms and partners. Therefore I love Fivetran’s approach of integrating own coded scripts via Google Cloud Functions. In other words they combine the data integration service with the possibility to add custom scripts, that’s a perfect combination from my point of view ūüėć (you can find more information in section „Additional Ressources“)

Example for custom Data Integration with Python Code

To give you an detailed impression of one own coded data integration, I’ve published one of my scripts for importing several MongoDB and MySQL databases to Google BigQuery. One cool thing of customization is, that you can sync so many fields, tables and databases in an automated way (with a configuration file) without 1000 tool clicks. I guess programmers love this „copy&paste“ opportunities instead of fancy UIs.

https://github.com/mbosold/mysql_and_mongodb_importer_for_big_query

Example configuration of selected database tables for data integration load job
Example of configuration file for selecting the database tables for data load job

Example Online Tools for Data Integration

The market of tools to support ELT processes is developing and changing fast. There are some more, but find my list of example tools, where I’ve made positive experiences while working with this vendors:

Additional Ressources