You don’t need an orchestrator

20 April at 5:30pm UTC

It’s a great time to orchestrate data pipelines. The landscape of data orchestration tools is full of great software, most of it open-source. We are slowly moving from classic scheduling (aka cron) to real orchestration. Basic scheduling relies on crontab and similar programs: it is about listing when events occur. Orchestration goes a step further by creating triggers between tasks, so a task starts because the task it depends on has finished rather than because the clock says so. It’s much more organic. Most of these tools model a pipeline as a DAG (directed acyclic graph): a graph in which every link between nodes has a direction and no path loops back to where it started. The big advantage of this pattern is that it produces simple one-direction flows. It is no surprise that these orchestrators are trending in the data-science field: they fit perfectly with the one-way flow of data transformations.
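To make the DAG idea concrete, here is a minimal sketch using only Python's standard library (`graphlib`, available since Python 3.9). The task names and the `run_order` helper are hypothetical and chosen for illustration; this is not how any particular orchestrator is implemented, only a small demonstration of dependencies driving execution order.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
pipeline = {
    "extract": set(),
    "clean": {"extract"},
    "aggregate": {"clean"},
    "report": {"aggregate", "clean"},
}


def run_order(dag):
    """Return an execution order in which every task follows its dependencies.

    Raises graphlib.CycleError if the graph contains a cycle,
    i.e. if it is not actually a DAG.
    """
    return list(TopologicalSorter(dag).static_order())


if __name__ == "__main__":
    print(run_order(pipeline))

    # Making "extract" depend on "report" closes a loop, so no valid order exists.
    cyclic = dict(pipeline, extract={"report"})
    try:
        run_order(cyclic)
    except CycleError:
        print("cycle detected: not a DAG")
```

Nothing in this sketch mentions a time of day: the order comes entirely from the dependencies, which is the difference between triggering tasks and scheduling them.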