4 Comments

Have you considered Kafka-style queues/events? A DAG can raise events at any time (“part 1 is complete!”) and downstream DAGs can be triggered by those events. Critically, to ensure decoupling, you don’t want downstream DAGs checking conditions, because then they have to know things about the upstream DAG. Instead, have the upstream DAG issue the event to a queue (“I’ve made 1000 new rows!”) and let the downstream one respond as a subscriber to that message.
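
To make the idea concrete, here is a minimal sketch of the publish/subscribe pattern described above. It is not Kafka: `EventBus` is a hypothetical in-memory stand-in for a broker, and the topic name `rows_created`, the event fields, and both DAG functions are assumptions made up for illustration.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, Dict, List

Event = Dict[str, Any]
Handler = Callable[[Event], None]


class EventBus:
    """Toy in-memory stand-in for a message broker (hypothetical, illustration only)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        # Register a handler to be called for every event on this topic.
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: Event) -> None:
        # The publisher does not know who, if anyone, is listening.
        for handler in list(self._subscribers.get(topic, [])):
            handler(event)


def run_upstream_dag(bus: EventBus) -> None:
    # Hypothetical upstream job: do the work, then announce what happened.
    new_rows = 1000
    bus.publish("rows_created", {"source": "upstream_dag", "new_rows": new_rows})


# Records what the downstream job saw, so the effect is observable.
processed: List[int] = []


def run_downstream_dag(event: Event) -> None:
    # Hypothetical downstream job: reacts to the message only,
    # without inspecting the upstream DAG's state.
    processed.append(event["new_rows"])


bus = EventBus()
bus.subscribe("rows_created", run_downstream_dag)
run_upstream_dag(bus)
```

The upstream function only knows the topic name and the shape of the event, while the downstream function only knows the event it receives. With a real broker the handlers would run in a separate consumer process rather than inline, but the decoupling is the same.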

Hey Mike, that's a great suggestion, and definitely an improvement over the polling approach. I must admit, Kafka isn't my strong suit (I've mostly been doing batch processing), so I'd be keen to brainstorm the approach with you and implement something in my spare time.

I love your writing; thanks for sharing! It was very insightful and had lots of good images. I wrote about similar things, looking at trends in orchestration with a declarative approach. In case you're interested, here is the link: https://airbyte.com/blog/data-orchestration-trends.

Great post Simon! You've taken the time to experiment with different tools and summarised the industry trends towards declarative-style programming for data orchestration tools. I will be saving this one for sure and referring to it from time to time.
