Joining the Astronomer team

Written by Laurent Paris

March 22, 2022

Datakin is very pleased to announce that we have been acquired by Astronomer, the commercial developer of Apache Airflow.

This is both a beginning and an end for us. It is a happy conclusion to the story of Datakin, whose team is now a part of Astronomer, and a celebratory moment for all of us. For Julien and me, who were first-time founders, the move brings a feeling of achievement and a shared sense of excitement and urgency about a new beginning. We may be concluding the story of Datakin, but we are also writing the next chapter for lineage — and our team.

When we started Datakin two short years ago, we had an ambitious goal: to bring data lineage to the modern data stack, systematically reimagining each layer to take full advantage of its new context. We believe that lineage can lead to a new level of data sophistication — more complete awareness, greater agility, stronger governance — and that the time for lineage is now.

Our mission has not changed. Our new friends at Astronomer share it with us, and we will go further and faster together.

About Astronomer

The Airflow orchestration platform, first developed at Airbnb, has become ubiquitous. It’s a powerful open source orchestration system that makes it possible for anyone to run a sophisticated data pipeline. Once a company has reached a certain size or complexity, chances are good they’re using Airflow. And if they’re using Airflow, chances are great they’d benefit from knowing the folks at Astronomer.

Astronomer, founded in 2018, offers products and services that help customers get the most out of Airflow. The team has quickly become the driving force behind Airflow, increasingly delivering value to customers and demonstrating stewardship in the open source community. Its modern data orchestration platform, Astro, is in use across the industry. Trusted by organizations such as Credit Suisse, Condé Nast, Electronic Arts, and Rappi, Astronomer is a clear leader in the data orchestration space.

Why Astronomer?

The democratization of data orchestration, sparked in part by Airflow, has had a huge effect on organizations. As data engineering and analysis have become more and more distributed, they have brought increased speed and flexibility. But, as we have seen in other fields, democratization can also lead to fragmentation. When datasets are created in every department, black boxes start to proliferate. And when it comes to ensuring compliance, implementing governance, protecting quality, and improving efficiency, black boxes are not a good thing: they obscure understanding and threaten our newly won efficiency.

Fragmentation is inevitable, but it doesn’t have to lead to a world of data teams cut off from one another. With lineage, organizations can observe and contextualize disparate pipelines, stitching everything together into a single navigable map. They can permit (heck, encourage!) long-tail data innovation while maintaining the provenance and quality their businesses require. Lineage, we believe, is the difference between chaotic fragmentation and properly managed decentralization.

There are a lot of ways to collect lineage metadata, and there are a lot of products on the market. But most of them are designed primarily to collect it from the query logs of data warehouses. We believe the best place to gather lineage is in the data orchestration layer where datasets are made, not in the warehouse where they’re stored. Astronomer lives where datasets are made, right at lineage ground zero, and they have the perspective and experience to help us bring the power of lineage to the modern data stack.

What’s going to happen now?

Datakin’s innovative technology stack, including integrations with ecosystem tools such as dbt, Great Expectations, Apache Spark, and Apache Airflow, will not go away. 

Every single line of code we wrote has a future at Astronomer, including our contributions to the Marquez and OpenLineage open source projects. In fact, we will be able to dedicate more resources to this important work. Datakin’s active development will not end, either. We will continue to enhance our operational data lineage solution, though you may come to recognize it under a different name soon.

Finally, we will continue to support our users. We won’t shut down any Datakin customer infrastructure until we have developed a migration path, and we will continue to answer questions via Slack and our website chatbot.

In fact, we’re still accepting signups! It’s not too late to get your (now vintage!) Datakin instance.