These foundations of communication and trust naturally created a culture that works for us, one where geography does not dictate opportunity. We feel empowered to act independently, making the best decisions for the company, and we check in with each other when we need support.

Datasets often carry built-in assumptions that aren’t obvious from the outside. Datakin includes data quality metrics in the lineage graph so you can see your pipeline health from a distance.

Blog: A real-time approach to data lineage
Written by Ross Turk on August 5, 2021
A data ecosystem that spans multiple pipelines, teams, and platforms can be overwhelming. Each dataset and job exists in a unique operational context, with interdependencies that may seem simple…until they multiply. Every tiny piece has something in common, though: when […]

Blog: How I learned to stop worrying and love lineage
Written by Laurent Paris on July 13, 2021
You’re a data engineer, and you’re dreading the coming week. This is the “week of hell”, the one when it’s your turn to be on-call. You will be responsible for any issues that might happen to the […]

Blog: Datakin at Berlin Buzzwords 2021
Written by Amanda Bulger on June 30, 2021
Catch Datakin CTO Julien Le Dem speaking at Berlin Buzzwords on why you need a healthy data ecosystem before you can get value from your data. Berlin Buzzwords focuses on open source software projects in the field […]

Blog: What is data lineage (and why should I care)?
Written by Ross Turk on June 22, 2021
Any real-world data architecture is made up primarily of madness and chaos. Your most cared-for data pipeline, the one that you spend a lot of time keeping neat, the one that moves your most important data, […]

Blog: Watch: Datakin @ the Data & AI Summit 2021
Written by Ross Turk on June 3, 2021
Last week our CTO, Julien Le Dem, took the virtual stage at the 2021 Data & AI Summit to discuss data lineage with OpenLineage and Apache Spark. If you missed it, fear not! The video has now […]

Blog: Data Pipeline Diffing with Datakin
Written by Peter Hicks on June 3, 2021
Figure 1: Datakin enables pipeline diffing across runs
Envision yourself with a suddenly failing ETL task that you haven’t touched in months. You look at the code and nobody else has touched it, and nothing comes to mind about how this […]