Job runtime is a powerful metric. Datakin lets you study how runtime changes over time, or how delayed jobs affect the rest of your pipeline.
dbt is an amazing way to transform data within a data warehouse. So amazing, in fact, that it’s easy to end up doing tons and tons of transformations on all kinds of datasets. After a while, the result can become an unnavigable collection of overlapping tables. That’s a problem when it comes time to troubleshoot. If […]
Last week, Matt Turck and John Wu published their (mostly) annual report on the state of data, the 2021 Machine Learning, AI and Data (MAD) Landscape. We would like to share some observations of our own.
This initial release of Datakin includes features that will help you get a fresh perspective on your data pipelines, and quickly troubleshoot and repair any issues that might arise.
These foundations of communication and trust naturally created a culture that works for us, one where geography does not dictate opportunity. We feel empowered to act independently, making the best decisions for the company, and we check in with each other when we need support.
Often, datasets have built-in assumptions that aren’t quite so obvious looking from the outside in. Datakin includes data quality metrics in the lineage graph so you can see your pipeline health from a distance.
An instant demo of data lineage is worth a thousand words. Written by Ross Turk on August 10, 2021. They say that a picture is worth a thousand words. If you’ve ever tried to describe how all the jobs in your data pipeline are interrelated using just words, I am sure it wasn’t easy. […]
A real-time approach to data lineage. Written by Ross Turk on August 5, 2021. A data ecosystem that spans multiple pipelines, teams, and platforms can be overwhelming. Each dataset and job exists in a unique operational context, with interdependencies that may seem simple…until they multiply. Every tiny piece has something in common, though: when […]