The Data Warrior

Changing the world, one data model at a time. How can I help you?

Archive for the tag “#datavault”

Data Vault and Data Mesh – A Match Made in the Cloud?

Check out my latest thoughts about Data Vault and Data Mesh:

Over the last six years (basically since I joined Snowflake), I have witnessed a massive increase in interest in, and implementation of, Data Vault 2.0. I have talked to literally hundreds of companies across the globe and across all industries about changing their approach to building an enterprise data platform. It was sort of mind-boggling how many folks wanted to speak to me about this. So why, after almost two decades of successful Data Vault implementations, have so many people “suddenly” become interested in Data Vault?

Well, a few reasons:

1. They are moving to the cloud (in this case, Snowflake) and figured it was time to look at their approach to data warehousing and data lakes.

2. What they had been doing for decades, on critical review, really was not working (i.e., lots of expensive re-engineering all the time) and definitely could not scale.

3. Things are changing so rapidly that they needed to find a way to be more agile.

Read the rest of the post here on the Data Rebels site to find out how Data Vault relates to Data Mesh – Data Vault and Data Mesh

What do you think?

Have a great week!

Kent

The Data Warrior

WWDVC 2022 is almost here – Are you ready to “up” your Data Vault game?

Vermont is beautiful in May!

As the world is coming out of its pandemic-induced isolation, it’s time we all got back to meetin’ and greetin’ our colleagues and peers in person. One of the things the last two years have taught us is that humans are social animals. We like being around other humans, and some variety is good for our mental and emotional health (not that we don’t all love the families that we have been locked up with during this time). Not only that, but we learn and communicate better face to face. So much of our communication is nonverbal and just can’t be seen behind a mask or a Zoom square.

Are you ready to get back out there, network with your peers, meet some Data Vault Masters, and work on your professional development around Data Vault? If you are, then it is time to secure your seat at the 2022 World Wide Data Vault Consortium (WWDVC). As in past years, it will be held near the end of May at the lovely and family-friendly Stoweflake Lodge in Stowe, Vermont.

Check out all the details about WWDVC 2022, including hands-on labs (HOLs), keynotes, and session topics, in this post by Cindi at DataRebels: WWDVC 2022 is almost here

If you are already planning to go, you can just register here.

Keep on modeling!

Kent

The Data Warrior

3 Key Resources for Data Vault on Snowflake

By now you surely know that you can build a Data Vault on Snowflake. In fact, we have many customers doing so today. So much so that we formed a Snowflake Data Vault User Group.

Over the years I have had hundreds of calls and meetings with organizations around the world discussing this topic, from basic Data Vault 101-type questions to best practices to who is doing Data Vault on Snowflake. Because of that, we developed a Data Vault Resource Kit that points you to all the key blog posts, videos, and customer stories on the topic (scroll down to see everything!). Be sure to bookmark that page. Most of your questions on this topic can be answered there.

To take it a step further and to a deeper level, I partnered up with Snowflake Field CTO Dmytro Yarashneko (CDVP2) and wrote a post with reference architectures and discussions related to doing real-time feeds into a Data Vault 2.0 on Snowflake. Check that out here. This article even has code!

And, at long last, for those who want to jump in feet first and try it for themselves, the team built a Data Vault Quickstart, based on the above article and a hands-on lab from WWDVC 2021, that gives you a step-by-step guide and all the code to build and load a Data Vault 2.0 system, including an information mart on top of the Data Vault, all in your very own Snowflake account.

So, what is your excuse now? You have all the resources you need to give it a go!

And please, bookmark this post and/or the links above so you don’t lose them!

Model on!

Kent

The Data Warrior

Building a Real-time Data Vault in Snowflake?

Yes, you can! The #DataCloud loves #DataVault!

In this day and age, with the ever-increasing availability and volume of data from many types of sources such as IoT, mobile devices, and weblogs, there is a growing need, and yes, demand, to go from batch load processes to streaming or “real-time” (RT) loading of data. Businesses are changing at an alarming rate and are becoming more competitive all the time. Those that can harness the value of their data faster to drive better business outcomes will be the ones to prevail.

One of the benefits of using the Data Vault 2.0 architecture is that it was designed from inception not only to accept data loaded using traditional batch mode (which was the prevailing mode in the early 2000s when Dan introduced Data Vault) but also to easily accept data loaded in real time or near real time (NRT). In the early 2000s, that was a nice-to-have aspect of the approach and meant the methodology was effectively future-proofed from that perspective. Still, few database systems at the time had the capacity to support that kind of requirement. Today, RT or at least NRT loading is becoming an almost mandatory requirement for modern data platforms. Granted, not all loads or use cases need to be NRT, but most forward-thinking organizations need to onboard data for analytics in an NRT manner.
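One way to see why the same model can take both batch and NRT feeds is that hub loads are insert-only: a business key is added only if it is not already there, so it does not matter whether records arrive as one large batch or one at a time. Here is a minimal, in-memory sketch of that idea in Python. It is an illustration under stated assumptions, not the code from the full post: the `load_hub` function, the `customer_id` field, and the record source names are all hypothetical.

```python
from datetime import datetime, timezone


def load_hub(hub: dict, records: list[dict], record_source: str) -> int:
    """Insert business keys not yet in the hub; return how many were added.

    `hub` is a plain dict standing in for a hub table (hypothetical).
    """
    added = 0
    for record in records:
        # Normalize the business key so " c1 " and "C1" are the same key.
        key = record["customer_id"].strip().upper()
        if key not in hub:  # insert-only: existing keys are never updated
            hub[key] = {
                "load_dts": datetime.now(timezone.utc),
                "record_source": record_source,
            }
            added += 1
    return added


events = [{"customer_id": "c1"}, {"customer_id": "c2"}, {"customer_id": "c1"}]

# Batch style: hand over all records in one call.
batch_hub: dict = {}
load_hub(batch_hub, events, "crm_batch")

# Streaming style: hand over one record per call, as each event arrives.
stream_hub: dict = {}
for event in events:
    load_hub(stream_hub, [event], "crm_stream")

# Both styles end with the same set of business keys in the hub.
print(sorted(batch_hub) == sorted(stream_hub))
```

Because the load only ever inserts unseen keys, replaying the same events is harmless, which is what makes the pattern friendly to both scheduled batches and continuous feeds.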

See all the details (and some code) in the full post over on Data Vault Alliance.

Happy Vaulting!

Kent

The Data Warrior

Data Vault 2.0 Automation with erwin and Snowflake

I am seeing a HUGE uptick in interest in Data Vault around the globe. Part of the interest is the need for agility in building a modern data platform. One of the benefits of the Data Vault 2.0 method is the repeatable patterns, which lend themselves to automation. I am pleased to pass on this great new post with details on how to automate building your Data Vault 2.0 architecture on Snowflake using erwin! Thanks to my buddy John Carter at erwin for taking this project on.

The Data Vault methodology can be applied to almost any data store and populated by almost any ETL or ELT data integration tool. As Snowflake Chief Technical Evangelist Kent Graziano mentions in one of his many blog posts, “DV (Data Vault) was developed specifically to address agility, flexibility, and scalability issues found in the other mainstream data modeling approaches used in the data warehousing space.” In other words, it enables you to build a scalable data warehouse that can incorporate disparate data sources over time. Traditional data warehousing typically requires refactoring to integrate new sources, but when implemented correctly, Data Vault 2.0 requires no refactoring.

Successfully implementing a Data Vault solution requires skilled resources and traditionally entails a lot of manual effort to define the Data Vault pipeline and create ETL (or ELT) code from scratch. The entire process can take months or even years, and it is often riddled with errors, slowing down the data pipeline. Automating design changes and the code to process data movement ensures organizations can accelerate development and deployment in a timely and cost-effective manner, speeding the time to value of the data.
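To make the "repeatable patterns" point concrete, here is a small Python sketch of metadata-driven generation: every hub has the same shape, so its DDL can be rendered from a short description instead of being written by hand. This is only an illustration and does not represent what erwin generates. The `hub_ddl` and `hash_key` helpers, the column naming (`hub_<entity>_hk`, `load_dts`, `record_source`), and the choice of MD5 over trimmed, upper-cased keys are all assumptions made for the example.

```python
import hashlib


def hash_key(*business_keys: str) -> str:
    """Hash normalized business keys (trimmed, upper-cased, '||'-delimited)."""
    normalized = "||".join(key.strip().upper() for key in business_keys)
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


def hub_ddl(entity: str, business_keys: list[str]) -> str:
    """Render CREATE TABLE text for a hub from a small metadata description."""
    name = entity.lower()
    columns = [f"hub_{name}_hk CHAR(32) NOT NULL"]  # hash key column
    columns += [f"{key.lower()} VARCHAR NOT NULL" for key in business_keys]
    columns += [
        "load_dts TIMESTAMP NOT NULL",      # when the key was first loaded
        "record_source VARCHAR NOT NULL",   # where the key came from
    ]
    body = ",\n  ".join(columns)
    return (
        f"CREATE TABLE hub_{name} (\n  {body},\n"
        f"  PRIMARY KEY (hub_{name}_hk)\n);"
    )


# The same two functions serve every hub in the model.
for entity, keys in {"customer": ["customer_id"], "order": ["order_number"]}.items():
    print(hub_ddl(entity, keys))
```

Adding a new source then means adding a metadata entry rather than hand-writing another table definition, which is the kind of repetition automation tools take over.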

Snowflake’s Data Cloud contains all the necessary components for building, populating, and managing Data Vault 2.0 solutions. erwin’s toolset models, maps, and automates the creation, population, and maintenance of Data Vault solutions on Snowflake. The combination of Snowflake and erwin provides an end-to-end solution for a governed Data Vault with powerful performance.

Get the rest of the details here: Data Vault Automation with erwin and Snowflake

Vault away my friends!

Kent

The Data Warrior
