A few weeks back, right after attending Snowflake Summit, I had the pleasure of chatting with Sanjeev Mohan on his podcast – It Depends. We spent an hour or so talking about trends in modern data management, data mesh, data vault, my now semi-retired life as an advisor to several Snowflake partners, and my love of martial arts.
Check out my latest thoughts about Data Vault and Data Mesh:
Over the last six years (basically since I joined Snowflake), I have witnessed a massive increase in the interest in, and implementation of, Data Vault 2.0. I have talked to literally hundreds of companies across the globe and across all industries about changing their approach to building an enterprise data platform. It was sort of mind-boggling how many folks wanted to speak to me about this. So why, after almost two decades of successful data vault implementations, have so many people “suddenly” got interested in Data Vault?
Well, a few reasons:
1. They are moving to the cloud (in this case, Snowflake) and figured it was time to look at their approach to data warehousing and data lakes.
2. What they had been doing for decades, on critical review, really was not working (i.e., lots of expensive re-engineering all the time) and definitely could not scale.
3. Things are changing so rapidly, they needed to find a way to be more agile.
Read the rest of the post here on the Data Rebels site to find out how Data Vault relates to Data Mesh – Data Vault and Data Mesh
As the world is coming out of its pandemic-induced isolation, it’s time we all got back to meetin’ and greetin’ our colleagues and peers in person. One of the things the last two years have taught us is that humans are social animals. We like being around other humans, and some variety is good for our mental and emotional health (not that we don’t all love the families that we have been locked up with during this time). Not only that, but we learn and communicate better face to face. So much of our communication is nonverbal and just can’t be seen behind a mask or a Zoom square.
Are you ready to get back out there, network with your peers, meet some Data Vault Masters, and work on your professional development around Data Vault? If you are, then it is time to secure your seat at the 2022 World Wide Data Vault Consortium (WWDVC). As in past years, it will be held near the end of May at the lovely, and family-friendly, Stoweflake Lodge in Stowe, Vermont.
By now you surely know that you can build a Data Vault on Snowflake. In fact, we have many customers doing so today. So many that we formed a Snowflake Data Vault User Group.
Over the years I have had hundreds of calls and meetings with organizations around the world discussing this topic, covering everything from basic Data Vault 101 questions to best practices to who is doing Data Vault on Snowflake. Because of that, we developed a Data Vault Resource Kit that points you to all the key blog posts, videos, and customer stories on the topic (scroll down to see everything!). Be sure to bookmark that page. Most of your questions on this topic can be answered there.
To take it a step further and to a deeper level, I partnered up with Snowflake Field CTO Dmytro Yaroshenko (CDVP2) and wrote a post with reference architectures and discussions related to doing real-time feeds into a Data Vault 2.0 on Snowflake. Check that out here. This article even has code!
And, at long last, for those who want to jump in feet first and try it for themselves, the team built a Data Vault Quickstart, based on the above article and a hands-on lab from WWDVC 2021, that gives you a step-by-step guide and all the code to build and load a Data Vault 2.0 system, including an information mart on top of the Data Vault, all in your very own Snowflake account.
So, what is your excuse now? You have all the resources you need to give it a go!
And please, bookmark this post and/or the links above so you don’t lose them!
In this day and age, with the ever-increasing availability and volume of data from many types of sources such as IoT, mobile devices, and weblogs, there is a growing need, and yes, demand, to go from batch load processes to streaming or “real-time” (RT) loading of data. Businesses are changing at an alarming rate and are becoming more competitive all the time. Those that can harness the value of their data faster to drive better business outcomes will be the ones to prevail.
One of the benefits of using the Data Vault 2.0 architecture is that it was designed from inception not only to accept data loaded using traditional batch mode (which was the prevailing mode in the early 2000s when Dan introduced Data Vault) but also to easily accept data loaded in real time or near real time (NRT). In the early 2000s, that was a nice-to-have aspect of the approach and meant the methodology was effectively future-proofed from that perspective. Still, few database systems at the time had the capacity to support that kind of requirement. Today, RT or at least NRT loading is becoming an almost mandatory requirement for modern data platforms. Granted, not all loads or use cases need to be NRT, but most forward-thinking organizations need to onboard data for analytics in an NRT manner.
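To make that a bit more concrete, here is a minimal sketch in Python (not Snowflake SQL, and not the code from the article or the Quickstart) of one reason the same load pattern can serve both batch and streaming feeds: a hub load is insert-only and keyed on a hash of the business key, so it does not matter whether records arrive a thousand at a time or one at a time. Everything here is an assumption for illustration: the `customer_id` column, the `hash_key` and `load_hub` helpers, the MD5 choice, the normalization rules, and the in-memory dictionary standing in for a hub table.

```python
import hashlib
from datetime import datetime, timezone


def hash_key(*business_keys: str) -> str:
    """Hash the normalized business key parts (trimmed, upper-cased, '||'-delimited)."""
    normalized = "||".join(k.strip().upper() for k in business_keys)
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


def load_hub(hub: dict, records, record_source: str) -> int:
    """Insert-only hub load: add only business keys not yet present.

    Returns the number of rows inserted. Existing rows are never updated,
    so re-running the same load is harmless.
    """
    inserted = 0
    for rec in records:
        hk = hash_key(rec["customer_id"])
        if hk not in hub:
            hub[hk] = {
                "customer_id": rec["customer_id"].strip().upper(),
                "load_ts": datetime.now(timezone.utc),
                "record_source": record_source,
            }
            inserted += 1
    return inserted


# Hypothetical hub table, represented as a dict keyed by hash key.
hub_customer = {}

# A traditional batch arrives all at once...
batch = [{"customer_id": "c-001"}, {"customer_id": "c-002"}]
batch_inserted = load_hub(hub_customer, batch, "crm_batch")

# ...while streamed events arrive one at a time through the same function.
stream = [{"customer_id": "C-002 "}, {"customer_id": "c-003"}]
stream_inserted = sum(load_hub(hub_customer, [event], "crm_stream") for event in stream)
```

In this sketch the streamed `"C-002 "` event normalizes to the same hash key as the batch-loaded `"c-002"`, so it is skipped rather than duplicated. On a real platform the dictionary would be a hub table and the membership check would be part of the insert statement, but the shape of the logic stays the same.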