A few weeks back, right after attending Snowflake Summit, I had the pleasure of chatting with Sanjeev Mohan on his podcast – It Depends. We spent an hour or so discussing trends in modern data management, data mesh, data vault, my now semi-retired life as an advisor to several Snowflake partners, and my love of martial arts.
Check out my latest thoughts about Data Vault and Data Mesh:
Over the last six years (after I joined Snowflake basically), I have witnessed a massive increase in the interest and implementation of Data Vault 2.0. I have talked to literally hundreds of companies across the globe and across all industries about changing their approach to building an enterprise data platform. It was sort of mind-boggling how many folks wanted to speak to me about this. So why, after almost two decades of successful data vault implementations, have so many people “suddenly” got interested in Data Vault?
Well, a few reasons:
1. They are moving to the cloud (in this case, Snowflake) and figured it was time to look at their approach to data warehousing and data lakes.
2. What they had been doing for decades, on critical review, really was not working (i.e., lots of expensive re-engineering all the time) and definitely could not scale.
3. Things are changing so rapidly, they needed to find a way to be more agile.
Read the rest of the post here on the Data Rebels site to find out how Data Vault relates to Data Mesh – Data Vault and Data Mesh
Back in the saddle again for The Data Warrior! Here is a piece I just did for the folks at Wherescape about one of my favorite topics – Automation!
The world of data has changed for sure. Especially over the past several years. In fact, the pandemic accelerated some changes, like the migration to cloud-based data platforms. When everyone needed to be remote, it just made sense to move to the cloud and use a service for your data platform.
Along with that came more data, more data types, and an actual business need to move faster. Companies had to adapt very quickly during the pandemic if they wanted to survive. Many did and thrived while others, well, not so much.
As the demand for data continues to grow at unprecedented rates, and as it becomes a non-negotiable asset for organizational success, the requirement to rapidly deliver value from that data (i.e., turn it into information for data-driven decision making) has become an imperative.
So how do we deliver value faster with our data warehouses, data meshes, and enterprise data hubs? Automate, automate, automate.
Shortly before leaving Snowflake last year, I was interviewed for this post about one of the worst-case examples of data silos I had seen – we called them data puddles!
A few years ago, Kent Graziano joined a big organization to work on its data. The first problem was that nobody really knew what and where all the data was. Graziano took his first three months on the job investigating data sources and targets, ultimately creating an enterprise data map to illustrate all the flows. It wasn’t pretty.
“In the end, I discovered that the same data was being sent to three or four places,” he said. In one case raw data was transformed and stored in a data warehouse, then moved from there into another warehouse—which was also pulling in the original raw data.
Graziano, who recently retired from his post as Chief Technical Evangelist at Snowflake, said this scenario is entirely common. Data scattered and copied in lakes, warehouses, data marts, SaaS platforms, spreadsheets, test systems, and more. That’s mass data fragmentation, or, more colloquially, data sprawl or data puddles.
Indeed, 75% of organizations do not have a complete architecture in place to manage an end-to-end set of data activities including integration, access, governance, and protection, according to IDC’s State of the CDO research, December 2021. This lack of governance combines with legacy systems, shadow IT, and good intentions to pave the road to a lot of fragmentation.
Check out the rest of the post to learn how data sprawl hurts businesses and what to do about it. Read it all here!
What a great event! So many announcements and great demos, plus an awesome live Q&A with our Snowflake leaders.
At Snowday 2021, Snowflake announced exciting new product capabilities that expand what is possible in the Data Cloud. In addition to announcing Python support in Snowpark (currently in private preview), these latest innovations make it easier for organizations to maintain business continuity across clouds and regions; help data engineers and data scientists build pipelines, ML workflows, and data applications faster; and remove the complexity of getting the right data into the hands of customers.
The Snowflake Data Cloud is a global network connecting organizations through data, creating new opportunities for collaboration to improve business outcomes, and fundamentally changing what is possible across industries. For Kraft Heinz, its data science teams are able to build and test models dramatically faster in Snowflake compared with its prior data lake. For NBCUniversal, it’s building brand-new advertising targeting and measurement products, in a secure and privacy-compliant way using Snowflake’s governance and data sharing capabilities. And for 84.51°, it’s built a Collaborative Cloud that takes complexity off the table and unlocks new possibilities for grocers and CPGs sharing and collaborating on data.
Snowflake continues to expand the scope and possibilities of the Data Cloud, delivering unique innovations that enable customers to: