The Data Warrior

Changing the world, one data model at a time. How can I help you?

Archive for the tag “#TheDataCloud”

Mass Data Fragmentation: Reducing ‘Data Puddles’

Shortly before leaving Snowflake last year, I was interviewed for this post about one of the worst-case examples of data silos I had seen – we called them data puddles!

A few years ago, Kent Graziano joined a big organization to work on its data. The first problem was that nobody really knew what and where all the data was. Graziano took his first three months on the job investigating data sources and targets, ultimately creating an enterprise data map to illustrate all the flows. It wasn’t pretty.

“In the end, I discovered that the same data was being sent to three or four places,” he said. In one case raw data was transformed and stored in a data warehouse, then moved from there into another warehouse—which was also pulling in the original raw data.
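To make the idea of a data map concrete, here is a minimal sketch in Python that treats the flows as a directed graph and flags any target that receives the same source data by more than one route. The system names (`raw_orders`, `warehouse_a`, and so on) and the helper functions are hypothetical, invented for illustration; they are not taken from the actual map described in the interview.

```python
from collections import defaultdict

# Hypothetical flows as (source, target) pairs; all names are invented for illustration.
FLOWS = [
    ("raw_orders", "warehouse_a"),   # raw data transformed into a first warehouse
    ("warehouse_a", "warehouse_b"),  # then moved on to a second warehouse
    ("raw_orders", "warehouse_b"),   # which also pulls the original raw data
    ("raw_orders", "sales_mart"),
]


def build_graph(flows):
    """Turn (source, target) pairs into an adjacency list."""
    graph = defaultdict(list)
    for source, target in flows:
        graph[source].append(target)
    return graph


def count_paths(graph, start):
    """Count the distinct routes from start to each reachable system.

    Assumes the flows contain no cycles.
    """
    counts = defaultdict(int)

    def walk(node):
        for nxt in graph.get(node, []):
            counts[nxt] += 1
            walk(nxt)

    walk(start)
    return dict(counts)


def redundant_targets(flows, start):
    """Return the systems that receive data from start by more than one route."""
    graph = build_graph(flows)
    return sorted(t for t, n in count_paths(graph, start).items() if n > 1)


if __name__ == "__main__":
    print(count_paths(build_graph(FLOWS), "raw_orders"))
    print(redundant_targets(FLOWS, "raw_orders"))
```

In this toy map the raw data lands in three places, and `warehouse_b` is flagged because it gets the same data both directly and by way of `warehouse_a`, which mirrors the scenario in the quote above.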

Graziano, who recently retired from his post as Chief Technical Evangelist at Snowflake, said this scenario is entirely common: data scattered and copied in lakes, warehouses, data marts, SaaS platforms, spreadsheets, test systems, and more. That’s mass data fragmentation, or, more colloquially, data sprawl or data puddles.

Indeed, 75% of organizations do not have a complete architecture in place to manage an end-to-end set of data activities including integration, access, governance, and protection, according to IDC’s State of the CDO research, December 2021. This lack of governance combines with legacy systems, shadow IT, and good intentions to pave the road to a lot of fragmentation.

Check out the rest of the post to learn how data sprawl hurts businesses and what to do about it. Read it all here!

Try not to step into any of those puddles!

Kent

The Data Warrior

3 Key Resources for Data Vault on Snowflake

By now you surely know that you can build a Data Vault on Snowflake. In fact, we have many customers doing so today, so many that we formed a Snowflake Data Vault User Group.

Over the years I have had hundreds of calls and meetings with organizations around the world discussing this topic, from basic Data Vault 101 questions to best practices to who is doing Data Vault on Snowflake. Because of that, we developed a Data Vault Resource Kit that points you to all the key blog posts, videos, and customer stories on the topic (scroll down to see everything!). Be sure to bookmark that page. Most of your questions on this topic can be answered there.

To take it a step further and to a deeper level, I partnered up with Snowflake Field CTO Dmytro Yaroshenko (CDVP2) and wrote a post with reference architectures and discussions related to doing real-time feeds into a Data Vault 2.0 on Snowflake. Check that out here. This article even has code!

And, at long last, for those who want to jump in feet first and try it for themselves, the team built a Data Vault Quickstart, based on the above article and a hands-on lab from WWDVC 2021, that gives you a step-by-step guide and all the code to build and load a Data Vault 2.0 system, including an information mart on top of the Data Vault, all in your very own Snowflake account.

So, what is your excuse now? You have all the resources you need to give it a go!

And please, bookmark this post and/or the links above so you don’t lose them!

Model on!

Kent

The Data Warrior

Data Mesh Learning – Interview with The Data Warrior

Last week I had the privilege of being interviewed by Nick Heudecker (former Gartner analyst and current Senior Director at Cribl) for the Data Mesh Learning Community. In our interview, we covered the idea of empowering business domains to really own and manage their data via things like templates and a center of excellence, rather than just giving them the responsibility of owning their data and leaving them to figure the rest out on their own. We also discussed the need for organizations to focus on investing in growing a data culture, not just investing in the newest cloud-based tooling. Really, how do we lower the barriers to accessing, sharing, and leveraging data and get people to really think about data-as-a-product?

Like Agile before it, Data Mesh is as much about changing the way an organization thinks and works as it is about technology. I argue that the people and organization aspects of adopting a data mesh approach are more important than the technology aspects. Without the right approach, the best technology (like Snowflake) is not going to solve your organization’s data problems.

See the full interview here:

So what do you think about all this data mesh stuff?

Cheers!

Kent

The Data Warrior

P.S. For many more thoughts on #datamesh, check the other podcasts and videos listed on my Snowflake Resources page.

Snowflake Launches Unstructured Data Support in Public Preview

This is great news that many of us have been waiting for! Now we can have all our data in one place.

From Day 1, Snowflake has supported structured and semi-structured data. Snowflake has provided exceptional performance for those data types and has been a pioneer in processing them. Today, Snowflake is adding support for unstructured data to allow customers to deliver more use cases with a single platform. The support for unstructured data management includes built-in capabilities to store, access, process, manage, govern, and share unstructured data in Snowflake. Now you can get all the benefits of the Snowflake Data Cloud with performance, concurrency, and scale for unstructured data.

Read the entire blog for all the details on how you can use this feature: Unstructured Data Support in Public Preview

This really opens up a lot of new use cases and makes doing analytics on unstructured data much easier.

Enjoy y’all!

Kent

The Data Warrior

Snowflake Resources from The Data Warrior

Since you are on this blog, I assume that means you are “following” me in that techie, non-stalker sort of way. 😉

That being the case, you are aware of my tenure for the last 5+ years at the hottest data-focused software company in the world – Snowflake. And you know I have produced a lot of content (really – A LOT!). Everything from blogs to ebooks to videos to podcasts – in addition to my usual array of industry and Snowflake-sponsored webinars and talks.

But if, like me, you have consumed so much content in the last year or so, you have probably lost track of some of it, right?

Well, if you are trying to find something you saw me do (or tweet about doing) but just can’t remember what it was (or when or where), and you don’t want to sift through the hundreds of Google search results or the thousands of social media posts I have done, look no further!

I decided to make it easier for you (and me, frankly) by putting a pretty comprehensive list together on a permanent page here on my site.

I broke it up into videos/podcasts, thought leader articles, ebooks, and blogs, and included links to all of them so you can quickly get to the content you need when you need it.

So check it out and bookmark the page now, before you forget or lose track of this post too. 🙂

You’re welcome.

Kent

The Data Warrior
