In this post, I discuss how to engineer your Data Vault (DV) load in the Snowflake Cloud Data Platform for maximum speed.
Because Snowflake separates compute from storage and allows the definition of multiple independent compute clusters, it provides some truly unique opportunities to configure virtual warehouses to support optimal throughput of DV loads.
Along with using larger “T-shirt size” warehouses to increase throughput, using multi-cluster warehouses during data loading increases concurrency for even faster loads at scale.
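To make the two knobs above concrete, here is a minimal sketch in Python that only builds the SQL text for a multi-cluster warehouse. The warehouse name DV_LOAD_WH, the default size, and the cluster counts are illustrative assumptions rather than recommendations, and multi-cluster warehouses require Snowflake Enterprise Edition or higher. The sketch does not connect to Snowflake; you would run the generated statement yourself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WarehouseSpec:
    """Hypothetical settings for a warehouse dedicated to DV loads."""
    name: str
    size: str = "LARGE"            # the "T-shirt size": more compute per cluster
    min_clusters: int = 1          # clusters kept running at a minimum
    max_clusters: int = 4          # upper bound when many loads run concurrently
    auto_suspend_seconds: int = 60


def create_warehouse_sql(spec: WarehouseSpec) -> str:
    """Return a CREATE WAREHOUSE statement for the given spec."""
    if not 1 <= spec.min_clusters <= spec.max_clusters:
        raise ValueError("need 1 <= min_clusters <= max_clusters")
    return (
        f"CREATE WAREHOUSE IF NOT EXISTS {spec.name} WITH\n"
        f"  WAREHOUSE_SIZE = '{spec.size}'\n"
        f"  MIN_CLUSTER_COUNT = {spec.min_clusters}\n"
        f"  MAX_CLUSTER_COUNT = {spec.max_clusters}\n"
        f"  SCALING_POLICY = 'STANDARD'\n"
        f"  AUTO_SUSPEND = {spec.auto_suspend_seconds}\n"
        f"  AUTO_RESUME = TRUE;"
    )


if __name__ == "__main__":
    # DV_LOAD_WH is a made-up name used only for this illustration.
    print(create_warehouse_sql(WarehouseSpec(name="DV_LOAD_WH")))
```

The size controls how much compute each cluster has, while the cluster count range controls how many concurrent load statements can run without queuing, which is why the two settings are tuned separately.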
Data Vault is an architectural approach that includes a specific data model design pattern and methodology developed specifically to support a modern, agile approach to building an enterprise data warehouse and analytics repository.
Typical Data Vault Design with Hubs, Sats, and a Link
Snowflake Cloud Data Platform was built to be design pattern agnostic. That means you can use it with equal efficiency for 3NF models, dimensional (star) schemas, DV, or any hybrid you might have. Snowflake supports DV designs and handles several DV design variations very well, with excellent performance.
This series of blog posts will present some tips and recommendations that have evolved over the last few years for implementing a DV-style warehouse in Snowflake.
A few months back I had the privilege of being interviewed by Tobias Macey on his Data Engineering Podcast show. This came about because Tobias actually Tweeted at me about wanting to do the interview! In this episode we spent an hour discussing the ins and outs of the Snowflake Cloud Data Platform. You can find it here. Hope you enjoy it!
How did you get involved in the area of data management?
Can you start by explaining what Snowflake is for anyone who isn’t familiar with it?
How does it compare to the other available platforms for data warehousing?
How does it differ from traditional data warehouses?
How does the performance and flexibility affect the data modeling requirements?
Snowflake is one of the data stores that is enabling the shift from an ETL to an ELT workflow. What are the features that allow for that approach and what are some of the challenges that it introduces?
Can you describe how the platform is architected and some of the ways that it has evolved as it has grown in popularity?
What are some of the current limitations that you are struggling with?
For someone getting started with Snowflake what is involved with loading data into the platform?
What is their workflow for allocating and scaling compute capacity and running analyses?
One of the interesting features enabled by your architecture is data sharing. What are some of the most interesting or unexpected uses of that capability that you have seen?
What are some other features or use cases for Snowflake that are not as well known or publicized which you think users should know about?
When is Snowflake the wrong choice?
What are some of the plans for the future of Snowflake?
This is a great podcast series, so you might want to add it to your regular list!
The Data Warrior & Chief Technical Evangelist at Snowflake
Yup I traveled a lot in 2019. Mostly in my role as Chief Technical Evangelist for Snowflake, but some for family vacations (yes I do take those!).
Inspired by good friend Jeff Smith, here is my travel report based on Google location services. Surprisingly accurate (should I be concerned?).
While traveling this much can be taxing (especially on my family), it is a blessing to be able to see all these wonderful places and meet lots of wonderful people as part of my job at Snowflake, even if it means being a #RoadWarrior.
By The Numbers
174 miles on foot!
There were more miles on days when I did not have my phone with me
This includes a great hike with my family all the way from Waikiki Beach to the top of Diamond Head and then back.
2 miles by bike – I know that was in Berlin using an Uber Bike (a very neat and useful concept)
14,450 miles by mass transit (cars, trains)
And then a lot more air miles!
Wow. That adds up to 7 times around the world for an estimated 173,475 miles! Yikes.
In 2019 I got to explore both countries and cities I had never been to before, as well as some of my old favorites.
In the “never been here before” category:
Stockholm, Sweden – first time ever to Sweden! Even in January it was quite fun with lots of sights to see including the Viking Museum.
Dublin, Ireland – actually got to go there twice!
Lisbon, Portugal – LOVED the seafood!
Australia & New Zealand – I had two separate 2 week trips down under and got to see quite a bit, plus catch up with some old friends I had not seen in over 20 years.
I did manage to also visit places in the US where I have been before like Dallas, Miami, Ft Lauderdale, Orlando, Santa Monica, Philadelphia, Honolulu (vacation), and Stowe (Vermont). I had LOTS of trips to San Francisco and San Mateo (where Snowflake HQ is located).
Outside the US, I got the pleasure of return visits to Rome (which I LOVE LOVE LOVE), Milan, London (2x), Amsterdam (2x), Utrecht (2x), Helsinki (Finland), and at the end of the year, a trip to snowy Montreal in Canada.
Where to next?
So where will I go in 2020? Stay tuned. The adventure continues…
Happy New Year 2020!
The Data (and Road) Warrior
P.S. One place I will be for sure in June is the Aria Hotel in Las Vegas where we will be hosting the 2nd Annual Snowflake Summit. We do expect it to SELL OUT again this year. Super Early Bird registration closes January 30th, so get your registration in today!
I am happy to announce that The Snowflake Community—a hub for Snowflake users to connect, share, and learn from each other both offline and online—is launching a new program called the Select Star Program.
What is a Select Star?
The Select Star program recognizes and rewards our most engaged community members who go above and beyond to help other users around the globe. The program offers many opportunities, including speaking at meetups or during webinars, publishing technical tutorials and blog posts, answering questions on Stack Overflow, and sharing your own Snowflake story with the community. The more you engage and offer support, the more award points you’ll earn. Once you’ve collected 1,000 points, we’ll officially accept you into the Select Star program.