The Data Warrior

Changing the world, one data model at a time. How can I help you?

Archive for the category “Data Vault”

Tips for Optimizing the #DataVault Architecture on #Snowflake (Part 2)

SETTING UP FOR MAXIMAL PARALLEL LOADING!

In this post, I discuss how to engineer your Data Vault load in Snowflake Cloud Data Platform for maximum speed.

Because Snowflake separates compute from storage and allows the definition of multiple independent compute clusters, it provides some truly unique opportunities to configure virtual warehouses to support optimal throughput of DV loads.

Along with using larger “T-shirt size” warehouses to increase throughput, using multi-cluster warehouses during data loading increases concurrency for even faster loads at scale.

Get the details – Tips for Optimizing the Data Vault Architecture on Snowflake (Part 2)

Enjoy!

Kent

The Data Warrior & Chief Technical Evangelist for Snowflake

Tips for Optimizing the #DataVault Architecture on #Snowflake

Data Vault is an architectural approach that includes a specific data model design pattern and methodology developed specifically to support a modern, agile approach to building an enterprise data warehouse and analytics repository.

Typical Data Vault Design with Hubs, Sats, and a Link
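To make the three building blocks concrete, here is a minimal Python sketch of a hub, a link, and a satellite. Everything specific in it is an assumption made for illustration and is not taken from the diagram: the customer/order example, the field names, the `hash_key` helper, and the use of MD5 hash keys (a common Data Vault 2.0 convention, but not the only option).

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


def hash_key(*business_keys: str) -> str:
    """Hypothetical helper: hash one or more business keys into one key."""
    normalized = "||".join(k.strip().upper() for k in business_keys)
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


@dataclass
class Hub:
    """One row per unique business key (for example, a customer number)."""
    hub_key: str
    business_key: str
    load_ts: datetime
    record_source: str


@dataclass
class Link:
    """One row per relationship between two or more hubs."""
    link_key: str
    hub_keys: tuple
    load_ts: datetime
    record_source: str


@dataclass
class Satellite:
    """Descriptive attributes for a hub (or link), tracked by load time."""
    parent_key: str
    load_ts: datetime
    record_source: str
    attributes: dict


now = datetime.now(timezone.utc)

# Two hubs, each identified only by its business key.
customer = Hub(hash_key("C-1001"), "C-1001", now, "crm")
order = Hub(hash_key("O-77"), "O-77", now, "shop")

# The link records that this customer placed this order.
customer_order = Link(
    hash_key("C-1001", "O-77"),
    (customer.hub_key, order.hub_key),
    now,
    "shop",
)

# The satellite carries the descriptive data and points back to its hub.
customer_details = Satellite(
    customer.hub_key, now, "crm", {"name": "Ada", "city": "Portland"}
)
```

The point of the sketch is the separation of concerns: hubs hold only keys, links hold only relationships, and satellites hold everything descriptive.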

Snowflake Cloud Data Platform was built to be design pattern agnostic. That means you can use it with equal efficiency for 3NF models, dimensional (star) schemas, DV, or any hybrid you might have. Snowflake supports DV designs and handles several DV design variations very well with excellent performance.

This series of blog posts will present some tips and recommendations that have evolved over the last few years for implementing a DV-style warehouse in Snowflake.

Here is the first set of tips: Tips for Optimizing the Data Vault Architecture on Snowflake (Part 1)

I hope you find this helpful!

Kent

The Data Warrior and Chief Technical Evangelist for Snowflake

Better Data Modeling: Agile Data Engineering

You asked for it, you got it!

Ever since I wrote my Kindle book on Agile Data Engineering and Data Vault 2.0, many, many people have asked me to provide it in a hardcopy format. Well, I finally managed to find time to convert that ebook into a paperback book (I even corrected a few errors in the process).

If you forgot what the book was about, here is the description:

This book will give you a short introduction to Agile Data Engineering for Data Warehousing and Data Vault 2.0. I will explain why you should be trying to become Agile, some of the history and rationale for Data Vault 2.0, and then show you the basics for how to build a data warehouse model using the Data Vault 2.0 standards. In addition, I will cover some details about the Business Data Vault (what it is) and then how to build a virtual Information Mart off your Data Vault and Business Vault using the Data Vault 2.0 architecture. So if you want to start learning about Agile Data Engineering with Data Vault 2.0, this book is for you.

So here it is – Introduction to Agile Data Engineering – now available to purchase on Amazon.

Get your copy now. Next time you see me at an event, I will be happy to sign it for you. 🙂

Enjoy!

Kent

The Data Warrior

Schema-on-what? How to model JSON

It seems hard to believe, but all year, around the world, I continue to have this conversation on whether or not we still need data modeling.

I know! Crazy!

Thought we were past that…

As I have said before,

Schema-on-read has the word SCHEMA in it!

So instead of continuing to rant about it, I decided to put together a talk to show people, graphically, what I meant by decomposing, step by step, a few JSON documents into real data models. For the sake of the talk I decided to go with 3NF and Data Vault styles to make my point.
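As a rough illustration of that kind of decomposition (not the example from the talk), the Python sketch below takes one small JSON order document and splits it into Data Vault style hub, link, and satellite rows. The document shape, the table and column names, and the `hash_key` helper are all hypothetical, and load timestamps and record sources are left out to keep it short.

```python
import hashlib
import json


def hash_key(*business_keys: str) -> str:
    """Hypothetical helper: hash one or more business keys into one key."""
    normalized = "||".join(k.strip().upper() for k in business_keys)
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


# An invented order document; the nesting implies the model.
doc = json.loads("""
{
  "order_id": "O-77",
  "order_date": "2019-01-15",
  "customer": {"customer_id": "C-1001", "name": "Ada"},
  "items": [
    {"sku": "SKU-1", "qty": 2},
    {"sku": "SKU-2", "qty": 1}
  ]
}
""")


def decompose(doc: dict) -> dict:
    """Split one order document into hub, link, and satellite rows."""
    order_bk = doc["order_id"]
    customer_bk = doc["customer"]["customer_id"]
    order_hk = hash_key(order_bk)
    customer_hk = hash_key(customer_bk)

    tables = {
        # Business keys become hubs.
        "hub_order": [{"order_hk": order_hk, "order_id": order_bk}],
        "hub_customer": [{"customer_hk": customer_hk, "customer_id": customer_bk}],
        "hub_product": [],
        # Descriptive attributes become satellites.
        "sat_order": [{"order_hk": order_hk, "order_date": doc["order_date"]}],
        "sat_customer": [{"customer_hk": customer_hk, "name": doc["customer"]["name"]}],
        # The nested customer object implies a customer-to-order link.
        "link_customer_order": [{
            "link_hk": hash_key(customer_bk, order_bk),
            "customer_hk": customer_hk,
            "order_hk": order_hk,
        }],
        "link_order_product": [],
        "sat_order_product": [],
    }

    # Each array element implies an order-to-product link with its own satellite.
    for item in doc["items"]:
        product_hk = hash_key(item["sku"])
        line_hk = hash_key(order_bk, item["sku"])
        tables["hub_product"].append({"product_hk": product_hk, "sku": item["sku"]})
        tables["link_order_product"].append(
            {"link_hk": line_hk, "order_hk": order_hk, "product_hk": product_hk}
        )
        tables["sat_order_product"].append({"link_hk": line_hk, "qty": item["qty"]})

    return tables


tables = decompose(doc)
for name, rows in tables.items():
    print(name, len(rows))
```

Even in this toy case the schema was already sitting in the document: keys, nested objects, and arrays map directly onto hubs, satellites, and links.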

This talk has been very well received so I decided I would share it a bit more publicly by posting it here on my blog.

 

Now that you can see how to model JSON, check out my Snowflake ebook on how to easily analyze JSON using SQL.

If you know any meet-ups or conferences that I should be giving this talk at, please let me know. Or check out my speaking schedule for 2019 and join me at one of the events already on my calendar. (1st up is ITOUG in Milano!)

Ciao!

Kent

The Data Warrior & Chief Evangelist at Snowflake

P.S. There was no magic, or built-in wizard, involved in creating the models. I did it all by hand using Oracle SQL Developer Data Modeler.

 

Get Certified! #DataVault 2.0 Certification in the US

Quick update – if you have been waiting to get your Data Vault 2.0 certification, there are three sessions coming in the next few months right here in the USA. If you already know you want to do that, just skip down to the links and sign up!

Why Data Vault?

The Data Vault 2.0 architecture gives you an entire systems-based approach to developing a true enterprise data warehouse and analytics architecture. It is very structured, pattern-based, and highly repeatable. In Data Vault, each component does its duty, and does it well. The engineering components are generally relegated to automation tools (because it is pattern-based), so human effort is not wasted on the mundane and can be spent on more interesting tasks that require intelligence and thought. It’s a much better use of intelligent beings as well as machines.

Separating the concerns makes design and development not just easy, but fast.

As a side effect, projects using Data Vault 2.0 have always saved a lot of money and have been extremely successful with their predictable goals. Plus, they are very resilient, so they tend to stay in use for years to come with little or no re-engineering! One of my systems has been running for 14 years now – and was even successfully re-platformed in that time.

How do you get in on this innovative approach?

If you want to learn more (and why wouldn’t you?), there are many upcoming opportunities across the world to get more information about Data Vault 2.0 (just check Twitter or LinkedIn – look for #DataVault). If you understand it, and you want to use it to build on your own successes, you can even get certified (that comes with a responsibility, though).

Here’s a list of upcoming opportunities to get DV 2.0 certified in the US:

1. Sep 19-21, Chicago, IL – http://www.performanceg2.com/agile-bi-datavault-training/

2. Oct 2-4, New York City, NY – http://www.scalefree.com/2017/03/30/data-vault-2-0-boot-camp-and-certification-new-york-oct-2017/

3. Nov 27-29, Santa Clara, CA – http://www.scalefree.com/2017/03/29/data-vault-2-0-boot-camp-and-certification-santaclara-nov-2017/

Ready to challenge the status quo and become a data champion at your organization? Then sign up for one of these classes today!

Model on!

Kent

The Data Warrior
