The Data Warrior

Changing the world, one data model at a time. How can I help you?

Archive for the tag “#datavault”

Data Vault Certification Class – Day 1

As promised, here is your update from the 1st day of the certification class…

First let me say it is COLD here. I forgot how truly bone-chilling winter in the northeast could be. Glad I brought extra layers. I walked about 10 blocks from the hotel to the conference center this morning (and back tonight). I was definitely awake when I got there! I am sure that walk alone must have burned off a few hundred calories. 🙂

So the class was fun and educational today. We have 9 people attending, all from the Montreal area, except of course me (and Dan). Nice group of people; very into it. One gentleman, Pierre, was part of Dan’s online coaching program and has actually already built a successful data vault. It is really nice being with a group of people who “get it”, are engaged, and want to learn more.

Bits and pieces from today:

The goal of certification:

  1. To validate that we actually understand the Data Vault Model and the standards
  2. To validate that we can actually explain it to someone else
  3. To test us to be sure we can actually apply the rules and standards when we develop a real model

Word of the Day:

Deformity: The URGE to continue “slamming” data into an existing conformed dimension until it simply cannot sustain any further changes. This results in a “deformed” dimension, increased support costs, and, most likely, re-engineering.

Cause: Business saying “But can’t you just add a few more columns to the table? That should be easy, right?”

New question to ask:

When you change or re-engineer part of your ETL that scrubs or transforms your source data, do you keep a copy of the original code and structures under some sort of source control? If not, how will you explain to the auditors why the data in the quarterly report changed between runs?

Concept I understand better after today: 

Transaction Links: This is a very special case in which you can denormalize Satellite attributes up into the Link (at a previous job we called these SLINKs). You only do this when the transaction values can NEVER, EVER change. Examples of this are GL entries and stock trades. Dan’s examples and explanations today improved my understanding immensely.
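For the code-minded, here is roughly how I picture a Transaction Link. This is only a minimal sketch in Python, not anything from Dan's class materials: the `TradeLink` and `TransactionLinkTable` names, the column names, and the sample trade are all made up for illustration. The point it tries to show is that the descriptive values sit right on the link row next to the hub keys, and that the structure only ever accepts inserts.

```python
from dataclasses import dataclass, FrozenInstanceError
from datetime import datetime, timezone
from decimal import Decimal


@dataclass(frozen=True)  # frozen: a loaded row cannot be modified in place
class TradeLink:
    """Hypothetical transaction link row: hub keys plus the trade's own values."""
    trade_id: str            # the transaction's own identifier (assumed unique)
    account_hub_key: str     # key of the (hypothetical) Account hub
    security_hub_key: str    # key of the (hypothetical) Security hub
    quantity: int            # values that would normally live in a Satellite
    price: Decimal
    load_date: datetime
    record_source: str


class TransactionLinkTable:
    """Insert-only store; there is deliberately no update or delete method."""

    def __init__(self):
        self._rows = {}

    def insert(self, row):
        # A second row for the same transaction would imply a change, so reject it.
        if row.trade_id in self._rows:
            raise ValueError(f"trade {row.trade_id} is already loaded")
        self._rows[row.trade_id] = row

    def get(self, trade_id):
        return self._rows[trade_id]

    def __len__(self):
        return len(self._rows)


trades = TransactionLinkTable()
trades.insert(
    TradeLink(
        trade_id="T-1001",
        account_hub_key="ACCT-42",
        security_hub_key="SEC-IBM",
        quantity=100,
        price=Decimal("182.50"),
        load_date=datetime(2012, 1, 9, tzinfo=timezone.utc),
        record_source="trading_system",
    )
)
```

If the values could change after the fact, this shape would be the wrong choice, and the attributes would belong in a regular Satellite hanging off the Link instead.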

Phrase I coined today in the class:

Data Archaeology: Dan uses the analogy of Data Geology (i.e., layers) to explain how (and why) we load data the way we do in the Data Vault. I said that enables us (architects, analysts, users) to effectively do Data Archaeology to find and extract the data we need. We search for those nuggets of wisdom in the data to help our businesses. Sometimes that data is near the surface and sometimes it is fossilized deep in the historic layers within the Data Vault.
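To make the "layers" idea a little more concrete, here is a small sketch of digging down to the layer that was in effect on a given date. It assumes a Satellite whose rows are only ever inserted, each stamped with a load date; the `customer_sat` data and the `as_of` helper are hypothetical names I made up for the example, not part of any Data Vault tooling.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical Satellite history for one customer, oldest layer first.
# Each entry is (load_date, attributes); rows are only ever appended.
customer_sat = [
    (date(2009, 3, 1), {"city": "Houston"}),
    (date(2010, 7, 15), {"city": "Denver"}),
    (date(2011, 11, 2), {"city": "Montreal"}),
]


def as_of(history, when):
    """Return the attributes in effect on `when`, or None if nothing was loaded yet."""
    load_dates = [load_date for load_date, _ in history]
    # Index of the first layer loaded strictly after `when`.
    idx = bisect_right(load_dates, when)
    return history[idx - 1][1] if idx else None


surface = as_of(customer_sat, date(2012, 1, 9))   # most recent layer
fossil = as_of(customer_sat, date(2010, 1, 1))    # a deeper, older layer
```

Because nothing is ever overwritten, the older layers stay available for exactly this kind of dig.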

No doubt somebody, somewhere, has said this before, but just in case they haven’t, you heard it here first. 😉

We also had great discussions about Agile BI, virtualization of data marts, in-memory databases, solid state disks, ontologies, and the future of data warehousing in general. And what data warehouse class would be complete without mentioning Bill Inmon and Dr. Ralph Kimball?

Well, that’s it for now.

Stay tuned for Day 2.

Kent

P.S. As I mentioned yesterday, feel free to leave any questions you might have for Dan in the comments and I will pass them on. Or better still, just go buy the Data Vault book.

2012 – Year of the Data Vault?

Well, several of us sure hope so! 😉

Data Vault Modeling appears to finally be catching on in the DW community here in the USA and in Canada. My co-author (and co-conspirator) Dan Linstedt provided some details on organizations and consultants who have been successful using Data Vault in 2011. We got some great quotes from a few industry luminaries to boot! You can see the details in his year-end blog post.

We (Dan & I) spent a fair amount of time (i.e., years) and effort trying to get to this point. One big highlight (for me anyway) was finally seeing the technical book on data vault data modeling get published. We started writing parts of the book over five years ago, but things like making the mortgage payment kind of got in the way. Writing and self-publishing a book can be pretty intense, so it is really nice to see it in print. You can get your very own hard copy on Amazon, or an e-copy (with some cool bonuses) from Dan’s Learn Data Vault site. (Full disclosure – if you go buy the book from either site, I will of course make huge piles of $$$$ in royalties that might let me spring for lunch occasionally.)

If you are not really sure what Data Vault is all about and want the short course first, check out my Introduction to Data Vault slides from my presentation at Oracle Open World 2011. I had about 30 people attend the session and had some great discussions with the attendees. That was pretty gratifying since it was the LAST session on the LAST day of the conference.

So 2011 was a very successful year for getting the word out that there is a better way to develop your enterprise data warehouse.

Here’s to continuing the momentum in 2012.

Later.

Kent
