Data Vault Certification Class – Day 1
As promised, here is your update from the first day of the certification class…
First let me say it is COLD here. I forgot how truly bone-chilling winter in the northeast could be. Glad I brought extra layers. I walked about 10 blocks from the hotel to the conference center this morning (and back tonight). I was definitely awake when I got there! I am sure that walk alone must have burned off a few hundred calories. 🙂
So the class was fun and educational today. We have 9 people attending, all from the Montreal area, except of course me (and Dan). Nice group of people; very into it. One gentleman, Pierre, was part of Dan’s online coaching program and has actually already built a successful data vault. It is really nice being with a group of people who “get it”, are engaged, and want to learn more.
Bits and pieces from today:
The goal of certification:
- To validate that we actually understand the Data Vault Model and the standards
- To validate that we can actually explain it to someone else
- To test us to be sure we can actually apply the rules and standards when we develop a real model
Word of the Day:
Deformity: The URGE to continue “slamming” data into an existing conformed dimension until it simply cannot sustain any further changes. This results in a “deformed” dimension and increased support costs, and likely leads to re-engineering.
Cause: Business saying “But can’t you just add a few more columns to the table? That should be easy, right?”
New question to ask:
When you change or re-engineer part of your ETL that scrubs or transforms your source data, do you keep a copy of the original code and structures under some sort of source control? If not, how will you explain to the auditors why the data in the quarterly report changed between runs?
Concept I understand better after today:
Transaction Links: This is a very special case in which you can denormalize Satellite attributes up into the Link (at a previous job we called these SLINKs). You only do this when the transaction values can NEVER, EVER change. Examples include GL entries and stock trades. Dan’s examples and explanations today improved my understanding immensely.
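For those who think better in code, here is a minimal Python sketch of the idea. It is my own illustration, not something from the class materials: the names `StockTradeLink` and `TransactionLinkTable`, the column choices, and the in-memory dictionary standing in for a database table are all assumptions. The point is only that the descriptive values sit on the link row itself, and that rows are inserted once and never updated.

```python
from dataclasses import dataclass, FrozenInstanceError
from datetime import datetime, timezone
from decimal import Decimal


@dataclass(frozen=True)  # frozen: attributes cannot be reassigned after creation
class StockTradeLink:
    """Hypothetical transaction link row: hub keys plus the trade's own values."""
    trade_id: str          # identifies the transaction (assumed unique)
    account_key: str       # reference to an assumed Account hub
    security_key: str      # reference to an assumed Security hub
    quantity: int          # descriptive values stored on the link itself,
    price: Decimal         # instead of in a separate Satellite
    load_date: datetime
    record_source: str


class TransactionLinkTable:
    """Insert-only store (a stand-in for a real table): no update path exists."""

    def __init__(self):
        self._rows = {}

    def insert(self, row):
        # A second load of the same transaction is rejected, not overwritten.
        if row.trade_id in self._rows:
            raise ValueError(f"trade {row.trade_id} is already loaded")
        self._rows[row.trade_id] = row

    def get(self, trade_id):
        return self._rows[trade_id]
```

To try it, build a `StockTradeLink`, pass it to `insert`, and then attempt to insert it again or reassign `row.price`: the first raises `ValueError` and the second raises `FrozenInstanceError`. If the values could legitimately change later, this shortcut would not apply and the attributes would belong in a Satellite instead.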
Phrase I coined today in the class:
Data Archaeology: Dan uses the analogy of Data Geology (i.e., layers) to explain how (and why) we load data the way we do in the Data Vault. I said that enables us (architects, analysts, users) to effectively do Data Archaeology to find and extract the data we need. We search for those nuggets of wisdom in the data to help our businesses. Sometimes that data is near the surface and sometimes it is fossilized deep in the historic layers within the Data Vault.
No doubt somebody, somewhere, has said this before, but just in case they haven’t, you heard it here first. 😉
We also had great discussions about Agile BI, virtualization of data marts, in-memory databases, solid state disks, ontologies, and the future of data warehousing in general. And what data warehouse class would be complete without mentioning Bill Inmon and Dr. Ralph Kimball?
Well, that’s it for now.
Stay tuned for Day 2.
P.S. As I mentioned yesterday, feel free to leave any questions you might have for Dan in the comments and I will pass them on. Or better still, just go buy the Data Vault book.