The Data Warrior

Changing the world, one data model at a time. How can I help you?

Get Rich and Happy For Less | How to Be Rich and Happy

<Note: This is a promotional/guest posting from Tim Brownson’s blog How to Be Rich and Happy. I have read this book and loved it (hence the button to order this book in my sidebar). Very straightforward and easy to read with lots of great advice and practical worksheets/exercises. Plus they have a great “change the world” story. For the price offered below, there is really no reason not to get it and read it, unless you are already as rich and happy as you want to be. KG>

To celebrate the triple launch of How To Be Rich and Happy in the Netherlands, Germany and Taiwan (see photo of the cover) and with Germany and China imminent, we’re slashing the price for a limited period.

The hard copy in English is currently only available on this site and retails for $25. As I’m sure you know, all that money goes into printing copies of the book to give to good causes; John and I don’t take a penny out for ourselves.

Until February 14th (I know, we’re such romantics) you can grab it for just $10 plus shipping.

The PDF version sells for $19 and you can grab that for a measly $4, or a tad less than a venti soy latte from your favorite purveyor of fine ground beans.

All you need to do is click on the Buy Now button, choose which version you want and then enter coupon code “justask” at checkout.

Ready for a Rocky Mountain High?

Got the midwinter blues?

Ready for some bright sunshine and fresh mountain air? Maybe some snow?

The wait is almost over! RMOUG Training Days 2012, February 14 – 16, is only 14 days away! That’s right, only a few days until the biggest and best regional Oracle users group conference kicks off! But why wait until the last minute?

You still have a few days to take advantage of the lower Standard Registration rates. Register by February 9 to receive the lower rate.

And don’t forget to sign up for a University session. I am participating in this one:

Data Warehouse Performance

Jerry Ireland, Rightsizing, Inc., Mark W. Farnham, Rightsizing, Inc., Tim Gorman, Evergreen Database Technologies, Inc., & Kent Graziano, TrueBridge Resources

This session brings together four experts in performance and data warehousing. All are Oracle ACEs, two are members of the OakTable, and together they represent over ninety years of experience with Oracle. All of the ideas and techniques presented can be implemented with minimal additional hardware and most with a manageable effort. The session will start with some hints and tips that can be used on existing data warehouses and progress to a method of getting performance when the CBO does not recognize a star schema. The session will present partitioning strategies to increase scaling, and finally take a glimpse at some of the benefits of a relatively new, but growing, database architecture. The day will be wrapped up with a Q & A with all symposium presenters.

So don’t delay, make your plans today!

And maybe get some skiing in?

See ya!

Kent

Data Vault Certification Class – Day 3 (It’s a wrap!)

Well, not so cold today in Montreal. Instead we got very cold rain and snow mix. Yuk. (But I definitely want to come back in the summer!)

This morning, Dan dived into how to load the Data Vault with all new material he has not taught in the certification class before. We really got lucky by attending this class. I knew most of the concepts, and have implemented most of them, but his new slides are just killer. They really get the point across and cleared up a few points for me too.

Not only do the slides include sample SQL for various parts of the load, and the logical patterns, Dan even demonstrated some working code on his really cool, high-powered laptop. It was great for the class to see the Data Vault concepts put into practice. (And he of course had some more tales to tell.)

Cool Phrase for the Day

Short Circuit Boolean Evaluation: A mathematical concept that Dan laid on us, used to get very fast results from the Data Vault change detection code. We use it in doing column compares on attribute data to determine if a Satellite row has changes.

In Oracle it looks like this: decode(src_att1, trg_att1,0,1) = 1

In ANSI-SQL it is a bit longer but has the same effect:

CASE WHEN ((src_att1 is null and trg_att1 is null) or src_att1 = trg_att1)

THEN 1 else 0 END = 0

I have been using this for years (learned it from Dan) but had no idea there was a math term for it.

Okay so I am a geek. 🙂
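
For anyone who wants to try the ANSI-style compare, here is a minimal sketch in Python using the built-in sqlite3 module. The src and trg tables, their id and att1 columns, and the sample rows are all made up for illustration. This is not Dan's code, just one way to watch the NULL handling in action.

```python
import sqlite3

# Hypothetical staging (src) and Satellite (trg) rows, keyed by id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src (id INTEGER PRIMARY KEY, att1 TEXT);
    CREATE TABLE trg (id INTEGER PRIMARY KEY, att1 TEXT);
    INSERT INTO src VALUES (1, 'a'), (2, 'b'), (3, NULL), (4, NULL);
    INSERT INTO trg VALUES (1, 'a'), (2, 'x'), (3, NULL), (4, 'y');
""")

# The CASE yields 1 when the attributes match (two NULLs count as a match)
# and 0 otherwise, so "= 0" keeps only the rows whose attribute changed.
changed_ids = [row[0] for row in conn.execute("""
    SELECT s.id
    FROM src s
    JOIN trg t ON t.id = s.id
    WHERE CASE
            WHEN (s.att1 IS NULL AND t.att1 IS NULL) OR s.att1 = t.att1
            THEN 1 ELSE 0
          END = 0
    ORDER BY s.id
""")]

print(changed_ids)  # [2, 4]
```

Rows 2 and 4 come back as changed. Row 3 has NULL on both sides and is treated as unchanged, which a plain src_att1 <> trg_att1 compare would not get right.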

The Test

After all that cool stuff came the certification test.

Not easy. My hand cramped writing out the answers.

We get our results next week. (Dan has a fun weekend ahead of him doing a bunch of scoring).

I am sure everyone in class will do fine. As I said, they all seemed to get it.

Anyway, the class is over now and I am in a hotel in Vermont (where it is snowing now). I fly back to Houston in the morning.

I had a good week here in the northeast (despite the weather). It was definitely worth the time and money to come for this class. I met some great people, learned a lot, and got to spend time with my good friend Dan.

Watch out Montreal – you are about to be descended upon by a whole new batch of Data Vault experts.

It could change the way you do data warehousing.

Later.

Kent

Data Vault Certification – Day 2

Still really cold here in Montreal…

But the classroom discussions heated up a bit today as we dove deeper into the secrets of the Data Vault. Today we got into the nitty-gritty of things like Point in Time (PIT) tables, Bridge tables, DV loading, DV agility, set logic, and designing for performance. We got into some brand new material that has not been in prior certification classes. All great stuff.

Dan sure knows a lot about a lot of things (hardware, software, operating systems, disk blocks, cache, etc.). His broad knowledge and experience definitely contributed to what is now Data Vault. We got to hear several juicy stories about real world issues he encountered over his illustrious career that led him to architect Data Vault to have all the neat features it has (hint: lots of unnamed 3-letter gov’t agencies were apparently involved). Dan is bound and determined to help as many people as possible avoid the many pitfalls he has seen in data warehouse architectures.

What I learned today:

I FINALLY understand the legitimate use for a Bridge table and why it is a good idea (i.e., better query performance getting data out of a DV). The examples in the class got through to me (this time). It is all spelled out in the book, but now I really do get it.

ETL Golden Rule (for performance improvement):

1 process per target per action (i.e., Insert, Update, Delete)

In other words don’t make your ETL too complicated by trying to account for everything in one routine. It becomes too complex to maintain over time and it reduces the amount of parallelization you can do (hence it is SLOWER).

Think about it – new rows get inserted, right? So why waste the cycles forcing them through update logic? In a Data Vault structure it is very easy to separate these out. The SQL is pretty simple actually (but that is another post by itself).

Dan will be teaching more about this in his upcoming online Implementation training course. Stay tuned for that.
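
To make the "1 process per target per action" idea a little more concrete, here is a rough sketch in Python with the built-in sqlite3 module. The table names (stg_customer, hub_customer), the customer_no and load_date columns, and the insert_new_keys function are all hypothetical names I picked for illustration; this is not the code from the class. The routine does exactly one thing: it inserts business keys that are not in the target yet. Anything involving updates would live in a separate process.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical staging table and Hub table (names are assumptions).
    CREATE TABLE stg_customer (customer_no TEXT);
    CREATE TABLE hub_customer (customer_no TEXT PRIMARY KEY, load_date TEXT);
    INSERT INTO hub_customer VALUES ('C1', '2012-01-01');
    INSERT INTO stg_customer VALUES ('C1'), ('C2'), ('C3');
""")


def count_hub_rows(conn):
    """Return the current number of rows in hub_customer."""
    return conn.execute("SELECT COUNT(*) FROM hub_customer").fetchone()[0]


def insert_new_keys(conn, load_date):
    """One process, one target (hub_customer), one action (INSERT).

    Only keys missing from the target are inserted; no update logic here.
    Returns the number of rows added.
    """
    before = count_hub_rows(conn)
    conn.execute(
        """
        INSERT INTO hub_customer (customer_no, load_date)
        SELECT DISTINCT s.customer_no, ?
        FROM stg_customer s
        WHERE NOT EXISTS (
            SELECT 1 FROM hub_customer h WHERE h.customer_no = s.customer_no
        )
        """,
        (load_date,),
    )
    return count_hub_rows(conn) - before


inserted = insert_new_keys(conn, "2012-01-27")  # adds C2 and C3
rerun = insert_new_keys(conn, "2012-01-28")     # nothing new, adds no rows
hub_keys = [r[0] for r in conn.execute(
    "SELECT customer_no FROM hub_customer ORDER BY customer_no")]
print(inserted, rerun, hub_keys)
```

Because the routine only ever inserts, it can be rerun safely and run in parallel with the loads for other targets.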

Data Vault Data Load Golden Rule:

100% of the data loaded 100% of the time

That really makes the load process pretty easy and streamlined. It goes hand in hand with the Data Vault concept of loading the facts – straight from the source, no modifications. This is why the Data Vault is 100% auditable.

So for Day 3 we will get even more into how to load a Data Vault.

And then there is the TEST (insert dramatic music here).

We get four hours. 60 questions (half of which are essay!).

Guess I better get studying!

Look for my Day 3 wrap-up tomorrow night (assuming I can still write by then).

See ya.

Kent

Data Vault Certification Class – Day 1

As promised, here is your update from the 1st day of the certification class…

First let me say it is COLD here. I forgot how truly bone chilling winter in the northeast could be. Glad I brought extra layers. I walked about 10 blocks from the hotel to the conference center this morning (and back tonight). I was definitely awake when I got there! I am sure that walk alone must have burned off a few hundred calories. 🙂

So the class was fun and educational today. We have 9 people attending. All are from the Montreal area, except of course me (and Dan). Nice group of people; very into it. One gentleman, Pierre, was part of Dan’s online coaching program and has actually already built a successful data vault. It is really nice being with a group of people who “get it”, are engaged, and want to learn more.

Bits and pieces from today:

The goal of certification:

  1. To validate that we actually understand the Data Vault Model and the standards
  2. To validate that we can actually explain it to someone else
  3. To test us to be sure we can actually apply the rules and standards when we develop a real model

Word of the Day:

Deformity: The URGE to continue “slamming” data into an existing conformed dimension until it simply cannot sustain any further changes. This results in a “deformed” dimension, increased support costs, and likely leads to re-engineering.

Cause: Business saying “But can’t you just add a few more columns to the table? That should be easy right?”

New question to ask:

When you change or re-engineer part of your ETL that scrubs or transforms your source data, do you keep a copy of the original code and structures under some sort of source control? If not, how will you explain to the auditors why the data in the quarterly report changed between runs?

Concept I understand better after today: 

Transaction Links: This is a very special case when you can denormalize Satellite attributes up into the Link (at a previous job we called these SLINKs). You only do this when the transaction values can NEVER, EVER change. Examples of this are GL entries and stock trades. Dan’s examples and explanations today really improved my understanding immensely.

Phrase I coined today in the class:

Data Archaeology: Dan uses the analogy of Data Geology (i.e., layers) to explain how (and why) we load data the way we do in the Data Vault. I said that enables us (architects, analysts, users) to effectively do Data Archaeology to find and extract the data we need. We search for those nuggets of wisdom in the data to help our businesses. Sometimes that data is near the surface and sometimes it is fossilized deep in the historic layers within the Data Vault.

No doubt somebody, somewhere, has said this before, but just in case they haven’t, you heard it here first. 😉

We also had great discussions about Agile BI, virtualization of data marts, in-memory dbs, solid state disks, ontologies, and the future of data warehousing in general. And what data warehouse class would be complete without mentioning Bill Inmon and Dr. Ralph Kimball?

Well, that’s it for now.

Stay tuned for Day 2.

Kent

P.S. As I mentioned yesterday, feel free to leave any questions you might have for Dan in the comments and I will pass them on. Or better still, just go buy the Data Vault book.
