Oracle Data Warrior

Changing the world, one data model at a time. How can I help you?

This just in: Win Dinner with Monty at #KScope14

Amazing but true – you can now enter a contest to win dinner with ODTUG President Monty Latiolais at ODTUG’s annual conference KScope14.

This year KScope will be held in beautiful Seattle, Washington from June 22nd – 26th.

Who knows what amazing dinner adventure will be in store for the winner!

Get the details here:

Win Dinner with Monty!

See you in Seattle!

Kent

The Oracle Data Warrior

P.S. I will be presenting again this year and running my now-annual Morning Chi Gung class (more on that later).

Better Data Modeling: My Top 3 Reasons why you should put Foreign Keys in your Data Warehouse

This question came up at the recent World Wide Data Vault Consortium. Seems there are still many folks building data warehouses (or data marts) who do not include FKs in the database.

The usual reason is that it “slows down” load performance.

No surprise there. Been hearing that for years.

And I say one of two things:

1. So what! I need my data to be correct and to come out fast too!

or

2. Show me! How slow is it really?

Keep in mind that while getting the data in quickly is important, so is getting the data out.

Who would you rather have complain – the ETL programmer or the business user trying to run a report?

Yes, it has to be a balance, but you should not immediately dismiss including FKs in your warehouse without considering the options and benefits of those options.

So here are my three main reasons why you should include FK constraints in your Oracle data warehouse database:

  1. The Oracle optimizer uses the constraints to make better decisions on join paths.
  2. Your data modeling and BI tools can read the FKs from the data dictionary to create correct joins in the tool's metadata (SDDM, Erwin, OBIEE, Cognos, and Business Objects can all do this).
  3. It is a good QA check on your ETL. (Yeah, I know… the ETL code is perfect and checks all that stuff, bla, bla, bla)

Now of course there are compromise options. The three main ones I know are:

  1. Drop the constraints at the start of the load then add them back in after the load completes. If any fail to build, that tells you immediately where you may have some data quality problems or your model is wrong (or something else changed).
  2. Build all the constraints as DISABLE NOVALIDATE. This puts them in the database for the BI tools and data modeling tools to see and capture but, since they are not enforced, they put minimal overhead on the load process. And, so I am told by those that know, even a disabled constraint helps the optimizer make a smarter choice on the join path.
  3. (really 2a) Best of both – disable the constraints, load your data, then re-enable the constraints. You get optimization and quality checks.
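For Oracle, options 2 and 3 above can be sketched with a couple of ALTER TABLE statements. The table and constraint names here (fact_sales, dim_customer, fk_sales_customer) are hypothetical, purely for illustration:

```sql
-- Option 2: declare the FK but never enforce it. RELY tells the
-- optimizer it may trust the relationship even though it is not
-- validated, and the constraint still appears in the data dictionary
-- for modeling and BI tools to read.
ALTER TABLE fact_sales
  ADD CONSTRAINT fk_sales_customer
  FOREIGN KEY (customer_key)
  REFERENCES dim_customer (customer_key)
  RELY DISABLE NOVALIDATE;

-- Option 3 (2a): disable before the load, re-enable after.
-- The load runs without constraint overhead; ENABLE VALIDATE then
-- checks every row and fails loudly if the ETL let bad keys through.
ALTER TABLE fact_sales MODIFY CONSTRAINT fk_sales_customer DISABLE;
-- ... run the ETL load here ...
ALTER TABLE fact_sales MODIFY CONSTRAINT fk_sales_customer ENABLE VALIDATE;
```

If that final ENABLE VALIDATE fails, capturing the offending rows (for example with an EXCEPTIONS table) points you straight at the data quality problem.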

So NOW what is your reason for not using FKs in your data warehouse?

Happy Modeling!

Kent

Live from the 1st Annual World Wide Data Vault Consortium: Day 3

Well, it was the last day of the 1st annual WWDVC. What an event it has been. (See recaps of Day 1 and Day 2 here and here.)

The sign outside the meeting room

Don’t forget you can see all the action by searching on #WWDVC on twitter.

Agility and Data Vault

Long time data vault advocate Tom Breur opened our closing day with a talk about how we should strive to build the right product and build it right without creating more technical debt in the process.

He said agile is about taking small steps, not about being faster. If we do this right, the solution we deliver should generate more legitimate business requirements.

He encouraged us all to read the Theory of Constraints and The Goal (I have), and to learn about lean delivery. Our goal should be to deliver continuously and consistently. The shorter the sprint, the better.

Next he went on to tell us his conclusion that while building a Kimball based solution may appear to deliver more value to the business, it takes too long. And in the end it is a fragile and rigid solution subject to major re-work when requirements change after deployment.

We can deliver value quicker using the data vault method. And what we deliver can be built incrementally and more easily added to without re-work.

Tom drew this to show how we can deliver some value sooner

Data Vault Case Studies

John Sells and Josh Bartells from Data Blueprint shared with us their experiences implementing data vault solutions for their clients.

John and Josh share several successful DV projects

Why Data Blueprint decided to use DV as their consulting approach

The guys from Data Blueprint discuss hurdles and objections they encountered selling DV to clients

It was great to hear about their success stories from the field and see how they addressed the challenges many of us have faced.

Data Vault Modeling Tool

MID GmbH from Germany has a pretty nice modeling tool with built-in capabilities to support modeling a data vault solution from stage tables through reporting (including use of the new DV icons).

Nice ability to visually show relationships across diagram types

MID’s modeler has a nice diagram to show all sorts of metadata relationships

This is a tool worth checking out if you are doing a lot of data vault modeling.

Concluding Remarks

Dan closed out this inaugural event with a few remarks, some memories, thanks, and talk about plans for doing this again next year.

Dan closes out the 1st Annual WWDVC and asks about next year – where and when?

Guess I need to be careful what I say! :-)

One last time I have to say I am glad I came and can’t wait to do it again next year!

Snowy St Albans

So long for now from Vermont.

Don’t forget to check out LearnDataVault.com and get ready to join us next year!

Kent

Live from the 1st Annual World Wide Data Vault Consortium: Day 2

Wow, it has been an amazing event with many amazing people from around the globe. I am VERY glad that I took the time and came to St Albans for this inaugural event.

As with Day 1, there is just too much great information for me to adequately cover it in a blog post (or 10!), so I will give you some highlights and lots of pictures.

The good news is that sometime in the next few months you will be able to purchase access to a video of the entire event on LearnDataVault (then you will be really bummed about not having come in person). Dan and Sanjay figure, with their schedules, it will take a few months to produce a top-quality video. Fortunately, we had professional videographers for all three days, and they filmed every keynote and every talk.

I will let you know when the video is ready.

Dan Does a Deep Dive on Data Vault 2.0

Due to popular demand, Dan actually changed the agenda and made the first session of Day 2 a detailed look at some of the more important aspects of Data Vault 2.0. This is material that he has not really written much about and is only available otherwise in his DV 2.0 bootcamp class.

Here are some of the highlights (again to see even more details check #WWDVC on twitter).

Opposing goals between DW storage and Mart presentation have led to many failed DW/BI projects.

BTW – Dan would like to see us stop calling the reporting side “Data Marts” and start calling them “Information Marts”. To be agile we have to stop mixing the raw data storage (EDW) with the reporting that has applied business rules.

Dan went on a bit of a rant about the need to measure things.

DV 2.0 Methodology helps us be more precise to be more successful

Then we had a discussion about DV 2.0 architecture and methodology and how it supports and fits into the knowledge pyramid.

Dan introduces us to DV 2.0 and the Knowledge Pyramid

The DV 2.0 Architecture

Then some discussion about DV 2.0 agility, backed up by a real user case study.

A DV success at Qsuper

It sure was great to see real numbers on a DV 2.0 success story!

So how did they do it? Dan then showed us his recommended approach to requirements gathering that helps the process become more agile.

DV 2.0 approach to getting better requirements faster

Then finally another rant about how to get better performance from our systems.

Dan’s Rules of Performance

Roelant Vos (Analytics8) talks about DV Automation

Characteristics of ETL for Data Vault

Roelant Vos discusses generating Data Vault ETL from metadata

Roelant uses Lego as a great analogy for the patterns in Data Vault and why it is possible to auto-create ETL code.

Model driven ETL generation is possible

What is needed to support generation of ETL

With all this in mind, Roelant has built a nice little kit to actually generate DV ETL code for his clients. Nice job! You can follow Roelant on twitter here.
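As a toy illustration of the idea (not Roelant's actual kit), metadata-driven generation can be as simple as querying the data dictionary and emitting SQL from a template. The HUB_/STG_ naming convention assumed here is hypothetical:

```sql
-- For every hub table, generate a skeleton load statement from its
-- matching stage table. Real generators also template hash keys,
-- load dates, and record sources; this only shows the pattern.
SELECT 'INSERT INTO ' || table_name
       || ' SELECT DISTINCT src.* FROM stg_'
       || SUBSTR(table_name, 5) || ' src;' AS generated_sql
FROM   user_tables
WHERE  table_name LIKE 'HUB\_%' ESCAPE '\';
```

Because data vault tables follow a small number of strict patterns (hubs, links, satellites), this kind of generation scales to a whole model once the templates are right, which is exactly the Lego point.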

Doug Needham (ClearMeasure) blows our minds with Data Vault Math.

There were a ton of tweets about this session. Doug has done some incredibly innovative work on using mathematics to determine the quality/completeness of a data vault model and some slick ways to validate that it has been loaded correctly. Lots of talk about graph theory and calculating edges ensued.

Doug blows our minds with math

Research on how to determine the driving key for a Link

More calculations from Doug – vector math!

This is the first time I have met Doug, a former Marine Corps DBA. We have found we have a lot of experiences and thoughts in common, including a common client! He even grew up in Texas in the town where I now live!

In this session Doug famously said “If you do not know what a hypothesis, control group, or A/B test is, you are NOT a data scientist.”

One of the great things about these events is the people you meet. I am glad to have met Doug.

Using Oracle SQL Developer Data Modeler for DV

After a much needed brain break from Doug’s talk and some networking time, I got to take the floor again to show my favorite FREE data modeling tool. I did my usual top 10 type talk but with a slant towards how I leverage all those features to support a data vault modeling project.

Data Vault Diversity: The former Marine mathematician and the long-haired environmentalist

There were lots of good questions and interaction and lots of interest in how I have built virtual data marts on top of data vault warehouses.

Starting my SDDM intro talk

And that was it for Day 2. We all took a break then had a happy hour and dinner (with demo) sponsored by AnalytixDS. Good food and good fun!

Data Vault geeks from around the world in snowy St Albans, Vermont

More to come on Day 3!

Kent

Live from the 1st Annual World Wide Data Vault Consortium: Day 1

It has been an amazing and historic day in chilly St Albans, Vermont on this first day of Spring.

In a little hotel on the edge of a little town is a gathering of some of the brightest minds in the world of data vault. They have come from around the USA, Canada, Germany, Australia, The Netherlands, and Norway.

Given the location and distance to get here, these are some dedicated folks intent on being the best they can be at their chosen profession.

Like any great user group meeting there was a keynote, user sessions, and lots of networking.

Look at #WWDVC to get the details of all the action for the day. Below I will provide a few pictures to highlight the day.

Dan Linstedt opens the 1st Annual World Wide Data Vault Consortium

As part of his keynote Dan gave us some highlights from Data Vault 2.0.

DV 2.0 is a complete system for developing an enterprise DW/BI solution

Part of DV 2.0 is that the methodology has officially adopted an agile approach

Of course there were some announcements …

Dan had a lot of partnerships to announce

Next we heard from Dan’s business partner Sanjay about how DV 2.0 works with “big data.”

Sanjay talks about using Hadoop and NoSQL with Data Vault 2.0

In DV 2.0 it is possible to have satellites reside on Hadoop

In DV 2.0 using hashed keys allows linking to data in a NoSQL db
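A minimal sketch of that hash-key idea in Oracle SQL (STANDARD_HASH is available from 12c; the staging table and business key column are hypothetical). Hashing the cleaned business key the same way on every platform is what lets a relational link row point at a record stored in Hadoop or a NoSQL store:

```sql
-- DV 2.0-style hash key: trim and upper-case the business key first,
-- so every platform computes an identical hash for the same key.
SELECT customer_number,
       STANDARD_HASH(UPPER(TRIM(customer_number)), 'MD5') AS hub_customer_hk
FROM   stg_customer;
```

Since the hash is derived deterministically from the business key, no cross-platform sequence lookup is needed to join the two worlds.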

After a great networking session and a good lunch I gave a 2 hour session on using standards for data modeling (with a few side rants and lots of Q&A).

Starting my talk about saving $$$ by using standards

Next we had a great Skype session with Bill Inmon with lots of Q&A.

And finally we had a talk from my friend Raphael about generating data vaults using WhereScape.

Raphael from WhereScape shows us Data Vault automation

Well, that’s it for now. More tomorrow.

Join us on twitter to follow all the action live. (Click the links to go right to the Day 2 and Day 3 posts.)

Kent
