The Data Warrior

Changing the world, one data model at a time. How can I help you?

Archive for the tag “data warehouse design”

Better Data Modeling: #DataVault 2.0 Virtualising your Data Vault – Satellites

Guest Blog Link: Virtualising your Data Vault – Satellites

This is a MUST READ for anyone wanting to get Agile in their DW/BI program and for anyone doing Data Vault 2.0.

Actually anyone doing Data Vault 1.0 can benefit from the technique as well.

Roelant Vos has done quite a bit on trying to virtualize and automate data warehouses using DV 2.0 (especially since the WWDVC in Vermont.). Please check out his blog and follow him on twitter too.

Cheers

Kent

The Oracle Data Warrior

Better Data Modeling: My Top 3 Reasons why you should put Foreign Keys in your Data Warehouse

This question came up at the recent World Wide Data Vault Consortium. Seems there are still many folks who build a data warehouse (or data mart) that do not include FKs in the database.

The usual reason is that it “slows down” load performance.

No surprise there. Been hearing that for years.

And I say one of two things:

1. So what! I need my data to be correct and to come out fast too!

or

2. Show me! How slow is it really?

Keep in mind that while getting the data in quickly is important, so is getting the data out.

Who would you rather have complain – the ETL programmer or the business user trying to run a report?

Yes, it has to be a balance, but you should not immediately dismiss including FKs in your warehouse without considering the options and benefits of those options.

So here are my three main reasons why you should include FK constraints in your Oracle data warehouse database:

  1. The Oracle optimizer uses the constraints to make better decisions on join paths.
  2. Your Data Modeling and BI tools can read the FKs from the data dictionary to create correct joins in the meta data of the tool (SDDM, Erwin, OBIEE, Cognos, Bus Objects can all do this).
  3. It is a good QA check on your ETL. (Yeah, I know… the ETL code is perfect and checks all that stuff, bla, bla, bla)

Now of course there are compromise options. The three main ones are I know:

  1. Drop the constraints at the start of the load then add them back in after the load completes. If any fail to build, that tells you immediately where you may have some data quality problems or your model is wrong (or something else changed).
  2. Build all the constraints as DISABLE NOVALIDATE. This puts them in the database for the BI tools and data modeling tools to see and capture but, since they are not enforced, they put minimal overhead on the load process. And, so I am told by those that know, even a disabled constraint helps the optimizer make a smarter choice on the join path.
  3. (really 2a) Best of both – disable the constraints, load your data, then re-enable the constraints. You get optimization and quality checks.

So NOW what is your reason for not using FKs in your data warehouse?

Happy Modeling!

Kent

Better Data Models: Early Black Friday Data Model Book Sale!

I am truly thankful for the career I have in Oracle, data modeling and data warehouse design.

And I am thankful for you, my loyal readers and followers!

So, in honor of American Thanksgiving and our crazy Black Friday and Cyber Monday insane shopping addiction, I am putting my Kindle ebook on SALE!

Get my book A Check List for Doing Data Model Design Reviews for 34% off starting 8 AM PST Thanksgiving Day (November 29th).

Just go here on Amazon.com.

This is a limited time sale which ends next week on December 5th.

Get a copy for your favorite data modeler!

Happy Thanksgiving and Happy Shopping!

Kent

P.S. For all my overseas friends attending the #UKOUG_Tech13 event next week, you can go here for a similar sale from Sunday December 1st though December 7th.

Post Navigation