The Data Warrior

Changing the world, one data model at a time. How can I help you?

Archive for the tag “data dictionary”

Better Data Modeling: My Top 3 Reasons why you should put Foreign Keys in your Data Warehouse

This question came up at the recent World Wide Data Vault Consortium. Seems there are still many folks who build a data warehouse (or data mart) that do not include FKs in the database.

The usual reason is that it “slows down” load performance.

No surprise there. Been hearing that for years.

And I say one of two things:

1. So what! I need my data to be correct and to come out fast too!

or

2. Show me! How slow is it really?

Keep in mind that while getting the data in quickly is important, so is getting the data out.

Who would you rather have complain – the ETL programmer or the business user trying to run a report?

Yes, it has to be a balance, but you should not immediately dismiss including FKs in your warehouse without considering the options and benefits of those options.

So here are my three main reasons why you should include FK constraints in your Oracle data warehouse database:

  1. The Oracle optimizer uses the constraints to make better decisions on join paths.
  2. Your Data Modeling and BI tools can read the FKs from the data dictionary to create correct joins in the meta data of the tool (SDDM, Erwin, OBIEE, Cognos, Bus Objects can all do this).
  3. It is a good QA check on your ETL. (Yeah, I know… the ETL code is perfect and checks all that stuff, bla, bla, bla)

Now of course there are compromise options. The three main ones are I know:

  1. Drop the constraints at the start of the load then add them back in after the load completes. If any fail to build, that tells you immediately where you may have some data quality problems or your model is wrong (or something else changed).
  2. Build all the constraints as DISABLE NOVALIDATE. This puts them in the database for the BI tools and data modeling tools to see and capture but, since they are not enforced, they put minimal overhead on the load process. And, so I am told by those that know, even a disabled constraint helps the optimizer make a smarter choice on the join path.
  3. (really 2a) Best of both – disable the constraints, load your data, then re-enable the constraints. You get optimization and quality checks.

So NOW what is your reason for not using FKs in your data warehouse?

Happy Modeling!

Kent

Five ways to make Data Modeling Fun

While on my recent family vacation, I happened to mention I needed ideas for a blog post.

My son, all of nine years old, suggested the above title.

Hmmm…I said…not bad. That might work.

After all most people think data modeling booooorrring, right?

But for a few of us, it is kind of fun.

So then I asked him if he had any ideas how we could make it fun.

My son does not actually know how to do any data modeling (yet), but he has looked over my shoulder a few times and knows I draw pictures with boxes and connecting lines and words in the boxes.

With that bit of knowledge, he did come up with a few good ideas that really could make data model review sessions, a bit more fun, and maybe more effective.

Here they are:

Word Search

Put up a large version of a data model on the wall. Give the reviewers a list of words to find on the model diagram (you produce the list from your data dictionary).  Have them go to the diagram to highlight or circle the words one their list.

This will help get everyone familiar with the model and the layout of the diagram.

For more fun – form teams and keep score! Maybe even add a time limit per word.

Silly Sentences

If you don’t know how this works, you start with sentences with blanks in strategic areas. So the sentences may be missing nouns, verbs, adverbs, etc. You have someone fill in the blanks out of context – you ask for a noun but they have no idea what the sentence looks like until after you fill in all the blanks. (This game is in my son’s Nat Geo magazine) It can be quite funny.

One of the hardest parts of a logical model is naming the relationships.  Use this game to figure out the right sentences.

Start by writing the relationships with completely silly or even wrong verbs:

Each Customer must be found squatting at one or more Addresses.

Use your creativity to come up with goofy verbs for the relationships. Then get the users to “validate” the sentences.

I am sure they will be more than willing to correct your errors. 😉

Jeopardy

You all know how this game works – you get the answer and have to come up with the questions.

This is an interesting way to validate your entity and attribute definitions. Use entity definitions as the answers. Users have to guess the entity name.

For example: What is a customer?

Of course it will be really interesting to see if they can link definitions you got from them with the entity names in the model. You might get some clarifications in the process.

Data Model Haiku

You can do this with definitions or maybe relationship sentences. Trying to put the words in a specific form will make you really think about your understanding of the concepts (and force you to be succinct).

Each customer may

Be contacted by one or

More customer reps

Note for my  friends in the UK: Feel free to do Sonnets in Iambic pentameter.

Data Model Telephone

This is pretty much what happens anyway – you attend a meeting with the customer, they give you requirements, you take notes then try to build a model from those notes. You write out definitions and get them to review those. Chances are good you did not get it quite right.

So for fun, and to make a point about recording details carefully, get your team in a room and start at one end whispering a definition to the first person and have them pass it on. Write down the end result to compare to the definition in the model.

If the result is really funny, tell the customer at the next review meeting.

So what do you think? Can we make data modeling more fun?

Let me know your thoughts in the comments below.

If you have any fun ideas, please share those too!

Game on!

Kent

P.S. If you would like some other ideas on how to get better data models, check out my recent Kindle book on best practices for data model design reviews.

Post Navigation