The Data Warrior

Changing the world, one data model at a time. How can I help you?

Archive for the tag “FKs”

Better Data Modeling: Discovering Foreign Keys (FK) in #SQLDevModeler (SDDM)

A while back I had an interesting situation when I was attempting to reverse engineer and document a database for a client.

I had a 3rd party database that had PKs defined on every table but no FKs in the database. The question I posed (on the Data Modeler Forum) was:

How do I get the FK Discover utility to find FK columns with this type of pattern:

Parent PK column = TABCUSTNUM

Child FK column = ABCCUSTNUM

So the root column name (CUSTNUM) is standard but in every table the column name has a different 3 character “prefix” that is effectively the table short name. Is there way to get the utility to ignore the first three characters of the column names?

This was in SDDM 4.1.873.

No easy answer.

Well, the ever helpful Philip was very kind and wrote me a slick custom Transformation Script that did the trick! (Check the post if you want to see the code.)

But wait there’s more!

In his response he mentioned a feature coming in 4.1.888 – the ability to include a table prefix as part of a FK column naming template (just like this app had done).

Cool, I thought, but how does that help?

Well with the template in place it turns out that you can have the FK Discovery utility search based on the Naming Template model rather than just look for exact matching column names.

Using the Custom Naming Template

So recently (today in fact) I was trying to add FKs to the Snowflake DB model I reverse engineered a few weeks back (Jeff pointed out they were missing). I noticed the model had that pattern of a prefix on both the FK and PK column names.

In the CUSTOMER table the PK is C_CUSTKEY. In the ORDER table it is O_CUSTKEY. Nice simple pattern (see the diagram below for more). That reminded me of the previous issue and Philip’s script.

Snowflake Schema

Off to OTN to find that discussion and refresh my memory.

In the post, Philip has posted an example of a template that fit my previous problem:

{table abbr}SUBSTR(4,30,FRONT,{ref column})

With the note that {table abbr} would be the equivalent of what I called the table prefix. So first I went to the table properties and put in the prefixes using the Abbreviation property:

Add Table Abbrev

Then all I had to do was modify his example to account for the underscore and the fact that the main column text would start at character #3 instead of #4:

{table abbr}_SUBSTR(3,30,FRONT,{ref column})

I input that by going to Properties -> Settings -> Naming Standards -> Templates and then editing the default template:

Set up FK template

Discover FKs!

Now it was just a matter of running the utility. You find that with a right mouse click on the Relational Design node:

Discover FK tool

Next you get the list of candidate FKs:

Create FKs

Note that the utility also suggested some FKs based on the unique constraints (UKs) as well. I did not want those, so I unchecked them before I hit “OK”.

The result was getting all the FKs I wanted added into my model! Viola!

Snowflake with RI

So I can happily report that Philip’s little enhancement works just fine in 4.1.3. WooHoo! I can see this being very useful for a lots of cases in the future.

In a future post (early next year), I will continue with showing how we implemented Referential Integrity constraints in Snowflake DB and if I can generate the DDL from #SQLDevModeler.

Happy New Year Y’all

Kent

The Data Warrior & Snowflake Technical Evangelist

Better Data Modeling: My Top 3 Reasons why you should put Foreign Keys in your Data Warehouse

This question came up at the recent World Wide Data Vault Consortium. Seems there are still many folks who build a data warehouse (or data mart) that do not include FKs in the database.

The usual reason is that it “slows down” load performance.

No surprise there. Been hearing that for years.

And I say one of two things:

1. So what! I need my data to be correct and to come out fast too!

or

2. Show me! How slow is it really?

Keep in mind that while getting the data in quickly is important, so is getting the data out.

Who would you rather have complain – the ETL programmer or the business user trying to run a report?

Yes, it has to be a balance, but you should not immediately dismiss including FKs in your warehouse without considering the options and benefits of those options.

So here are my three main reasons why you should include FK constraints in your Oracle data warehouse database:

  1. The Oracle optimizer uses the constraints to make better decisions on join paths.
  2. Your Data Modeling and BI tools can read the FKs from the data dictionary to create correct joins in the meta data of the tool (SDDM, Erwin, OBIEE, Cognos, Bus Objects can all do this).
  3. It is a good QA check on your ETL. (Yeah, I know… the ETL code is perfect and checks all that stuff, bla, bla, bla)

Now of course there are compromise options. The three main ones are I know:

  1. Drop the constraints at the start of the load then add them back in after the load completes. If any fail to build, that tells you immediately where you may have some data quality problems or your model is wrong (or something else changed).
  2. Build all the constraints as DISABLE NOVALIDATE. This puts them in the database for the BI tools and data modeling tools to see and capture but, since they are not enforced, they put minimal overhead on the load process. And, so I am told by those that know, even a disabled constraint helps the optimizer make a smarter choice on the join path.
  3. (really 2a) Best of both – disable the constraints, load your data, then re-enable the constraints. You get optimization and quality checks.

So NOW what is your reason for not using FKs in your data warehouse?

Happy Modeling!

Kent

Post Navigation

%d bloggers like this: