The Data Warrior

Changing the world, one data model at a time. How can I help you?

Archive for the category “Data Modeling”

Data Vault Training – Europe & US

Lots of training classes on Data Vault 2.0 are on the horizon around the world thanks to a new venture that Dan Linstedt and Michael Olschimke have formed called ScaleFree. In addition to world class consulting, they are offering dozens of Data Vault 2.0 related classes, including Bootcamps, Certification, and Introduction to Data Vault Modeling.

Here are the of the upcoming classes for 2017:

So no excuses in 2017 – there are all these classes plus whatever other classes Dan manages to fit into his schedule.

Go learn Data Vault 2.0 and improve your data warehouse success rate!

Cheers,

Kent

The Data Warrior

P.S. Don’t forget to join use at the 4th Annual WWDVC in beautiful Stowe, Vermont!

wwdvc2017

Advertisements

RI (Referential Integrity) Constraints: 3 Reasons to Include Them in Your Data Warehouse

Over the years, I have had numerous conversations about the value of having referential integrity (RI) constraints, such as primary and foreign keys, in a relational data warehouse or data mart.

Many DBAs object that RI constraints slow the load process. This is a valid point if you are talking about enforced constraints that are checked in real-time during the load. But this is not an issue if you define the constraints as disabled.

Which then leads to this common question:

Is there any reason to maintain a permanently disabled FK in the data model?  If it is not going to be enabled, then from my perspective, it doesn’t make any sense to define the FK.  Instead, the relationship can be described in the comment of the child column.

So, why would I want RI constraints in my data warehouse?

Here are 3 reasons to consider…RI (Referential Integrity) Constraints: 3 Reasons to Include Them in Your Data Warehouse

Model on!

Kent

The Data Warrior

Agile Amped Video – Live from Southern Fried Agile

A few weeks back I had the pleasure of attending and speaking at the Southern Fried Agile event in Charlotte, North Carolina.

It was quite an event with over 700 people in attendance. This is a primarily local event drawing people mostly from the Charlotte area. To say agile is big in that town is an understatement. There were some very large, name brand, companies present. If you are interested in learning more about taking an agile approach to development, I can highly recommend this event which draws excellent speakers (and vendors) from all over. (And the traditional southern fried lunch was pretty darn good too!)

southernfriedlunch

My talk was on applying agile methods to data warehousing projects. I had a great audience of about 50 who wanted to learn how they might approach their DW projects differently.

In addition to giving my talk I got the opportunity to participate in my 2nd Agile Amped interview. This time my interviewer was Leslie from SolutionsIQ. We talked a lot about the rise of agile in the data warehousing space and how new technology solutions like Snowflake’s Cloud-Native Elastic Data Warehouse are helping organizations achieve a more agile deployment cycle.

Here is the entire interview for your viewing pleasure:

If you want to watch more Agile Amped video podcasts you can find them online here.

Enjoy.

Kent

The Data Warrior

Drill to Detail Podcast: Data Modeling, Data Vault, and Snowflake!

My good friend Mark Rittman has embarked on a new adventure as an independent analyst and consultant. As part of his new venture Mark started a new podcast series on iTunes that he calls Drill to Detail where he will feature interviews discussing a range of topics related to data warehousing, business intelligence, analytics, and big data.

I was honored to be asked to take part in this new venture and got to spend a hour with Mark a few weeks back recording what is now Episode 5 of the series. In this interview we talk about:

The podcast is about 60 minutes with each topic being about 20 minutes (so feel free to skip ahead if you are short on time). Please have a listen and let us know what you think in the comments below.

Cheers!

Kent

The Data Warrior

P.S. I will be speaking on these and related topics at a bunch of events over the next few weeks. Check out my speaking schedule and join me in person if you can!

Maintaining disabled FK’s, wisdom or farce?

A while back, I wrote a post about having FKs (foreign keys) in your data warehouse.

Well, a similar question came up recently on an Oracle forum with the above title. It is a fair question and it does surface fairly regularly in a variety of contexts (not just data warehousing).

Of course, as The Data Warrior, I felt is was my duty to respond.

The Question

Is there any reason to maintain a permanently disabled FK in the data model?  I’m not envisioning a reason to do it.  If it is not going to be enabled, then from my perspective, it would not make any sense to have it defined.  If anything, provide the definition of the relationship in the comment of the child column.

My Answer

Yes, by all means keep the FK please!

I see three good reasons for doing so:

  1. It is valuable metadata (& documentation). If somebody reverse engineers the database (say with ERWin or Oracle Data Modeler), the FK shows up in the diagram (way better than having to read a column comment to find out)
    Data Vault 2.0 Example

    A picture is worth a thousand words!

    .

  2. BI Metadata – If you want to use any sort of reporting or BI tool against the database, most tools will import the FK definition with the tables and build the proper join conditions. Way better than having someone guess what the join will be and then manually adding it to the metadata layer in the reporting tool. Examples that can read the Oracle data dictionary include OBIEE, Business Objects, COGNOS, Looker, and many others.(Note here that since the FK is not enforced on the remote databases, you might want to make sure these are treated as outer joins, lest you lose some transaction in the reports).
  3. The Oracle optimizer will use disabled constraints to improve query performance of joins. Again, this is metadata in the data dictionary which the optimizer can read. This is documented in the Oracle Data Warehouse guide and I have validated it on multiple occasions with Oracle product management.

While #3 applies specifically to Oracle, for other databases like MS SQL Server and Snowflake, #1 and #2 still apply.

Even if only one of the above is true for a given database, that, in my opinion, still justifies keeping the disabled constraint around.

Final Answer = Wisdom

What do you think? Feel free to comment below.

And please share on your favorite social media platform!

Model on!

Kent

The Data Warrior

 

Post Navigation

%d bloggers like this: