Archive for the tag “#BetterDataModeling”

18 Oct 2016

Drill to Detail Podcast: Data Modeling, Data Vault, and Snowflake!

My good friend Mark Rittman has embarked on a new adventure as an independent analyst and consultant. As part of his new venture, Mark started a new podcast series on iTunes that he calls Drill to Detail, where he will feature interviews discussing a range of topics related to data warehousing, business intelligence, analytics, and big data.

I was honored to be asked to take part in this new venture and got to spend an hour with Mark a few weeks back recording what is now Episode 5 of the series. In this interview we talk about:

The need and relevance of data modeling in the big data world (which I wrote about recently)
The Snowflake Elastic Data Warehouse and how it is changing the world of cloud data warehousing
The Data Vault modeling methodology and how it helps organizations be agile

The podcast is about 60 minutes with each topic being about 20 minutes (so feel free to skip ahead if you are short on time). Please have a listen and let us know what you think in the comments below.

Cheers!

Kent

The Data Warrior

P.S. I will be speaking on these and related topics at a bunch of events over the next few weeks. Check out my speaking schedule and join me in person if you can!

31 Aug 2016

Maintaining disabled FK’s, wisdom or farce?

A while back, I wrote a post about having FKs (foreign keys) in your data warehouse.

Well, a similar question came up recently on an Oracle forum with the above title. It is a fair question and it does surface fairly regularly in a variety of contexts (not just data warehousing).

Of course, as The Data Warrior, I felt it was my duty to respond.

The Question

Is there any reason to maintain a permanently disabled FK in the data model? I’m not envisioning a reason to do it. If it is not going to be enabled, then from my perspective, it would not make any sense to have it defined. If anything, provide the definition of the relationship in the comment of the child column.

My Answer

Yes, by all means keep the FK please!

I see three good reasons for doing so:

1. It is valuable metadata (& documentation). If somebody reverse engineers the database (say with ERWin or Oracle Data Modeler), the FK shows up in the diagram (way better than having to read a column comment to find out). A picture is worth a thousand words!
2. BI metadata. If you want to use any sort of reporting or BI tool against the database, most tools will import the FK definition with the tables and build the proper join conditions. That is way better than having someone guess what the join will be and then manually add it to the metadata layer in the reporting tool. Examples that can read the Oracle data dictionary include OBIEE, Business Objects, COGNOS, Looker, and many others. (Note here that since the FK is not enforced on the remote databases, you might want to make sure these are treated as outer joins, lest you lose some transactions in the reports.)
3. The Oracle optimizer will use disabled constraints to improve query performance of joins. Again, this is metadata in the data dictionary which the optimizer can read. This is documented in the Oracle Data Warehouse guide and I have validated it on multiple occasions with Oracle product management.

While #3 applies specifically to Oracle, for other databases like MS SQL Server and Snowflake, #1 and #2 still apply.

Even if only one of the above is true for a given database, that, in my opinion, still justifies keeping the disabled constraint around.
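
To make the BI metadata point concrete, here is a minimal sketch in Python. It uses SQLite as a stand-in (not Oracle, and not any particular BI tool), with enforcement switched off to play the role of a disabled constraint. The table and column names are borrowed from the CUSTOMER/ORDER example elsewhere on this blog; the sample rows and the `join_conditions` helper are assumptions made up for illustration.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in; this is not Oracle.
conn = sqlite3.connect(":memory:")
# Keep FK enforcement off so the constraint is declared but never checked.
conn.execute("PRAGMA foreign_keys = OFF")
conn.executescript("""
    CREATE TABLE customer (c_custkey INTEGER PRIMARY KEY, c_name TEXT);
    CREATE TABLE orders (
        o_orderkey INTEGER PRIMARY KEY,
        o_custkey  INTEGER REFERENCES customer (c_custkey)
    );
    INSERT INTO customer VALUES (1, 'Acme');
    INSERT INTO orders VALUES (100, 1);
    INSERT INTO orders VALUES (101, 999);  -- orphan row, accepted because the FK is not enforced
""")

def join_conditions(conn, child):
    """Build join predicates for `child` from the declared FK metadata."""
    rows = conn.execute(f"PRAGMA foreign_key_list({child})").fetchall()
    # Row layout: (id, seq, parent_table, child_column, parent_column, ...)
    return [f"{child}.{r[3]} = {r[2]}.{r[4]}" for r in rows]

# The unenforced FK is still visible as metadata.
conditions = join_conditions(conn, "orders")
print(conditions)

# Outer join so the orphan order still shows up in the report.
report = conn.execute(
    "SELECT o.o_orderkey, c.c_name FROM orders o "
    "LEFT OUTER JOIN customer c ON o.o_custkey = c.c_custkey "
    "ORDER BY o.o_orderkey"
).fetchall()
print(report)
```

The declared relationship can be read back and turned into a join condition even though nothing enforces it, and the outer join keeps the order whose customer does not exist. An inner join would silently drop that row.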

Final Answer = Wisdom

What do you think? Feel free to comment below.

And please share on your favorite social media platform!

Model on!

Kent

The Data Warrior

Posted in Data Modeling and tagged #BetterDataModeling, #DataWarrior, #SQLDevModeler, best practice, data model, data model design, Data Modeling, Data Warehouse, foreign keys, from the author of A Checklist for Doing Data Model Design Reviews, Oracle

25 Jan 2016

Are You Certifiable? 1st #DataVault 2.0 Bootcamp of the Year

A quick note for all the folks out there that have been contemplating diving deep into Dan Linstedt’s Data Vault 2.0 System of Business Intelligence.

Dan will be teaching a Data Vault 2.0 Bootcamp in February! You can sign up here.

You’ve read the articles, read the blog posts (mine included), attended the talks at the conferences, maybe even read the Super Charge book…

Are you done trying to figure it out on your own?

Ready to not only learn how to do it right, but get certified as a Data Vault 2.0 Practitioner?

Well let’s get 2016 off to a great start and attend the 1st Data Vault 2.0 Bootcamp of 2016 in beautiful St. Albans, Vermont, taught by none other than the inventor of Data Vault, my good friend Dan Linstedt.

You could of course just buy the new book, and try it out on your own…

But if you are like me, you do much better when you can interact, face-to-face with a qualified instructor, ask the hard questions, and get the insights that will make you truly successful.

So why not invest in yourself and your future success? Go sign up now.

As an added incentive, Dan has added some brand new material.

NEW TOPICS

Dan will be discussing DV2 on Hive / Hadoop: the benefits, the pros and cons, and some suggestions on how to build it and leverage it properly. He will be talking about Satellites on HDFS, and Hubs & Links on Hive. He will discuss data modeling implications, and using SerDe definitions at query time. This is the first time ever that this information will be presented in the DV2 class!

Make the commitment to a great 2016 now and go sign up before the class fills up. If you sign up before February 1st, you can save over $400!

To your success!

Kent

The Data Warrior

Data Vault Master and CDVP2

P.S. For you skiers, St. Albans is a short drive to both Stowe and Smugglers’ Notch, both great East Coast ski areas, and with the snow they just got, the skiing will be epic. Go take the class, then reward yourself with a little weekend ski trip.

Posted in Data Modeling, Data Vault, Data Warehouse and tagged #AgileDataWarehouse, #BetterDataModeling, #datavault, @dlinstedt, agile, agile data vault, bootcamp, DAD, Data Warehouse, data warehouse design

14 Jan 2016

Better Data Modeling: Customizing Oracle SQL Developer Data Modeler (#SQLDevModeler) to Support Custom Data Types

On a recent customer call (for Snowflake), the data architects were asking if Snowflake provided a data model diagramming tool to design and generate data warehouse tables or to view a data model of an existing Snowflake data warehouse, or if we knew of any that would work with Snowflake.

Well, we do not provide one of our own – our service is the Snowflake Elastic Data Warehouse (#ElasticDW).

The good news is that there are data modeling tools in the broader ecosystem that you can of course use (since we are ANSI SQL compliant).

…

If you have read my previous posts on using JSON within Snowflake, you also know that we have a new data type called VARIANT for storing semi-structured data like JSON, Avro, and XML.

In this post I will bring it together and show you the steps to customize SDDM to allow you to model and generate table DDL that contains columns that use the VARIANT data type.

Read the details of how I did it here on my Snowflake blog:

Snowflake SQL: Customizing Oracle Sql Developer Data Modeler (SDDM) to Support Snowflake VARIANT – Snowflake

Enjoy!

Kent

The Data Warrior

P.S. If you are in Austin, Texas this weekend, I will be speaking at Data Day Texas (#DDTX16). Snowflake will have a booth there too, so come on by and say howdy!

Posted in Data Modeling, Data Warehouse, SnowflakeDB, SQL Developer Data Modeler and tagged #BetterDataModeling, #SnowflakeDB, #SQLDevModeler, @SnowflakeDB, data model design, Data Modeling, Data Warehouse, data warehouse design, data warehousing, from the author of A Checklist for Doing Data Model Design Reviews, Oracle SQL Developer Data Modeler

29 Dec 2015

Better Data Modeling: Discovering Foreign Keys (FK) in #SQLDevModeler (SDDM)

A while back I had an interesting situation when I was attempting to reverse engineer and document a database for a client.

I had a 3rd party database that had PKs defined on every table but no FKs in the database. The question I posed (on the Data Modeler Forum) was:

How do I get the FK Discover utility to find FK columns with this type of pattern:

Parent PK column = TABCUSTNUM

Child FK column = ABCCUSTNUM

So the root column name (CUSTNUM) is standard but in every table the column name has a different 3 character “prefix” that is effectively the table short name. Is there a way to get the utility to ignore the first three characters of the column names?

This was in SDDM 4.1.873.

No easy answer.

Well, the ever helpful Philip was very kind and wrote me a slick custom Transformation Script that did the trick! (Check the post if you want to see the code.)
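
For readers who want a feel for the matching problem itself, here is a minimal sketch in Python. To be clear, this is not Philip's Transformation Script and it does not use the SDDM scripting API; it only illustrates the idea of ignoring the first three characters when comparing column names. The table short names (TAB, ABC) and the extra columns are hypothetical.

```python
def root_name(column, prefix_len=3):
    """Drop the table short-name prefix, e.g. TABCUSTNUM -> CUSTNUM."""
    return column[prefix_len:]

def candidate_fks(pk_columns, table_columns, prefix_len=3):
    """pk_columns: {table: pk_column}; table_columns: {table: [columns]}.

    Returns (child_table, child_column, parent_table, parent_column) tuples
    for every non-PK column whose root name matches a PK root name.
    """
    pk_roots = {root_name(col, prefix_len): (tab, col)
                for tab, col in pk_columns.items()}
    found = []
    for child, cols in table_columns.items():
        for col in cols:
            parent = pk_roots.get(root_name(col, prefix_len))
            # Skip the PK column matching itself in its own table.
            if parent and parent[0] != child:
                found.append((child, col, parent[0], parent[1]))
    return found

# Hypothetical tables: TAB owns the key, ABC refers to it.
pks = {"TAB": "TABCUSTNUM"}
columns = {"TAB": ["TABCUSTNUM", "TABNAME"],
           "ABC": ["ABCORDNUM", "ABCCUSTNUM"]}
matches = candidate_fks(pks, columns)
print(matches)
```

A real script would also need to handle composite keys and data type checks, which this sketch leaves out.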

But wait there’s more!

In his response he mentioned a feature coming in 4.1.888 – the ability to include a table prefix as part of a FK column naming template (just like this app had done).

Cool, I thought, but how does that help?

Well, with the template in place, it turns out that you can have the FK Discovery utility search based on the Naming Template model rather than just look for exact matching column names.

Using the Custom Naming Template

So recently (today in fact) I was trying to add FKs to the Snowflake DB model I reverse engineered a few weeks back (Jeff pointed out they were missing). I noticed the model had that pattern of a prefix on both the FK and PK column names.

In the CUSTOMER table the PK is C_CUSTKEY. In the ORDER table the matching FK column is O_CUSTKEY. Nice simple pattern (see the diagram below for more). That reminded me of the previous issue and Philip’s script.

Snowflake Schema

Off to OTN to find that discussion and refresh my memory.

In the post, Philip had posted an example of a template that fit my previous problem:

{table abbr}SUBSTR(4,30,FRONT,{ref column})

With the note that {table abbr} would be the equivalent of what I called the table prefix. So first I went to the table properties and put in the prefixes using the Abbreviation property:

Add Table Abbrev

Then all I had to do was modify his example to account for the underscore and the fact that the main column text would start at character #3 instead of #4:

{table abbr}_SUBSTR(3,30,FRONT,{ref column})

I input that by going to Properties -> Settings -> Naming Standards -> Templates and then editing the default template:

Set up FK template
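
To show what a template like this works out to, here is a small Python sketch. It reflects one reading of the template as described in this post (take up to 30 characters of the referenced column starting at character 3, then put the child table's abbreviation and an underscore in front); it is not SDDM's actual implementation. The function names, the dictionary layout, and the O_ORDERKEY and C_NAME columns are assumptions for illustration.

```python
def expected_fk_name(table_abbr, ref_column, start=3, length=30, separator="_"):
    """Approximate {table abbr}_SUBSTR(3,30,FRONT,{ref column}).

    `start` is 1-based, matching the "character #3" wording in the post.
    """
    return table_abbr + separator + ref_column[start - 1:start - 1 + length]

def discover_fks(tables):
    """tables: {name: {"abbr": str, "pk": str, "columns": [str]}}.

    For each parent PK, compute the FK column name each other table would
    use under the template and report the tables that have that column.
    """
    found = []
    for parent, p in tables.items():
        for child, c in tables.items():
            if child == parent:
                continue
            wanted = expected_fk_name(c["abbr"], p["pk"])
            if wanted in c["columns"]:
                found.append((child, wanted, parent, p["pk"]))
    return found

# Abbreviations as entered in the table properties; extra columns are made up.
model = {
    "CUSTOMER": {"abbr": "C", "pk": "C_CUSTKEY",
                 "columns": ["C_CUSTKEY", "C_NAME"]},
    "ORDER": {"abbr": "O", "pk": "O_ORDERKEY",
              "columns": ["O_ORDERKEY", "O_CUSTKEY"]},
}
fks = discover_fks(model)
print(fks)
```

Calling `expected_fk_name("ABC", "TABCUSTNUM", start=4, separator="")` covers the earlier pattern with a three-character prefix and no underscore.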

Discover FKs!

Now it was just a matter of running the utility. You find that with a right mouse click on the Relational Design node:

Discover FK tool

Next you get the list of candidate FKs:

Create FKs

Note that the utility also suggested some FKs based on the unique constraints (UKs) as well. I did not want those, so I unchecked them before I hit “OK”.

The result was getting all the FKs I wanted added into my model! Voilà!

Snowflake with RI

So I can happily report that Philip’s little enhancement works just fine in 4.1.3. WooHoo! I can see this being very useful in a lot of cases in the future.

In a future post (early next year), I will continue by showing how we implemented Referential Integrity constraints in Snowflake DB and whether I can generate the DDL from #SQLDevModeler.

Happy New Year Y’all

Kent

The Data Warrior & Snowflake Technical Evangelist

Posted in Data Modeling, SnowflakeDB, SQL Developer Data Modeler and tagged #BetterDataModeling, #SnowflakeDB, #SQLDevModeler, @SnowflakeDB, @thatjeffsmith, data model design, Discover FK, FKs, Oracle Data Modeler, Oracle SQL Developer Data Modeler, SQL Developer Data Modeler
