The Data Warrior

Changing the world, one data model at a time. How can I help you?

Archive for the tag “#techtip”

Schema-on-what? How to model JSON

It seems hard to believe, but all year, around the world, I continue to have this conversation on whether or not we still need data modeling.

I know! Crazy!

Thought we were past that…

As I have said before,

Schema-on-read has the word SCHEMA in it!

So instead of continuing to rant about it, I decided to put together a talk to show people, graphically, what I meant by decomposing, step by step, a few JSON documents into real data models. For the sake of the talk I decided to go with 3NF and Data Vault styles to make my point.

This talk has been very well received so I decided I would share it a bit more publicly by posting it here on my blog.

 

Now that you can see how to model JSON, check out my Snowflake ebook on how to easily analyze JSON using SQL.

If you know any meet-ups or conferences that I should be giving this talk at, please let me know. Or check out my speaking schedule for 2019 and join me at one of the events already on my calendar. (1st up is ITOUG in Milano!)

Ciao!

Kent

The Data Warrior & Chief Evangelist at Snowflake

P.S. There was no magic, or built-in wizard, to creating the models. I did it all by hand using Oracle Sql Developer Data Modeler.

 

Advertisements

Cloud Analytics Conference – London!

Next up on The Data Warrior speaking tour 2017 is the Snowflake Cloud Analytics Conference in London on June 1st!

CloudConference

Snowflake is kicking off this year’s Cloud Analytics City Tour with a blow out event in London, England. This will be a full day workshop style event where you get to hear and learn from industry veterans and thought leaders like myself, and the CEO of Snowflake Computing, Bob Muglia (to name just a few). In addition we will have a Practitioner Panel discussion that includes several of our customers along with other industry thought leaders.

The unique value proposition for this event is that in the afternoon you can choose from two tracks of in depth sessions related to implementing your BI solutions and your data warehouse in the cloud.

I will be presenting my talk Agile Methods and Data Warehousing: How to Deliver Faster. My highly seasoned colleagues from Snowflake (all industry experts) will teach you about loading data in the cloud, deploying BI in the cloud, and how to best use Snowflake to be successful with your cloud analytics program.

And of course there will be food, drinks, and networking.

You can find all the agenda details here along with the registration form. Use discount code DATAWARRIOR for 50% off the registration fee.  Sign up today!

This will be my first time ever in London, so if you are in the area, please come by, say “hi” and learn about the new world of Cloud Analytics.

Until then, cheers!

Kent

The Data Warrior

P.S. I will be in London the day before and after the event, so if you want to have a more detailed or personalized discussion of the benefits of cloud-native data warehousing, please reach out to me at kent.graziano@snowflake.net.

Better Data Modeling: The Book

Trying to be as productive as possible during my infrequent down time, I just published another Kindle book with some of my best tips for Oracle Data Modeler. it is called Better Data Modeling: Tips for Enhancing Your Use of Oracle SQL Developer Data Modeler.

If you are one of the 3.5 million users (or so) who have downloaded this tool, and you want to know my little secrets for getting the most out of SQL Developer Data Modeler (#SQLDevModeler), this book if for you.

If you were an Oracle Designer user and are looking for a replacement data modeling tool, or you are using one of the other mainstream, expensive modeling tools and want a more cost effective alternative, then you owe it to yourself to look at Oracle SQL Developer Data Modeler (SDDM). Oracle Data Modeler has been around for over five years now and is up to version 4.1. It really is an industrial strength data modeling tool that can be used for any data modeling task you need to tackle.

SQL Developer Data Modeler (SDDM) is a fully functional tool provided for FREE by Oracle. It has many features built in that can be leveraged to capture the design of an existing (probably undocumented) database or you can use it to design a new database, even a data warehouse from scratch. There are a load of great features. This book will show you my favorite features along with detailed step by step instructions (with screen shots) on how to use them.

Tips include:

  • How to easily color code your diagrams
  • How to make hundreds of views really fast
  • How to find missing foreign keys
  • How to find missing unique keys
  • How to connect to a SQL Server database (if you must…)

As a bonus, there are two appendices with my run down on common data modeling mistakes and my famous rant on why you need foreign keys in your data warehouse.

So if you don’t use Oracle Data Modeler yet, read my book to see why you should.

If you do use it, I hope this little book will make you even more productive than you already are!

Model on!

Kent

The Data Warrior

P.S. After you read the book, please leave a review on Amazon to help other folks decide if the book is for them.

Better Data Modeling: My Top 3 Reasons why you should put Foreign Keys in your Data Warehouse

This question came up at the recent World Wide Data Vault Consortium. Seems there are still many folks who build a data warehouse (or data mart) that do not include FKs in the database.

The usual reason is that it “slows down” load performance.

No surprise there. Been hearing that for years.

And I say one of two things:

1. So what! I need my data to be correct and to come out fast too!

or

2. Show me! How slow is it really?

Keep in mind that while getting the data in quickly is important, so is getting the data out.

Who would you rather have complain – the ETL programmer or the business user trying to run a report?

Yes, it has to be a balance, but you should not immediately dismiss including FKs in your warehouse without considering the options and benefits of those options.

So here are my three main reasons why you should include FK constraints in your Oracle data warehouse database:

  1. The Oracle optimizer uses the constraints to make better decisions on join paths.
  2. Your Data Modeling and BI tools can read the FKs from the data dictionary to create correct joins in the meta data of the tool (SDDM, Erwin, OBIEE, Cognos, Bus Objects can all do this).
  3. It is a good QA check on your ETL. (Yeah, I know… the ETL code is perfect and checks all that stuff, bla, bla, bla)

Now of course there are compromise options. The three main ones are I know:

  1. Drop the constraints at the start of the load then add them back in after the load completes. If any fail to build, that tells you immediately where you may have some data quality problems or your model is wrong (or something else changed).
  2. Build all the constraints as DISABLE NOVALIDATE. This puts them in the database for the BI tools and data modeling tools to see and capture but, since they are not enforced, they put minimal overhead on the load process. And, so I am told by those that know, even a disabled constraint helps the optimizer make a smarter choice on the join path.
  3. (really 2a) Best of both – disable the constraints, load your data, then re-enable the constraints. You get optimization and quality checks.

So NOW what is your reason for not using FKs in your data warehouse?

Happy Modeling!

Kent

Better Data Modeling: Finding Missing Unique Keys in Oracle #SQLDevModeler

One of the best practices I recommend is to always define unique business keys for every entity (or table) in a model.

It is the only way to really understand what the data in that object represents.

So what do you do when you inherit someone else’s model with hundreds of tables and few (if any) unique keys to be found?

After you reverse engineer it into SDDM (SQL Developer Data Modeler), you could go through the model table by table and look at the properties.

Or, you could look at all the diagrams to look for the the little U’s indicating a column is part of a unique key constraint (assuming there are any diagrams to look at).

Or you could create a Custom Design Rule that checks for you.

So how do you write a design rule that will list all tables with no UKs on them?

Open your design, the go to Tools -> Design Rules -> Custom Rules.

  1. Hit the green Plus sign to add a new rule.
  2. Give it a name (like Missing UKs),
  3. Select Table for the object type,
  4. Mozilla Rhino for the Engine,
  5. Warning for the type, and
  6. Select table as the variable
  7. Past in this code: 
function checkUKs(table){
ruleMessage=””;
if(table.getUKeys().size() == 0){
  ruleMessage=”no UKs”;
  errType=”Problem:”;
  return false;
} else {
  return true;
}
}
checkUKs(table);

Hit Save, then Apply.

The result will be a list of all the tables in your design that do not have any Unique Key Constraints defined.

Now the real work begins – fixing those tables! As you work your way through the model adding the new business keys, you can keep using this report to see which ones you have left, and make sure you don’t miss any.

Get to it my friends!

Kent

The Oracle Data Warrior

P.S. Special thanks to DimitarSlavov  of Oracle for posting the code to answer my question. If you want to see the whole thread go here.

Post Navigation

%d bloggers like this: