The Data Warrior

Changing the world, one data model at a time. How can I help you?

Archive for the tag “#SQLDevModeler”

Better Data Modeling: The Oracle Data Warrior Speaks!

Looks like I will be doing a bit of speaking this year at a number of  events around the country, and later on, the globe (more on that later).

As usually all my talks will center around using SQL Developer Data Modeler, data modeling standards, Data Vault, agile, or a combination of all of the above.

If you have budget and time, please come out to at least one of these events this year, I would love to meet you in person and talk about the world of Oracle and Data Modeling.

If you aren’t planning to attend one of these – WHY NOT?

These are all great events with tons of learning opportunities. The networking alone is worth the price of admission.

Here is a list of the first three events confirmed on my calendar (and SURPRISE – they are NOT all Oracle related events):

RMOUG Training Days

In less than two weeks: The Rocky Mountain Oracle Users Group Training Days 2014 in Denver, Colorado. This runs from Feb 5-7 , will have at least 1,000 people and you cannot beat the price.I will be presenting Friday at 1:30PM on how I save my clients big $$ by applying repeatable processes and standards to my data models.

Follow it on twitter with #RMTD14.

Data Vault Consortium

Next up March 20 – 22nd , I will be participating in the 1st ever World Wide Data Vault Consortium and User Group meetup in beautiful northern Vermont near the home of my good friend, the inventor of the The Data Vault Model and Methodology, Dan Linstedt. I will be speaking about agile and data warehousing, using SDDM to do Data Vault modeling, and no doubt engaging in some lively debates with Data Vault experts from around the globe. Check out the agenda on the event page for more details on who will be speaking (hint: Bill Inmon, father of  data warehousing is participating!).

Enterprise Data World 2014

The #EDW14 event is really the annual conference put on by DAMA International and the speaker list is a veritable who’s-who of the data architecture and modeling world. This year the event is in Austin, Texas on April 27 – May 1. Since that is quite close to where I live, I figured I would submit an abstract and I was honored to be accepted. I have attended this event only once before when it was in Denver (a long time ago!) and have been a member of DAMA on and off for years, but this is the first time I have been asked to speak. I am looking forward to it for sure (not sure how I will fit my talk into a 45 minute slot!). Sign up for it here.

If you are planning to attend any of these, drop me a line over Twitter or LinkedIn so we can plan to meet up.

Later.

Kent

The Oracle Data Warrior

Better Data Modeling: Color Code Your Data Model Diagrams using #SQLDevModeler

One of the standards I recommend in my book  Check List for Doing Data Model Design Reviews is to use color in your diagrams to visually differentiate types of entities or tables.

As luck would have it, Oracle SQL Developer Data Modeler has a feature that makes this very easy. It is Classification Types.

In the latest version. 4.0, you set these up by going to the context menu on the Design level. From that menu pick Properties. Once on the property dialog go to Settings -> Diagram -> Classification Types. (In 3.x look under Tools -> Preferences)

The default install comes with a bunch already – fact, dimensions, logging, summary, and temporary. Each has a pre-set color assigned. You can change that color by clicking on the color and selecting another option from the pallet. You can also set a prefix for each type. (Note – if you are already using a classification and change the color, when you hit apply the new color will be applied in all existing diagrams within the design.)

You add new types by clicking the green plus (+) sign and then just add in whatever you want and save.

For Data Vault modeling, I add three new types: Hub, Link, and Satellite with the colors you see in the screen shot here.

Using Classification Types to Color Code Your Diagrams

Using Classification Types to Color Code Your Diagrams

To apply a classification type to an existing table, open the table property dialog and look for the classification types node in the tree (in 4.0). In 3.x, there is a simple classification type drop down on the main property page.

Once applied, the first letter of the classification type will appear in the upper left corner of the table (see screen shot).

Another way I have used this recently was in my current data warehouse project where I have source, stage, and dimensional tables all in one design. I found I often want to show all three tiers in on diagram (sub view) for a sprint (we are using a SCRUM approach) to help the ETL programmers and QA folks have one place to go where they can see how these layers are related. So for this project, I also added a source and stage classification type.

So if you have been color coding you diagrams by hand, this tip should save you a bunch of time since you won’t have to pick the colors by hand on each table. Plus the color selection will be more consistent.

If you aren’t color coding, now would be a great time to start!

Bonus Tip: If, like me, you want to be consistent across all your designs with the types and colors, I just figured out I can hack the dl_settings.xml file to copy my classification type customizations from one design to another. Just be sure to exit and then restart SDDM after you update the file for it to take effect.

Have fun coloring your diagram! (Maybe more people will read them)

Kent

The Oracle Data Warrior

 

Better Data Modeling: The best FREE data modeling tool just got better!

Yes, it true Virginia, the is a Santa Claus!

And this year Santa brought you a new, improved version of the best FREE data modeling tool in the known universe: Oracle SQL Developer Data Modeler 4.0.

The team at Oracle went all out this year and produced three (yes three) pre-release versions to make sure all the fixes and new features were rock solid before they called it production.

That is a lot of testing and work.

But worth the effort – they fixed piles of bugs and added dozens of new features.

Oracle product manager Jeff Smith (@thatjeffsmith) has already published several articles highlighting his favorite new features. Check out what he has to say here then go download the new version and give it try.

Let me know what you your favorite new feature is.

Merry Christmas!

The Oracle Christmas Elf

(Kent)

P.S. Without proven methods and standards, even the best tool will not insure you build the best model, so why not increase your chances by giving yourself the gift of knowledge by picking up a copy of my data model checklist book (on sale for a few more hours).

 

Agile Data Warehouse Modeling: How to Build a Virtual Type 2 Slowly Changing Dimension

One of the ongoing complaints about many data warehouse projects is that they take too long to delivery. This is one of the main reasons that many of us have tried to adopt methods and techniques (like SCRUM) from the agile software world to improve our ability to deliver data warehouse components more quickly.

So, what activity takes the bulk of development time in a data warehouse project?

Writing (and testing) the ETL code to move and transform the data can take up to 80% of the project resources and time.

So if we can eliminate, or at least curtail, some of the ETL work, we can deliver useful data to the end user faster.

One way to do that would be to virtualize the data marts.

For several years Dan Linstedt and I have discussed the idea of building virtual data marts on top of a Data Vault modeled EDW.

In the last few years I have floated the idea among the Oracle community. Fellow Oracle ACE Stewart Bryson and I even created a presentation this year (for #RMOUG and #KScope13) on how to do this using the Business Model (meta-layer) in OBIEE (It worked great!).

While doing this with a BI tool is one approach, I like to be able to prototype the solution first using Oracle views (that I build in SQL Developer Data Modeler of course).

The approach to modeling a Type 1 SCD this way is very straight forward.

How to do this easily for a Type 2 SCD has evaded me for years, until now.

Building a Virtual Type 2 SCD (VSCD2)

So how to create a virtual type 2 dimension (that is “Kimball compliant” ) on a Data Vault when you have multiple Satellites on one Hub?

(NOTE: the next part assumes you understand Data Vault Data Modeling. if you don’t, start by reading my free white paper, but better still go buy the Data Vault book on LearnDataVault.com)

Here is how:

Build an insert only PIT (Point-in-Time) table that keeps history. This is sometimes referred to as a historicized PIT tables.  (see the Super Charge book for an explanation of the types of PIT tables)

Add a surrogate Primary Key (PK) to the table. The PK of the PIT table will then serve as the PK for the virtual dimension. This meets the standard for classical star schema design to have a surrogate key on Type 2 SCDs.

To build the VSCD2 you now simply create a view that uses the PIT table to join the Hub and all the Satellites together. Here is an example:

Create view Dim2_Customer (Customer_key, Customer_Number, Customer_Name, Customer_Address, Load_DTS)
as
Select sat_pit.pit_seq, hub.customer_num, sat_1.name, sat_2.address, sat_pit.load_dts
from HUB_CUST hub,        
          SAT_CUST_PIT sat_pit,        
          SAT_CUST_NAME sat_1,        
          SAT_CUST_ADDR sat_2
where  hub.CSID = sat_pit.CSID           
    and hub.CSID = sat_1.CSID           
    and hub.CSID = sat_2.CSID           
    and sat_pit.NAME_LOAD_DTS = sat_1.LOAD_DTS           
    and sat_pit.ADDRESS_LOAD_DTS = sat_2.LOAD_DTS 
 

Benefits of a VSCD2

  1. We can now rapidly demonstrate the contents of a type 2 dim prior to ETL programming
  2. With using PIT tables we don’t need the Load End DTS on the Sats so the Sats become insert only as well (simpler loads, no update pass required)
  3. Another by product is the Sat is now also Hadoop compliant (again insert only)
  4. Since the nullable Load End DTS is not needed, you can now more easily partition the Sat table by Hub Id and Load DTS.

Objections

The main objection to this approach is that the virtual dimension will perform very poorly. While this may be true for very high volumes, or on poorly tuned or resourced databases, I maintain that with today’s evolving hardware appliances  (e.g., Exadata, Exalogic) and the advent of in memory databases, these concerns will soon be a thing of the past.

UPDATE 26-May-2018  – Now 5 years later I have successfully done the above on Oracle. But now we also have Snowflake elastic cloud data warehouse where all the prior constraints are indeed eliminated. With Snowflake you can now easily chose to instantly add compute power if the view is too slow or do the work and processing to materialize the view. (end update)

Worst case, after you have validated the data with your users, you can always turn it into a materialized view or a physical table if you must.

So what do you think? Have you ever tried something like this? Let me know in the comments.

Get virtual, get agile!

Kent

The Data Warrior

P.S. I am giving a talk on Agile Data Warehouse Modeling at the East Coast Oracle Conference this week. If you are there, look me up and we can discuss this post in person!

Better Data Modeling: New and Improved Oracle SQL Developer Data Modeler (#SQLDevModeler)

Yup, my friends at Oracle have been hard at working enhancing what was already the best FREE data modeling tool out there.

They just released SDDM R4 EA3! You can go get it right now: http://www.oracle.com/technetwork/developer-tools/datamodeler/downloads/datamodeler-4ea-downloads-1988443.html

As always there are both new features and bug fixes.

One of the coolest new features is the ability to show entity (or table) comments right on the diagram in the object. This will be very useful for enabling data model reviews with the business users.

Product manager Ashley tweeted and example the other day:

 

For even more details and ideas how to use this feature check out Jeff Smith’s post on the feature here.

So what are you waiting for? Go get it today!

Data Modeling is Fun!

Later

Kent
The Oracle Data Warrior

Post Navigation