The Data Warrior

Changing the world, one data model at a time. How can I help you?

Archive for the tag “Data Vault”

Better Data Modeling: The Oracle Data Warrior Speaks!

Looks like I will be doing a bit of speaking this year at a number of events around the country and, later on, around the globe (more on that later).

As usual, all my talks will center on SQL Developer Data Modeler, data modeling standards, Data Vault, agile, or a combination of the above.

If you have budget and time, please come out to at least one of these events this year. I would love to meet you in person and talk about the world of Oracle and Data Modeling.

If you aren’t planning to attend one of these – WHY NOT?

These are all great events with tons of learning opportunities. The networking alone is worth the price of admission.

Here is a list of the first three events confirmed on my calendar (and SURPRISE – they are NOT all Oracle related events):

RMOUG Training Days

In less than two weeks: The Rocky Mountain Oracle Users Group Training Days 2014 in Denver, Colorado. This runs from Feb 5-7, will have at least 1,000 people, and you cannot beat the price. I will be presenting Friday at 1:30 PM on how I save my clients big $$ by applying repeatable processes and standards to my data models.

Follow it on twitter with #RMTD14.

Data Vault Consortium

Next up, on March 20-22, I will be participating in the first-ever World Wide Data Vault Consortium and User Group meetup in beautiful northern Vermont, near the home of my good friend Dan Linstedt, the inventor of the Data Vault Model and Methodology. I will be speaking about agile and data warehousing, using SDDM to do Data Vault modeling, and no doubt engaging in some lively debates with Data Vault experts from around the globe. Check out the agenda on the event page for more details on who will be speaking (hint: Bill Inmon, father of data warehousing, is participating!).

Enterprise Data World 2014

The #EDW14 event is really the annual conference put on by DAMA International, and the speaker list is a veritable who’s-who of the data architecture and modeling world. This year the event is in Austin, Texas on April 27 – May 1. Since that is quite close to where I live, I figured I would submit an abstract, and I was honored to be accepted. I have attended this event only once before, when it was in Denver (a long time ago!), and have been a member of DAMA on and off for years, but this is the first time I have been asked to speak. I am looking forward to it for sure (not sure how I will fit my talk into a 45-minute slot!). Sign up for it here.

If you are planning to attend any of these, drop me a line over Twitter or LinkedIn so we can plan to meet up.

Later.

Kent

The Oracle Data Warrior

Agile Data Warehouse Modeling: How to Build a Virtual Type 2 Slowly Changing Dimension

One of the ongoing complaints about many data warehouse projects is that they take too long to deliver. This is one of the main reasons that many of us have tried to adopt methods and techniques (like SCRUM) from the agile software world to improve our ability to deliver data warehouse components more quickly.

So, what activity takes the bulk of development time in a data warehouse project?

Writing (and testing) the ETL code to move and transform the data can take up to 80% of the project resources and time.

So if we can eliminate, or at least curtail, some of the ETL work, we can deliver useful data to the end user faster.

One way to do that would be to virtualize the data marts.

For several years Dan Linstedt and I have discussed the idea of building virtual data marts on top of a Data Vault modeled EDW.

In the last few years I have floated the idea among the Oracle community. Fellow Oracle ACE Stewart Bryson and I even created a presentation this year (for #RMOUG and #KScope13) on how to do this using the Business Model (meta-layer) in OBIEE (It worked great!).

While doing this with a BI tool is one approach, I like to be able to prototype the solution first using Oracle views (that I build in SQL Developer Data Modeler of course).

The approach to modeling a Type 1 SCD this way is very straightforward.
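As a rough illustration of the Type 1 case, a view might simply join the Hub to the most recent row of each Satellite. This is only a sketch: it borrows the table and column names from the Type 2 example later in this post, the view name Dim1_Customer is hypothetical, and using the Hub key as the dimension key is an assumption.

```sql
-- Sketch of a virtual Type 1 dimension: one row per customer, latest values only.
-- Table/column names are taken from the Type 2 example in this post;
-- Dim1_Customer and the choice of CSID as the dimension key are assumptions.
CREATE OR REPLACE VIEW Dim1_Customer
  (Customer_Key, Customer_Number, Customer_Name, Customer_Address)
AS
SELECT hub.CSID,
       hub.customer_num,
       sat_1.name,
       sat_2.address
FROM   HUB_CUST hub
JOIN   SAT_CUST_NAME sat_1 ON sat_1.CSID = hub.CSID
JOIN   SAT_CUST_ADDR sat_2 ON sat_2.CSID = hub.CSID
-- Keep only the newest row from each Satellite for the customer.
WHERE  sat_1.LOAD_DTS = (SELECT MAX(n.LOAD_DTS)
                         FROM   SAT_CUST_NAME n
                         WHERE  n.CSID = hub.CSID)
AND    sat_2.LOAD_DTS = (SELECT MAX(a.LOAD_DTS)
                         FROM   SAT_CUST_ADDR a
                         WHERE  a.CSID = hub.CSID);
```

Because these are inner joins, a customer with no row in one of the Satellites would not appear in the view; outer joins would be needed if that can happen in your data.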

How to do this easily for a Type 2 SCD has evaded me for years, until now.

Building a Virtual Type 2 SCD (VSCD2)

So how do you create a virtual Type 2 dimension (that is “Kimball compliant”) on a Data Vault when you have multiple Satellites on one Hub?

(NOTE: the next part assumes you understand Data Vault Data Modeling. If you don’t, start by reading my free white paper, but better still, go buy the Data Vault book on LearnDataVault.com.)

Here is how:

Build an insert-only PIT (Point-in-Time) table that keeps history. This is sometimes referred to as a historicized PIT table (see the Super Charge book for an explanation of the types of PIT tables).

Add a surrogate Primary Key (PK) to the table. The PK of the PIT table will then serve as the PK for the virtual dimension. This meets the standard for classical star schema design to have a surrogate key on Type 2 SCDs.

To build the VSCD2 you now simply create a view that uses the PIT table to join the Hub and all the Satellites together. Here is an example:

CREATE VIEW Dim2_Customer
  (Customer_key, Customer_Number, Customer_Name, Customer_Address, Load_DTS)
AS
SELECT sat_pit.pit_seq,       -- surrogate key from the PIT table
       hub.customer_num,
       sat_1.name,
       sat_2.address,
       sat_pit.load_dts
FROM   HUB_CUST hub
JOIN   SAT_CUST_PIT sat_pit
  ON   sat_pit.CSID = hub.CSID
JOIN   SAT_CUST_NAME sat_1
  ON   sat_1.CSID = hub.CSID
 AND   sat_1.LOAD_DTS = sat_pit.NAME_LOAD_DTS
JOIN   SAT_CUST_ADDR sat_2
  ON   sat_2.CSID = hub.CSID
 AND   sat_2.LOAD_DTS = sat_pit.ADDRESS_LOAD_DTS;
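For readers who want to try this, the view above implies a PIT table shaped roughly like the one below. The column names come from the view; the data types, constraint names, sequence name, and the load statement are assumptions made for illustration, not a prescribed Data Vault loading pattern.

```sql
-- Hypothetical DDL for the historicized PIT table behind Dim2_Customer.
-- Data types, constraint names, and SAT_CUST_PIT_SEQ are assumptions.
CREATE SEQUENCE SAT_CUST_PIT_SEQ;

CREATE TABLE SAT_CUST_PIT (
  PIT_SEQ           NUMBER     NOT NULL,  -- surrogate PK; surfaces as Customer_key
  CSID              NUMBER     NOT NULL,  -- Hub key
  LOAD_DTS          TIMESTAMP  NOT NULL,  -- when this PIT row was recorded
  NAME_LOAD_DTS     TIMESTAMP  NOT NULL,  -- LOAD_DTS of the SAT_CUST_NAME row in effect
  ADDRESS_LOAD_DTS  TIMESTAMP  NOT NULL,  -- LOAD_DTS of the SAT_CUST_ADDR row in effect
  CONSTRAINT SAT_CUST_PIT_PK PRIMARY KEY (PIT_SEQ)
);

-- Insert-only load (a sketch): add one PIT row for every Hub key whose current
-- combination of Satellite rows has not been recorded yet. Nothing is updated.
INSERT INTO SAT_CUST_PIT
  (PIT_SEQ, CSID, LOAD_DTS, NAME_LOAD_DTS, ADDRESS_LOAD_DTS)
SELECT SAT_CUST_PIT_SEQ.NEXTVAL,
       cur.CSID,
       SYSTIMESTAMP,
       cur.NAME_LOAD_DTS,
       cur.ADDRESS_LOAD_DTS
FROM  (SELECT hub.CSID,
              (SELECT MAX(n.LOAD_DTS) FROM SAT_CUST_NAME n
               WHERE  n.CSID = hub.CSID) AS NAME_LOAD_DTS,
              (SELECT MAX(a.LOAD_DTS) FROM SAT_CUST_ADDR a
               WHERE  a.CSID = hub.CSID) AS ADDRESS_LOAD_DTS
       FROM   HUB_CUST hub) cur
WHERE  cur.NAME_LOAD_DTS IS NOT NULL
AND    cur.ADDRESS_LOAD_DTS IS NOT NULL
AND    NOT EXISTS (SELECT 1
                   FROM   SAT_CUST_PIT p
                   WHERE  p.CSID = cur.CSID
                   AND    p.NAME_LOAD_DTS = cur.NAME_LOAD_DTS
                   AND    p.ADDRESS_LOAD_DTS = cur.ADDRESS_LOAD_DTS);
```

Each run of the load adds a new dimension row only for customers whose name or address Satellite has received a newer row, which is what gives the view its Type 2 behavior.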
 

Benefits of a VSCD2

  1. We can now rapidly demonstrate the contents of a type 2 dim prior to ETL programming
  2. By using PIT tables we don’t need the Load End DTS on the Sats, so the Sats become insert-only as well (simpler loads, no update pass required)
  3. Another by-product is that the Sat is now also Hadoop compliant (again, insert-only)
  4. Since the nullable Load End DTS is not needed, you can now more easily partition the Sat table by Hub Id and Load DTS.
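To make the last two points concrete, here is one way an insert-only Satellite without a Load End DTS might be declared and partitioned in Oracle. Treat it as a sketch: the table and key column names follow the view earlier in this post, while the data types, partition names, boundary dates, and subpartition count are assumptions, and the Oracle Partitioning option is required.

```sql
-- Hypothetical DDL for an insert-only Satellite with no Load End DTS.
-- Data types, partition names, boundaries, and subpartition count are assumptions.
CREATE TABLE SAT_CUST_NAME (
  CSID      NUMBER         NOT NULL,   -- Hub key
  LOAD_DTS  TIMESTAMP      NOT NULL,   -- load timestamp; rows are never updated
  NAME      VARCHAR2(200),
  CONSTRAINT SAT_CUST_NAME_PK PRIMARY KEY (CSID, LOAD_DTS)
)
PARTITION BY RANGE (LOAD_DTS)               -- partition by Load DTS ...
SUBPARTITION BY HASH (CSID) SUBPARTITIONS 4 -- ... and spread Hub keys within each range
(
  PARTITION p_2013 VALUES LESS THAN (TIMESTAMP '2014-01-01 00:00:00'),
  PARTITION p_max  VALUES LESS THAN (MAXVALUE)
);
```

Both partitioning columns are part of the primary key and are never null or updated, so a row never has to move between partitions after it is loaded.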

Objections

The main objection to this approach is that the virtual dimension will perform very poorly. While this may be true for very high volumes, or on poorly tuned or under-resourced databases, I maintain that with today’s evolving hardware appliances (e.g., Exadata, Exalogic) and the advent of in-memory databases, these concerns will soon be a thing of the past.

UPDATE 26-May-2018 – Now, 5 years later, I have successfully done the above on Oracle. But now we also have the Snowflake elastic cloud data warehouse, where all the prior constraints are indeed eliminated. With Snowflake you can easily choose either to instantly add compute power if the view is too slow or to do the work and processing to materialize the view. (end update)

Worst case, after you have validated the data with your users, you can always turn it into a materialized view or a physical table if you must.
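As a sketch of that fallback in Oracle, the validated view can be wrapped in a materialized view and refreshed on demand. The name Dim2_Customer_MV and the refresh options chosen here are assumptions; a different refresh strategy may suit your load schedule better.

```sql
-- Hypothetical materialized version of the virtual dimension.
-- Dim2_Customer_MV and the refresh settings are assumptions.
CREATE MATERIALIZED VIEW Dim2_Customer_MV
  BUILD IMMEDIATE
  REFRESH COMPLETE ON DEMAND
AS
SELECT Customer_key, Customer_Number, Customer_Name, Customer_Address, Load_DTS
FROM   Dim2_Customer;

-- Rebuild the contents after each load of the PIT table ('C' = complete refresh).
BEGIN
  DBMS_MVIEW.REFRESH(list => 'DIM2_CUSTOMER_MV', method => 'C');
END;
/
```

Because the materialized view selects from the same view the users already validated, the logic stays in one place and only the storage choice changes.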

So what do you think? Have you ever tried something like this? Let me know in the comments.

Get virtual, get agile!

Kent

The Data Warrior

P.S. I am giving a talk on Agile Data Warehouse Modeling at the East Coast Oracle Conference this week. If you are there, look me up and we can discuss this post in person!

KScope13 Day Two: Wine to Water and Other Transformations

So day two in New Orleans at the ODTUG KScope13 event was another big day.

I am going to start out at the end of the day with the General Session update, so if you don’t have time to read the whole post you can read the really important and interesting stuff first.

General Session and Keynote

First the fun part, we got greeted by a live New Orleans Jazz band.

We had a live band in the lobby to greet attendees before the general session and keynote.

That was great fun. They then led us all into the grand ballroom for the general session and then went out and led in our board of directors and the conference committee all dancing up a storm in true NOLA fashion.

The general session gets opened with the board and conference committee being led on stage marching/dancing behind a live New Orleans Marching Jazz Band

ODTUG Announcements and Award Winners

Every year ODTUG gives out a number of awards so I want to recognize the winners here:

Editors Choice Award for Best White Paper went to David Schleis.

The Oracle Contributor of the Year (which goes to an Oracle Corp employee) went to my good buddy, Jeff Smith.

The ODTUG Volunteer Award went to Mack McCasland, who has been working behind the scenes at our events for over 10 years (and he is retired!).

In addition to these awards, Oracle also announced the promotion of my good friend John King to the status of Oracle ACE Director.

The big announcement: KScope14 will be in Seattle, Washington, USA on June 22-26, 2014. The conference hotel will be the Sheraton in downtown Seattle.

Wine to Water

The big deal for the night was our keynote speaker, Doc Hendley. He is a bartender who decided he wanted to make a much bigger impact on the world and ended up founding an organization that now brings clean water to people in over 15 countries.

The statistics he gave us on how many people in the world do not have clean, safe water to drink (over 1 billion!) were stunning. More people die from lack of clean water than have died in all the recent wars put together. Another startling fact is that even though it is the biggest killer on the planet, dealing with dirty water for the poor of the world gets less than 20% of the funding when compared with funding for HIV, malaria, and TB.

He has a very moving and passionate story about how he got to that place in his life where he found his real purpose, discovered these facts, and set out to do something about it. His talk (and book) tell the whole story. There were a few teary eyes by the end of his talk. Doc has shown amazing courage and perseverance in the pursuit of making a difference.

He really has proven that one very ordinary person can have a large impact on the lives of others if they really set their mind to it.

I encourage you to go over to his site and learn about his mission, his story, and his organization Wine to Water.

Maybe you can help him make a difference.

Doc Hendley, Founder of Wine to Water, gives a moving and inspirational address as our keynote speaker.

The rest of my day

So back to earlier in the day (for those still with me here)…

Started as always with my morning chi gung class. The group grew to about 14 people with a few new folks joining us. We had a few passersby stop to watch and try a few moves as well.

After a healthy breakfast and a shower I did my first talk of the event, Five Ways to Make Data Modeling Fun. There were about 20 folks in the session and we all had a good time trying out my ideas.

Then I headed over to hear my friend Jeff Smith talk about SQL Developer (my 2nd favorite Oracle product).

Oracle Senior Product Manager (and ODTUG Oracle Contributor of the Year) Jeff Smith shows us his top tips and tricks for SQL Developer.

After that it was another awesome lunch (beet salad and redfish!), then on to Mark Rittman’s session about OBIEE, Endeca, and his take on the overall landscape of Oracle BI and data discovery in the new world of NoSQL and Hadoop.

In Mark Rittman’s session he talked a bit about Oracle’s strategy around business analytics.

Always a good idea to get Mark’s take on things BI.

Lastly (before the general session, that is), I did my second presentation along with Stewart Bryson. We introduced folks to the idea of using OBIEE on top of a Data Vault Data Warehouse and showed how it conformed to Oracle’s reference architecture while at the same time enabling an agile approach to BI.

Oracle ACE Stewart Bryson talks about how he used OBIEE to create a virtual data mart on top of a Data Vault style EDW model

I can’t thank Stewart enough for taking on the challenge to learn Data Vault and figuring out how to use it effectively in OBIEE. His approach works very well and should really enable organizations to truly leverage their data and create an agile BI/DW framework.

That’s it for today’s report. I should have another report for you tomorrow on activities today!

Cheers.

Kent

P.S. Yes there was eating and drinking around the French Quarter after hours. Even got to have a drink with Doc Hendley and his wife on Bourbon Street. That was a nice treat.

See you at KScope13!

Are you ready?

It is almost time for the annual ODTUG KScope conference in New Orleans. It starts with the Community Service Day on Saturday June 22nd and runs through Thursday June 27th at the Sheraton Hotel right on the edge of the French Quarter.

For my readers who are attending, I will be giving three talks this year, leading morning Chi Gung classes, and sitting on the BI Lunch and Learn Panel.

My talks will be:

Five Ways to Make Data Modeling Fun – Monday at 9:45 AM

Using OBIEE and Data Vault to Virtualize Your BI Environment: An Agile Approach – Monday at 3 PM (with Stewart Bryson)

Top Ten Cool Features in Oracle SQL Developer Data Modeler – Tuesday at 12:15 PM

And on Wednesday at 10:45 AM you will find me in the Social Media Lounge getting interviewed about Data Modeling, ODTUG, and KScope.

If you are joining me for Morning Chi Gung, I believe we will meet in the hotel lobby at 6:45 AM so we can walk to the river front park where we will hold our class. It is only a 30 minute class right before breakfast so please give it a try and get energized for a long day of learning and networking! Follow me on twitter @KentGraziano for any updates to the location and meeting time.

Don’t forget to download the new KScope Mobile App so you can keep track of your schedule and not miss any of these sessions.

See you in New Orleans!

Kent

The Oracle Data Warrior

Free Introduction to Data Warehousing the Data Vault Way

My good friend Dan (@dlinstedt) has put together a sweet set of three videos to introduce everyone to the wonderful world of Data Vault.

When you sign up you will get a set of email messages from Dan discussing the Data Vault Approach to data warehousing. You get access to three videos about the architecture, the methodology, and the modeling technique.

Plus you get free downloads of the first few chapters of the data vault book Super Charge Your Data Warehouse.

So if you have always wanted to learn more about Data Vault, but don’t have the budget for a full-on class, this offer will get you headed in the right direction.

Head on over to the Learn Data Vault site now. It is all free.

(NB: this is an affiliate link. If you eventually buy the book or some other training off Dan’s site, I will get a small piece of the action. Not enough to retire mind you, but it might buy lunch.)

The videos are pretty cool. Enjoy.

Kent
