The Data Warrior

Changing the world, one data model at a time. How can I help you?

Archive for the tag “#datavault”

East Coast Oracle Users Conference (#ECOracle13) Review

This week I did a little travel and went to Durham, North Carolina to present at the 2013 East Coast Oracle Users Conference (aka ECO). While I have been aware of this event for over 20 years, it is the first time I have attended.

It was worth the trip. (Thanks to Jeff Smith at Oracle for alerting me to the event and encouraging me to submit.) He actually sent me, Danny, and Sarah (The EPM Queen). It was great to have members of the ODTUG clan together.

The gang of three - ODTUGers at ECO13 thanks to That Jeff Smith guy. Yea - he sent us!

Overall, a well-run event held at the Sheraton Imperial Hotel and Conference Center. It drew over 300 attendees, and a long list of Oracle ACEs and ACE Directors were there to present to a crowd very eager to learn and network.

Fun and Games: The Keynote

Our opening keynote from Steven Feuerstein (inventor of the PL/SQL Challenge) was a fun take on different types of therapy and how they might be applied to software developers.

PL/SQL Evangelist Steven Feuerstein discusses Coding Therapy for Software Developers

He discussed the use of:

  • Game therapy (try out mastermind or setgame.com)
  • Dream Therapy
  • Confessional Therapy
  • Shock Therapy
  • Couples Therapy
    • For DBA & Developers
    • For Developers & Their Managers

It was a fun, light way to start the conference with some very valuable advice.

Heavy Duty DBA-type Tuning Talks

Oracle ACE Director, author, and trainer Craig Shallahamer did two deep dive tuning sessions that I attended. In the first one, Introduction to Time-based Performance Analysis: Stop the Guessing, Craig gave us his four point framework for Holistic Performance Analysis. The points were:

  1. The Three Circles to consider (OS, Database, Application)
  2. Be Quantitative (i.e., trust the numbers not a hunch)
  3. Serialization is death, Parallel is life
  4. Tell a story (make the explanation of the issue understandable to managers)

With that he got into all sorts of v$ view stuff that went mostly over my head. Needless to say I will have to download the slides from his site (orapub.com) and give them to someone more attuned to this kind of tuning than I!

Oracle ACE Director, Craig Shallahamer discusses low level details for understanding Oracle CPU consumption

The second presentation Craig gave was called Understanding Oracle CPU Consumption: The Missing Link. Again he showed lots of views, some Linux OS utilities (e.g., perf), and lots of numbers, all aimed at working out which Oracle functions were actually taking up CPU time.

Even though I don’t really understand a lot of this (hey, I am a data modeler, not a DBA, right?), I like to go to sessions like this because I enjoy listening to smart people talk passionately about the things they do. I figure I might retain just enough to point someone else in the right direction in the future, even if it is only to give them a copy of these slides!

Lovely Southern Style Lunch

ECO had one of the nicest little lunch buffets I have eaten in a while. Very simple southern food that included cole slaw, potato salad, baked chicken, fried chicken, pulled pork (with N. Carolina bbq sauce), hush puppies and apple cobbler. (I did not say it was a light lunch, right?)

I love all kinds of BBQ and the pulled pork did not disappoint. I do not usually like fried chicken but figured I should try it and was pleasantly surprised. Crisp and moist. Very nice.

Traditional Southern Fare for Lunch

My 1st Session – Making Data Modeling Fun

I had the best turnout ever for this topic, with over 40 people in the session, most of whom were game to try my gamification of data model review sessions.

Session attendees developing Haiku poems based on a Data Model

One of the tasks was to translate relationship sentences and model descriptions into Haiku (or another form). There were prizes as an incentive to play along.

Some of the prizes for participants at my talk

The winner by general acclamation was Edie Waite from Raleigh, NC with this little limerick:

There once was a country named France
Which had many regions for dance
The locations they chose to dance on their toes
Made employees all look askance.

The data model we used had the entities: Country, Region, Employee, Locations, and a few others.

Another Haiku from Sarah Zumbrum (a noted non-data modeler) went like this:

More than one region
Can reside in a country
Like the USA

The session was really a lot of fun thanks to everyone being open-minded and willing to try some unconventional approaches to gathering data model requirements. (There was one other Haiku in French which I will add as soon as the author sends it to me!)

ECO 13 – Day 2

Keynote today was about E-Business Suite stuff. I sat there after breakfast mostly not listening as I started to put this blog post together.

Then I did my 2nd talk.

Agile Data Warehouse Modeling

I had a somewhat disappointing turnout (only 5 people, sigh) but it was a great exchange with those 5 people. We had a very good discussion about applying agile techniques to building a data warehouse and I was able to introduce them to some of the details of Data Vault Data Modeling. None of them knew much about data vault, but some had heard the term.

One attendee did tell me he was skeptical about the approach when he came in as he was a traditional Kimball dimensional data warehouse guy. But after the session he was willing to concede there was some merit and ideas he had not seen before and he was going to take those into consideration as he embarked on a new phase of his project where there were some complex problems to solve. He could see that data vault might just help.

Really can’t ask for more than that!

Embedded Analytics

So my last session for the event was Craig Warman’s talk on embedded analytics. It was a good discussion about how BI and analytics have evolved. Craig presented a simple maturity model as part of the talk:

Level 0: BI reporting and analytic applications are completely separate from other applications.
Level 1: Gateway Analytics – Operational applications have a report tab or menu item to launch the BI reporting tool interface. Maybe there is a login pass through.
Level 2: Inline Analytics – At this level, the analytics and BI tool has been incorporated into the operational application interface to the point it has the same look and feel and you can’t tell it is a separate product or tool. This is where many organizations are today.
Level 3: Infused Analytics – This is the goal. At this level the analytics are truly part of the application and provide core functionality. Examples of this are the recommendations you get on Amazon as you check out or the movie suggestions you get on Netflix based on your prior movie choices. If the analytic pieces were removed the application would not function correctly.

Craig Warman (ECO13 conference chair) talks about what embedded analytics is (and is not)

Well that’s it for this conference.

Put ECO on your radar for 2014.

See you around.

Kent

P.S. Next conference on my agenda is RMOUG TD 2014. Let me know if you will be there.

Agile Data Warehouse Modeling: How to Build a Virtual Type 2 Slowly Changing Dimension

One of the ongoing complaints about many data warehouse projects is that they take too long to deliver. This is one of the main reasons that many of us have tried to adopt methods and techniques (like SCRUM) from the agile software world to improve our ability to deliver data warehouse components more quickly.

So, what activity takes the bulk of development time in a data warehouse project?

Writing (and testing) the ETL code to move and transform the data can take up to 80% of the project resources and time.

So if we can eliminate, or at least curtail, some of the ETL work, we can deliver useful data to the end user faster.

One way to do that would be to virtualize the data marts.

For several years Dan Linstedt and I have discussed the idea of building virtual data marts on top of a Data Vault modeled EDW.

In the last few years I have floated the idea among the Oracle community. Fellow Oracle ACE Stewart Bryson and I even created a presentation this year (for #RMOUG and #KScope13) on how to do this using the Business Model (meta-layer) in OBIEE (It worked great!).

While doing this with a BI tool is one approach, I like to be able to prototype the solution first using Oracle views (that I build in SQL Developer Data Modeler of course).

The approach to modeling a Type 1 SCD this way is very straightforward.
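To make "straightforward" a little more concrete, here is a minimal sketch of a virtual Type 1 dimension: a view that joins the Hub to only the newest row in each Satellite. It is written with Python's built-in sqlite3 module purely so it can be run anywhere; it is not Oracle syntax and not the exact view I built. The table and column names are borrowed from the Type 2 example later in this post, the sample rows are made up, and using the Hub key as the dimension key is an assumption.

```python
import sqlite3

# In-memory database; all names and rows here are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hub_cust      (csid INTEGER PRIMARY KEY, customer_num TEXT);
CREATE TABLE sat_cust_name (csid INTEGER, load_dts TEXT, name TEXT);
CREATE TABLE sat_cust_addr (csid INTEGER, load_dts TEXT, address TEXT);

INSERT INTO hub_cust VALUES (1, 'C-100');
INSERT INTO sat_cust_name VALUES (1, '2013-01-01', 'Acme'),
                                 (1, '2013-06-01', 'Acme Corp');
INSERT INTO sat_cust_addr VALUES (1, '2013-01-01', '1 Main St'),
                                 (1, '2013-03-01', '9 Oak Ave');

-- Type 1 = current values only, so pick the newest row in each Satellite.
CREATE VIEW dim1_customer AS
SELECT hub.csid         AS customer_key,
       hub.customer_num AS customer_number,
       sat_1.name       AS customer_name,
       sat_2.address    AS customer_address
FROM   hub_cust hub, sat_cust_name sat_1, sat_cust_addr sat_2
WHERE  sat_1.csid = hub.csid
  AND  sat_2.csid = hub.csid
  AND  sat_1.load_dts = (SELECT MAX(n.load_dts) FROM sat_cust_name n
                          WHERE n.csid = hub.csid)
  AND  sat_2.load_dts = (SELECT MAX(a.load_dts) FROM sat_cust_addr a
                          WHERE a.csid = hub.csid);
""")

dim1_rows = conn.execute("SELECT * FROM dim1_customer").fetchall()
print(dim1_rows)
```

With the sample data, the view returns one row per customer carrying the latest name and the latest address, even though those two values were loaded on different dates.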

How to do this easily for a Type 2 SCD has evaded me for years, until now.

Building a Virtual Type 2 SCD (VSCD2)

So how do you create a virtual type 2 dimension (that is “Kimball compliant”) on a Data Vault when you have multiple Satellites on one Hub?

(NOTE: the next part assumes you understand Data Vault Data Modeling. If you don’t, start by reading my free white paper, but better still go buy the Data Vault book on LearnDataVault.com)

Here is how:

Build an insert-only PIT (Point-in-Time) table that keeps history. This is sometimes referred to as a historicized PIT table. (See the Super Charge book for an explanation of the types of PIT tables.)

Add a surrogate Primary Key (PK) to the table. The PK of the PIT table will then serve as the PK for the virtual dimension. This meets the standard for classical star schema design to have a surrogate key on Type 2 SCDs.

To build the VSCD2 you now simply create a view that uses the PIT table to join the Hub and all the Satellites together. Here is an example:

-- Virtual Type 2 dimension: one row per PIT row; the PIT surrogate key is the dimension key.
CREATE VIEW Dim2_Customer (Customer_Key, Customer_Number, Customer_Name, Customer_Address, Load_DTS)
AS
SELECT sat_pit.pit_seq, hub.customer_num, sat_1.name, sat_2.address, sat_pit.load_dts
FROM   HUB_CUST      hub,
       SAT_CUST_PIT  sat_pit,
       SAT_CUST_NAME sat_1,
       SAT_CUST_ADDR sat_2
WHERE  hub.CSID = sat_pit.CSID
  AND  hub.CSID = sat_1.CSID
  AND  hub.CSID = sat_2.CSID
  -- the PIT row says which version of each Satellite applies
  AND  sat_pit.NAME_LOAD_DTS    = sat_1.LOAD_DTS
  AND  sat_pit.ADDRESS_LOAD_DTS = sat_2.LOAD_DTS;
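
If you want to play with the idea without an Oracle instance handy, the whole pattern can be sketched end to end with Python's built-in sqlite3 module. Treat this as a rough illustration under stated assumptions, not my production code. The column types, the sample rows, and the add_pit_row helper are all hypothetical; the helper simply appends a PIT row pointing at whichever Satellite rows were current as of a given date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hub_cust      (csid INTEGER PRIMARY KEY, customer_num TEXT);
CREATE TABLE sat_cust_name (csid INTEGER, load_dts TEXT, name TEXT);
CREATE TABLE sat_cust_addr (csid INTEGER, load_dts TEXT, address TEXT);

-- Insert-only PIT table; pit_seq is the surrogate key of the virtual dimension.
CREATE TABLE sat_cust_pit (
    pit_seq          INTEGER PRIMARY KEY AUTOINCREMENT,
    csid             INTEGER,
    load_dts         TEXT,
    name_load_dts    TEXT,
    address_load_dts TEXT
);

INSERT INTO hub_cust VALUES (1, 'C-100');
INSERT INTO sat_cust_name VALUES (1, '2013-01-01', 'Acme'),
                                 (1, '2013-06-01', 'Acme Corp');
INSERT INTO sat_cust_addr VALUES (1, '2013-01-01', '1 Main St'),
                                 (1, '2013-03-01', '9 Oak Ave');

-- Same join logic as the view above, written with column aliases.
CREATE VIEW dim2_customer AS
SELECT sat_pit.pit_seq  AS customer_key,
       hub.customer_num AS customer_number,
       sat_1.name       AS customer_name,
       sat_2.address    AS customer_address,
       sat_pit.load_dts AS load_dts
FROM   hub_cust hub, sat_cust_pit sat_pit, sat_cust_name sat_1, sat_cust_addr sat_2
WHERE  hub.csid = sat_pit.csid
  AND  hub.csid = sat_1.csid
  AND  hub.csid = sat_2.csid
  AND  sat_pit.name_load_dts    = sat_1.load_dts
  AND  sat_pit.address_load_dts = sat_2.load_dts;
""")


def add_pit_row(conn, csid, as_of):
    """Hypothetical loader: append one PIT row for the Satellite rows current as of `as_of`."""
    conn.execute(
        """
        INSERT INTO sat_cust_pit (csid, load_dts, name_load_dts, address_load_dts)
        SELECT ?, ?,
               (SELECT MAX(load_dts) FROM sat_cust_name WHERE csid = ? AND load_dts <= ?),
               (SELECT MAX(load_dts) FROM sat_cust_addr WHERE csid = ? AND load_dts <= ?)
        """,
        (csid, as_of, csid, as_of, csid, as_of),
    )


# One PIT row per date on which either Satellite changed.
for as_of in ("2013-01-01", "2013-03-01", "2013-06-01"):
    add_pit_row(conn, 1, as_of)

dim2_rows = conn.execute(
    "SELECT * FROM dim2_customer ORDER BY customer_key"
).fetchall()
for row in dim2_rows:
    print(row)
```

Each change in either Satellite shows up as a new dimension row with its own surrogate key, which is the Type 2 behavior we are after, and nothing in any table was ever updated.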
 

Benefits of a VSCD2

  1. We can now rapidly demonstrate the contents of a type 2 dim prior to ETL programming
  2. By using PIT tables we don’t need the Load End DTS on the Sats so the Sats become insert only as well (simpler loads, no update pass required)
  3. Another by-product is the Sat is now also Hadoop compliant (again insert only)
  4. Since the nullable Load End DTS is not needed, you can now more easily partition the Sat table by Hub Id and Load DTS.
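
If somebody still wants to see an end date on a Satellite row, it can be derived at query time instead of being stored and updated. The sketch below is one way to do that, again using Python's sqlite3 so it is runnable as is; the table, columns, and rows are illustrative assumptions. On Oracle, the LEAD analytic function would be the more natural way to get the next Load DTS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Insert-only Satellite: no Load End DTS column at all.
CREATE TABLE sat_cust_name (csid INTEGER, load_dts TEXT, name TEXT);
INSERT INTO sat_cust_name VALUES (1, '2013-01-01', 'Acme'),
                                 (1, '2013-06-01', 'Acme Corp');
""")

# The end of each row's validity is simply the next load date for the same Hub key.
effective_ranges = conn.execute("""
SELECT s.csid,
       s.name,
       s.load_dts AS effective_from,
       (SELECT MIN(n.load_dts)
          FROM sat_cust_name n
         WHERE n.csid = s.csid
           AND n.load_dts > s.load_dts) AS effective_to
FROM   sat_cust_name s
ORDER  BY s.csid, s.load_dts
""").fetchall()

for row in effective_ranges:
    print(row)
```

The current row comes back with an empty end date (None in Python), which is the same information the nullable Load End DTS column used to carry, only without the update pass.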

Objections

The main objection to this approach is that the virtual dimension will perform very poorly. While this may be true for very high volumes, or on poorly tuned or resourced databases, I maintain that with today’s evolving hardware appliances (e.g., Exadata, Exalogic) and the advent of in-memory databases, these concerns will soon be a thing of the past.

UPDATE 26-May-2018 – Now 5 years later I have successfully done the above on Oracle. But now we also have Snowflake elastic cloud data warehouse where all the prior constraints are indeed eliminated. With Snowflake you can now easily choose to instantly add compute power if the view is too slow, or do the work and processing to materialize the view. (end update)

Worst case, after you have validated the data with your users, you can always turn it into a materialized view or a physical table if you must.
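
As a rough sketch of that fallback, here is a view being frozen into a physical table with a plain CREATE TABLE ... AS SELECT. It uses Python's sqlite3 so it runs anywhere, and the view is a deliberately simplified stand-in with made-up names and rows. On Oracle you would more likely reach for CREATE MATERIALIZED VIEW, with whatever refresh options suit your load schedule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hub_cust      (csid INTEGER PRIMARY KEY, customer_num TEXT);
CREATE TABLE sat_cust_name (csid INTEGER, load_dts TEXT, name TEXT);
INSERT INTO hub_cust VALUES (1, 'C-100');
INSERT INTO sat_cust_name VALUES (1, '2013-01-01', 'Acme'),
                                 (1, '2013-06-01', 'Acme Corp');

-- Simplified stand-in for the virtual dimension view.
CREATE VIEW dim2_customer AS
SELECT hub.customer_num AS customer_number,
       sat.name         AS customer_name,
       sat.load_dts     AS load_dts
FROM   hub_cust hub
JOIN   sat_cust_name sat ON sat.csid = hub.csid;

-- Once the users have validated the view, persist its rows as a real table.
CREATE TABLE dim2_customer_phys AS
SELECT * FROM dim2_customer;
""")

view_rows = conn.execute(
    "SELECT * FROM dim2_customer ORDER BY load_dts").fetchall()
table_rows = conn.execute(
    "SELECT * FROM dim2_customer_phys ORDER BY load_dts").fetchall()
print(view_rows == table_rows)
```

The nice part is that the view definition doubles as the specification for the physical load, so the prototype work is not thrown away.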

So what do you think? Have you ever tried something like this? Let me know in the comments.

Get virtual, get agile!

Kent

The Data Warrior

P.S. I am giving a talk on Agile Data Warehouse Modeling at the East Coast Oracle Conference this week. If you are there, look me up and we can discuss this post in person!

Let’s Review #OOW13 and #OTW13 in Pictures

Yes I have been derelict in my duty and not posted about the sessions I attended at Oracle OpenWorld (#OOW13) and OakTable World (#OTW).

Well here are the high points with pictures!

Monday

Monday started off with the now annual Swim the Bay (so I missed the keynote). If you have Facebook, you can see pictures from the event here.

Most of the day I then spent at the alternate conference, OakTable World (#OTW13), seeing a few talks and giving one myself.

My good friend from Denver, Tim Gorman, gave a nice talk about all the data compression options available in Oracle.

Tim Gorman: Oracle Compression Options

Next was a great session from the well known blogger and author Fabian Pascal. I have been reading his work for years but this was the first time I got to hear him speak in person. As with his writing, the talk was both intellectually stimulating and challenging!

Fabian Pascal: The Last Null

It really is quite a debate in the database world about the meaning and use of NULL in an RDBMS. Fabian has a proposal on how we can (and should) represent data in a way where there will never be NULL attributes.

After some scheduling issues, later in the day I did my presentation on using Data Vault Modeling for Agile Data Warehouse Modeling. The room I got had a huge wall for me to project my session on. Definitely the biggest screen ever for one of my talks.

Biggest screen ever for me and my data vault presentation.

Tuesday

Started the morning with a few friends doing morning Chi Gung in Union Square, followed by a quick survey of the exhibit hall in Moscone South and a trip to the Demo grounds.

The throng descends into the depths of Moscone West to hunt the exhibit hall for goodies.

The hall was of course HUGE as usual so some of the vendors who were tucked in back got creative on getting the foot traffic to come their way.

A clever gimmick one vendor did to get traffic to their booth in the gigantic hall

For sessions, I attended a road map session on Oracle’s Big Data strategy given by my friend JP Dijcks.

JP talks all things Big Data

Mostly he painted a picture of the issues with figuring out how to collect and put all that data to real work. Of course Oracle has a ton of products to offer to help solve the problem.

How to shrink the gap between getting big data and actually using it!

Next up I attended Jeff Smith’s session on SQL Developer 4.0 and got to learn that there was a data mining extension available for the tool that makes doing some advanced analytics a lot easier.

Definition for Data Mining. An extension for Data Mining is available for SQL Developer.

Next on my agenda was the Cloud keynote with Microsoft. I wrote about that here.

Finally for the day, a late presentation by Maria Colgan and Jonathan Lewis giving us their top tuning tips in what they called the SQL Tuning Bootcamp.

Optimizer tips from a pro Jonathan Lewis. I am sure it means something to someone out there!

As always with these types of sessions, there was a ton of useful information that makes my brain hurt. I have to keep reviewing my notes to make sure I can use at least 10% of what they taught.

Wednesday

This was mostly a work day for me at a client site. And a late lunch to see the final race of the America’s Cup.

In case you have been under a rock since last week, Team USA won! It was great to actually be there on Pier 27 during the final race. Not a great vantage point overall but with the big screen to watch and then seeing the boats right after they finished, it was worth the walk.

After the race and a little more data model work at my client’s office, I walked back to the conference to see a final session (for me) given by Gwen Shapira about using solid state disks with Exadata.

I really did not know much about SSDs before this session but feel really educated now. I actually had no idea that SSD and FLASH drives or FLASH memory were the same thing. Guess I was behind on the hardware buzzwords.

Gwen and Mark on Solid State Disk AKA Flash

Then it was off to the annual blogger meetup then dinner on the town with friends at The Stinking Rose (thanks Tim!).

I decided to skip the appreciation event this year and take it easy, have a nice dinner, then pack up to head home. Thursday it was breakfast at Lori’s Diner then off to the airport and back home.

As a reminder if you want to see what the buzz was at the events, just check out the hashtags #OOW13 and #OTW13 on twitter (if you had a big data machine you might even be able to generate some insight from those feeds).

Well that’s a wrap for this year’s big show.

Next up, I will be speaking at the upcoming ECO conference in North Carolina. Should be fun.

Later.

Kent

P.S. If you want to see my OTW presentation, you can find it on Slideshare.

P.P.S. For another great review of OOW13 check out this post by my friend from Turkey, Gurcan. See if you can find my unlabeled cameo in the post.

KScope13 Day Two: Wine to Water and Other Transformations

So day two in New Orleans at the ODTUG KScope13 event was another big day.

I am going to start out at the end of the day with the General Session update so if you don’t have time to read the whole post you can read the really important and interesting stuff first.

General Session and Keynote

First the fun part, we got greeted by a live New Orleans Jazz band.

We had a live band in the lobby to greet attendees before the general session and keynote.

That was great fun. They then led us all into the grand ballroom for the general session and then went out and led in our board of directors and the conference committee all dancing up a storm in true NOLA fashion.

The general session gets opened with the board and conference committee being led on stage marching/dancing behind a live New Orleans Marching Jazz Band

ODTUG Announcements and Award Winners

Every year ODTUG gives out a number of awards so I want to recognize the winners here:

Editors Choice Award for Best White Paper went to David Schleis.

The Oracle Contributor of the Year (which goes to an Oracle Corp employee) went to my good buddy, Jeff Smith.

The ODTUG Volunteer Award went to Mack McCasland who has been working behind the scenes at our events for over 10 years (and he is retired!).

In addition to these awards, Oracle also announced the promotion of my good friend John King to the status of Oracle ACE Director.

The big announcement: KScope14 will be in Seattle, Washington, USA on June 22-26, 2014. The conference hotel will be the Sheraton in downtown Seattle.

Wine to Water

The big deal for the night was our keynote speaker, Doc Hendley. He is a bartender who decided he wanted to make a much bigger impact on the world and ended up founding an organization that now brings clean water to people in over 15 countries.

The statistics he gave us on how many people in the world do not have clean, safe water to drink (over 1 billion!) were stunning, as was the fact that more people die from lack of clean water than have died in all the recent wars put together. Another startling fact is that even though it is the biggest killer on the planet, dealing with dirty water for the poor of the world gets less than 20% of the funding when compared with funding for HIV, malaria, and TB.

He has a very moving and passionate story about how he got to that place in his life where he found his real purpose, discovered these facts, and set out to do something about it. His talk (and book) tell the whole story. There were a few teary eyes by the end of his talk. Doc has shown amazing courage and perseverance in the pursuit of making a difference.

He really has proven that one very ordinary person can have a large impact on the lives of others if they really set their mind to it.

I encourage you to go over to his site and learn about his mission, his story, and his organization Wine to Water.

Maybe you can help him make a difference.

Doc Hendley, Founder of Wine to Water, gives a moving and inspirational address as our keynote speaker.

The rest of my day

So back to earlier in the day (for those still with me here)…

Started as always with my morning chi gung class. The group grew to about 14 people with a few new folks joining us. We had a few passersby stop to watch and try a few moves as well.

After a healthy breakfast and a shower I did my first talk of the event, Five Ways to Make Data Modeling Fun. There were about 20 folks in the session and we all had a good time trying out my ideas.

Then I headed over to hear my friend Jeff Smith talk about SQL Developer (my 2nd favorite Oracle product).

Oracle Senior Product Manager (and ODTUG Oracle Contributor of the Year) shows us his top tips and tricks for SQL Developer

After that it was another awesome lunch (beet salad and redfish!) then on to Mark Rittman’s session about OBIEE, Endeca, and his take on the overall landscape of Oracle BI and data discovery in the new world of NoSQL and Hadoop.

In Mark Rittman’s session he talked a bit about Oracle’s strategy around business analytics

Always a good idea to get Mark’s take on things BI.

Lastly (before the general session that is), I did my second presentation along with Stewart Bryson. We introduced folks to the idea of using OBIEE on top of a Data Vault Data Warehouse and showed how it conformed to Oracle’s reference architecture while at the same time enabled an agile approach to BI.

Oracle ACE Stewart Bryson talks about how he used OBIEE to create a virtual data mart on top of a Data Vault style EDW model

I can’t thank Stewart enough for taking on the challenge to learn Data Vault and figuring out how to use it effectively in OBIEE. His approach works very well and should really enable organizations to truly leverage their data and create an agile BI/DW framework.

That’s it for today’s report. I should have another report for you tomorrow on activities today!

Cheers.

Kent

P.S. Yes there was eating and drinking around the French Quarter after hours. Even got to have a drink with Doc Hendley and his wife on Bourbon Street. That was a nice treat.

See you at KScope13!

Are you ready?

It is almost time for the annual ODTUG KScope conference in New Orleans. It starts with the Community Service Day on Saturday June 22nd and runs through Thursday June 27th at the Sheraton Hotel right on the edge of the French Quarter.

For my readers that are attending, I will be giving three talks this year, leading morning Chi Gung classes, as well as sitting on the BI Lunch and Learn Panel.

My talks will be:

Five Ways to Make Data Modeling Fun – Monday at 9:45 AM

Using OBIEE and Data Vault to Virtualize Your BI Environment: An Agile Approach – Monday at 3 PM (with Stewart Bryson)

Top Ten Cool Features in Oracle SQL Developer Data Modeler – Tuesday at 12:15 PM

And on Wednesday at 10:45 AM you will find me in the Social Media Lounge getting interviewed about Data Modeling, ODTUG, and KScope.

If you are joining me for Morning Chi Gung, I believe we will meet in the hotel lobby at 6:45 AM so we can walk to the river front park where we will hold our class. It is only a 30 minute class right before breakfast so please give it a try and get energized for a long day of learning and networking! Follow me on twitter @KentGraziano for any updates to the location and meeting time.

Don’t forget to download the new KScope Mobile App so you can keep track of your schedule and not miss any of these sessions.

See you in New Orleans!

Kent

The Oracle Data Warrior
