The Data Warrior

Changing the world, one data model at a time. How can I help you?

Archive for the tag “agile”

Better Data Modeling: The Data Warrior Speaks 2015

Great news: I have confirmed three major events and one local event so far this year where you can come out and hear me speak about some of my favorite topics: #DataModeling, #SQLDevModeler, and #DataVault.

So, line up your training budget and get registered for at least one of these great events.

DAMA Houston

My first talk for the year will be local – downtown Houston. I will present an Introduction to Data Vault Modeling for the Houston Chapter of DAMA International (next week!).

When: 10-Feb-2015
Time: 1pm – 4:30pm
Where: Chevron Building, Rio Grande Room – 51st floor, 1600 Smith, Houston, TX 77002

If you plan to attend, please RSVP directly to stephen.pace@kalido.com.

RMOUG Training Days 2015

Held every year at the Denver Convention Center in mid-February, the Rocky Mountain Oracle Users Group Training Days is the best value around for user group events: low cost ($395-$455), a great location (Denver!), and an excellent speaker lineup (international speakers, Oracle ACEs and ACE Directors).

I will be speaking both Wednesday, February 18, and Thursday, February 19 (last session!). My topics this year will be an Introduction to SQL Developer Data Modeler, and Worst Practices in Data Warehouse Design.

Plus I will be leading Morning Chi Gung exercises at 7 AM both days to get you all warmed up for a great day of learning. Check the entire agenda here.

As a bonus there are some excellent deep dive sessions on Tuesday, February 17th that are not to be missed, so get there early.

New this year will be Special Interest Group (SIG) meetings during lunch on Wednesday. I will be co-leading one on Data Integration & Data Warehousing with Bobby Curtis.

So, lots to do, see, and learn. Sign up today (and bring your ski equipment for the weekend after).

2nd Annual World Wide Data Vault Consortium

WWDVC was so successful last year that Dan decided to do it again. This year there is even a cool new website for the event, which will be held May 28-30 in Stowe, Vermont at the Trapp Family Lodge. This will be a small event (fewer than 60 people), with a single track so you won’t have to decide which talk to attend.

Yes, the hills will be alive with the sounds of Data Vault geeks from around the world telling their tales of trials, tribulations, and success as they try to implement large, agile, enterprise data warehouse programs across many industries. Topics include:

  • Big Data, NoSQL
  • Virtualization of Data Marts
  • Data Vault 2.0 & Agility
  • Changing roles of Data Modeling
  • Managed Self-Service BI

The speaker lineup is a who’s-who of the data warehouse and agile world.

Special guests this year include a keynote from Claudia Imhoff, Dan Linstedt, and the newest addition, Scott Ambler (author of Agile Modeling and Agile Database Techniques).

I will be there again, giving two talks with my buddy from McKesson, Keith Hoyle. We will discuss Data Warehousing in the Real World and talk about our endeavors to develop Virtualized Hybrid Type 1-2 Dimensions to enable Extreme BI.

Don’t miss this chance to rub elbows and network with the top innovators and thinkers in the data warehouse and BI space. Sign up soon as there are limited slots and limited rooms at the inn.

ODTUG KScope15

Another amazing annual event, this user group gathering will be a veritable who’s-who in the Oracle community. Again you will find Oracle ACEs and ACE Directors, as well as Oracle Product Managers, all ready and willing to discuss the latest and greatest tools for doing Oracle development work. Check out the amazing list of talks and presenters.

This year it is back to the beach for KScope. It will be held June 21-25 at the Diplomat Resort and Spa on the beach in Hollywood, Florida.

By popular demand, the last day of the conference will be all Deep Dive sessions, so be sure to plan your travel to hang out until the end (and then enjoy the beach!).

I will be giving two talks during the week (same ones as at RMOUG), answering questions on a panel or two, and again running my annual Morning Chi Gung sessions every morning (but this year outside on the beach).

This should be a very educational and relaxing event as it is every year. And it is in a family-friendly location so bring the gang along. You can register today and still get a huge early registration discount.

So what are you waiting for?

See you soon!

Kent

The Data Warrior

P.S. While at these events I do expect to have some limited free time, so if you would like some one-on-one coaching in person, contact me directly at kent <dot> graziano <at> att <dot> net to set up a session.

 

Better Data Modeling: #DataVault 2.0 Virtualising your Data Vault – Satellites

Guest Blog Link: Virtualising your Data Vault – Satellites

This is a MUST READ for anyone wanting to get Agile in their DW/BI program and for anyone doing Data Vault 2.0.

Actually anyone doing Data Vault 1.0 can benefit from the technique as well.

Roelant Vos has done quite a bit of work on virtualizing and automating data warehouses using DV 2.0 (especially since the WWDVC in Vermont). Please check out his blog and follow him on Twitter too.

Cheers

Kent

The Oracle Data Warrior

Better Data Modeling: Color Code Your Data Model Diagrams using #SQLDevModeler

One of the standards I recommend in my book Check List for Doing Data Model Design Reviews is to use color in your diagrams to visually differentiate types of entities or tables.

As luck would have it, Oracle SQL Developer Data Modeler has a feature that makes this very easy. It is Classification Types.

In the latest version, 4.0, you set these up by going to the context menu on the Design level. From that menu pick Properties. Once on the property dialog go to Settings -> Diagram -> Classification Types. (In 3.x look under Tools -> Preferences.)

The default install comes with a bunch already: fact, dimensions, logging, summary, and temporary. Each has a pre-set color assigned. You can change that color by clicking on the color and selecting another option from the palette. You can also set a prefix for each type. (Note: if you are already using a classification and change the color, when you hit apply the new color will be applied in all existing diagrams within the design.)

You add new types by clicking the green plus (+) sign and then just add in whatever you want and save.

For Data Vault modeling, I add three new types: Hub, Link, and Satellite with the colors you see in the screen shot here.

Using Classification Types to Color Code Your Diagrams

To apply a classification type to an existing table, open the table property dialog and look for the classification types node in the tree (in 4.0). In 3.x, there is a simple classification type drop down on the main property page.

Once applied, the first letter of the classification type will appear in the upper left corner of the table (see screen shot).

Another way I have used this recently was in my current data warehouse project, where I have source, stage, and dimensional tables all in one design. I found I often want to show all three tiers in one diagram (sub view) for a sprint (we are using a SCRUM approach) so the ETL programmers and QA folks have one place to go where they can see how these layers are related. So for this project, I also added source and stage classification types.

So if you have been color coding your diagrams by hand, this tip should save you a bunch of time since you won’t have to pick the colors by hand on each table. Plus the color selection will be more consistent.

If you aren’t color coding, now would be a great time to start!

Bonus Tip: If, like me, you want to be consistent across all your designs with the types and colors, I just figured out I can hack the dl_settings.xml file to copy my classification type customizations from one design to another. Just be sure to exit and then restart SDDM after you update the file for it to take effect.

Have fun coloring your diagram! (Maybe more people will read them)

Kent

The Oracle Data Warrior

 

Agile Data Warehouse Modeling: How to Build a Virtual Type 2 Slowly Changing Dimension

One of the ongoing complaints about many data warehouse projects is that they take too long to deliver. This is one of the main reasons that many of us have tried to adopt methods and techniques (like SCRUM) from the agile software world to improve our ability to deliver data warehouse components more quickly.

So, what activity takes the bulk of development time in a data warehouse project?

Writing (and testing) the ETL code to move and transform the data can take up to 80% of the project resources and time.

So if we can eliminate, or at least curtail, some of the ETL work, we can deliver useful data to the end user faster.

One way to do that would be to virtualize the data marts.

For several years Dan Linstedt and I have discussed the idea of building virtual data marts on top of a Data Vault modeled EDW.

In the last few years I have floated the idea among the Oracle community. Fellow Oracle ACE Stewart Bryson and I even created a presentation this year (for #RMOUG and #KScope13) on how to do this using the Business Model (meta-layer) in OBIEE (It worked great!).

While doing this with a BI tool is one approach, I like to be able to prototype the solution first using Oracle views (that I build in SQL Developer Data Modeler of course).

The approach to modeling a Type 1 SCD this way is very straightforward.

How to do this easily for a Type 2 SCD has evaded me for years, until now.

Building a Virtual Type 2 SCD (VSCD2)

So how do you create a virtual type 2 dimension (that is “Kimball compliant”) on a Data Vault when you have multiple Satellites on one Hub?

(NOTE: the next part assumes you understand Data Vault Data Modeling. If you don’t, start by reading my free white paper, but better still go buy the Data Vault book on LearnDataVault.com)

Here is how:

Build an insert-only PIT (Point-in-Time) table that keeps history. This is sometimes referred to as a historicized PIT table. (See the Super Charge book for an explanation of the types of PIT tables.)

Add a surrogate Primary Key (PK) to the table. The PK of the PIT table will then serve as the PK for the virtual dimension. This meets the standard for classical star schema design to have a surrogate key on Type 2 SCDs.

To build the VSCD2 you now simply create a view that uses the PIT table to join the Hub and all the Satellites together. Here is an example:

-- Virtual Type 2 dimension: one row per PIT snapshot of a customer
create view Dim2_Customer (Customer_Key, Customer_Number, Customer_Name, Customer_Address, Load_DTS)
as
select sat_pit.pit_seq,        -- surrogate key taken from the PIT table
       hub.customer_num,
       sat_1.name,
       sat_2.address,
       sat_pit.load_dts
from   HUB_CUST      hub,
       SAT_CUST_PIT  sat_pit,
       SAT_CUST_NAME sat_1,
       SAT_CUST_ADDR sat_2
where  hub.CSID = sat_pit.CSID
  and  hub.CSID = sat_1.CSID
  and  hub.CSID = sat_2.CSID
  and  sat_pit.NAME_LOAD_DTS    = sat_1.LOAD_DTS     -- Satellite row in effect at the snapshot
  and  sat_pit.ADDRESS_LOAD_DTS = sat_2.LOAD_DTS;
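If you want to try the idea without an Oracle instance, here is a minimal sketch using Python's built-in sqlite3 module. It is not the production load: the sample rows, the column types, and the helper names `latest_load_dts` and `add_pit_snapshot` are all assumptions made for illustration, and the rule for when to take a snapshot is deliberately simplistic. It appends rows to an insert-only PIT table and then reads them back through a view with the same joins as above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hub_cust      (csid INTEGER PRIMARY KEY, customer_num TEXT);
CREATE TABLE sat_cust_name (csid INTEGER, load_dts TEXT, name TEXT);
CREATE TABLE sat_cust_addr (csid INTEGER, load_dts TEXT, address TEXT);

-- Insert-only PIT table; pit_seq is the surrogate key the dimension reuses.
CREATE TABLE sat_cust_pit (
    pit_seq          INTEGER PRIMARY KEY AUTOINCREMENT,
    csid             INTEGER NOT NULL,
    load_dts         TEXT NOT NULL,
    name_load_dts    TEXT NOT NULL,
    address_load_dts TEXT NOT NULL
);

-- Same joins as the Dim2_Customer view in the post.
CREATE VIEW dim2_customer AS
SELECT sat_pit.pit_seq  AS customer_key,
       hub.customer_num AS customer_number,
       sat_1.name       AS customer_name,
       sat_2.address    AS customer_address,
       sat_pit.load_dts AS load_dts
FROM   hub_cust hub, sat_cust_pit sat_pit, sat_cust_name sat_1, sat_cust_addr sat_2
WHERE  hub.csid = sat_pit.csid
  AND  hub.csid = sat_1.csid
  AND  hub.csid = sat_2.csid
  AND  sat_pit.name_load_dts    = sat_1.load_dts
  AND  sat_pit.address_load_dts = sat_2.load_dts;

-- Hypothetical sample data: one customer whose address changes once.
INSERT INTO hub_cust      VALUES (1, 'C-100');
INSERT INTO sat_cust_name VALUES (1, '2013-01-01', 'Acme');
INSERT INTO sat_cust_addr VALUES (1, '2013-01-01', '1 Main St');
INSERT INTO sat_cust_addr VALUES (1, '2013-06-01', '9 Oak Ave');
""")


def latest_load_dts(conn, table, csid, as_of):
    """Most recent load_dts in one Satellite for this Hub key at or before as_of."""
    # Table names cannot be bound as parameters; pass only trusted names.
    sql = f"SELECT MAX(load_dts) FROM {table} WHERE csid = ? AND load_dts <= ?"
    return conn.execute(sql, (csid, as_of)).fetchone()[0]


def add_pit_snapshot(conn, csid, as_of):
    """Append one PIT row; existing PIT rows are never updated."""
    name_dts = latest_load_dts(conn, "sat_cust_name", csid, as_of)
    addr_dts = latest_load_dts(conn, "sat_cust_addr", csid, as_of)
    if name_dts is None or addr_dts is None:
        return None  # simplification: skip keys not yet described by both Satellites
    cur = conn.execute(
        "INSERT INTO sat_cust_pit (csid, load_dts, name_load_dts, address_load_dts) "
        "VALUES (?, ?, ?, ?)",
        (csid, as_of, name_dts, addr_dts),
    )
    return cur.lastrowid


# One snapshot per load date on which something changed for the customer.
add_pit_snapshot(conn, 1, "2013-01-01")
add_pit_snapshot(conn, 1, "2013-06-01")

dim_rows = conn.execute(
    "SELECT * FROM dim2_customer ORDER BY customer_key"
).fetchall()
for row in dim_rows:
    print(row)
```

Each PIT row becomes one version of the customer in the dimension, so the address change shows up as a second row with its own surrogate key while the name is carried forward from the earlier Satellite row.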
 

Benefits of a VSCD2

  1. We can now rapidly demonstrate the contents of a type 2 dim prior to ETL programming
  2. By using PIT tables we don’t need the Load End DTS on the Sats, so the Sats become insert only as well (simpler loads, no update pass required)
  3. Another by-product is that the Sat is now also Hadoop compliant (again, insert only)
  4. Since the nullable Load End DTS is not needed, you can now more easily partition the Sat table by Hub Id and Load DTS.
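A side note on benefit 2: if a downstream consumer still wants an end date, it can be derived at query time instead of being stored and updated. The sketch below is my illustration rather than part of the approach described above. It uses the standard LEAD window function through Python's sqlite3 module (this assumes an SQLite build of 3.25 or later, which is when window functions arrived), and the table contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Insert-only Satellite: no Load End DTS column, rows are never updated.
CREATE TABLE sat_cust_addr (csid INTEGER, load_dts TEXT, address TEXT);
INSERT INTO sat_cust_addr VALUES (1, '2013-01-01', '1 Main St');
INSERT INTO sat_cust_addr VALUES (1, '2013-06-01', '9 Oak Ave');
""")

# The next row's load_dts for the same Hub key acts as this row's end date.
# The current row has no successor, so its derived end date is NULL (None).
sat_rows = conn.execute("""
    SELECT csid,
           load_dts,
           LEAD(load_dts) OVER (PARTITION BY csid ORDER BY load_dts) AS load_end_dts,
           address
    FROM   sat_cust_addr
    ORDER  BY csid, load_dts
""").fetchall()

for row in sat_rows:
    print(row)
```

The same LEAD expression is valid in Oracle, so a query like this could sit in a view over the Sat if you ever need the end date for reporting.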

Objections

The main objection to this approach is that the virtual dimension will perform very poorly. While this may be true for very high volumes, or on poorly tuned or resourced databases, I maintain that with today’s evolving hardware appliances (e.g., Exadata, Exalogic) and the advent of in memory databases, these concerns will soon be a thing of the past.

UPDATE 26-May-2018 – Now, 5 years later, I have successfully done the above on Oracle. But now we also have the Snowflake elastic cloud data warehouse, where all the prior constraints are indeed eliminated. With Snowflake you can easily choose either to instantly add compute power if the view is too slow or to do the work and processing to materialize the view. (end update)

Worst case, after you have validated the data with your users, you can always turn it into a materialized view or a physical table if you must.
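As a rough sketch of that fallback, the snippet below snapshots a view into a physical table using Python's sqlite3 module. SQLite has no materialized views, so CREATE TABLE ... AS SELECT stands in for one here (on Oracle you would more likely reach for CREATE MATERIALIZED VIEW). The `materialize` helper, the view, and the data are all hypothetical.

```python
import sqlite3


def materialize(conn, view_name, table_name):
    """Snapshot a view into a physical table (full refresh each call)."""
    # Identifiers cannot be bound as parameters; pass only trusted names.
    conn.execute(f'DROP TABLE IF EXISTS "{table_name}"')
    conn.execute(f'CREATE TABLE "{table_name}" AS SELECT * FROM "{view_name}"')


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sat_cust_name (csid INTEGER, load_dts TEXT, name TEXT);
INSERT INTO sat_cust_name VALUES (1, '2013-01-01', 'Acme');
CREATE VIEW v_customer AS
SELECT csid AS customer_key, name AS customer_name FROM sat_cust_name;
""")

materialize(conn, "v_customer", "dim_customer_mat")

# New data reaches the view immediately, but not the physical copy.
conn.execute("INSERT INTO sat_cust_name VALUES (2, '2013-02-01', 'Globex')")
view_count = conn.execute("SELECT COUNT(*) FROM v_customer").fetchone()[0]
stale_count = conn.execute("SELECT COUNT(*) FROM dim_customer_mat").fetchone()[0]

# Re-running the snapshot brings the table back in line with the view.
materialize(conn, "v_customer", "dim_customer_mat")
fresh_count = conn.execute("SELECT COUNT(*) FROM dim_customer_mat").fetchone()[0]

print(view_count, stale_count, fresh_count)
```

The trade-off is visible in the counts: the physical copy is faster to query but goes stale until it is refreshed, which is exactly the refresh work the virtual dimension let you postpone.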

So what do you think? Have you ever tried something like this? Let me know in the comments.

Get virtual, get agile!

Kent

The Data Warrior

P.S. I am giving a talk on Agile Data Warehouse Modeling at the East Coast Oracle Conference this week. If you are there, look me up and we can discuss this post in person!

KScope13 Day Two: Wine to Water and Other Transformations

So day two in New Orleans at the ODTUG KScope13 event was another big day.

I am going to start out at the end of the day with the General Session update so if you don’t have time to read the whole post you can read the really important and interesting stuff first.

General Session and Keynote

First the fun part, we got greeted by a live New Orleans Jazz band.

We had a live band in the lobby to greet attendees before the general session and keynote.

That was great fun. They then led us all into the grand ballroom for the general session and then went out and led in our board of directors and the conference committee all dancing up a storm in true NOLA fashion.

The general session gets opened with the board and conference committee being led on stage marching/dancing behind a live New Orleans Marching Jazz Band

ODTUG Announcements and Award Winners

Every year ODTUG gives out a number of awards so I want to recognize the winners here:

Editors Choice Award for Best White Paper went to David Schleis.

The Oracle Contributor of the Year (which goes to an Oracle Corp employee) went to my good buddy, Jeff Smith.

The ODTUG Volunteer Award went to Mack McCasland, who has been working behind the scenes at our events for over 10 years (and he is retired!).

In addition to these awards, Oracle also announced the promotion of my good friend John King to the status of Oracle ACE Director.

The big announcement: KScope14 will be in Seattle, Washington, USA on June 22-26, 2014. The conference hotel will be the Sheraton in downtown Seattle.

Wine to Water

The big deal for the night was our keynote speaker, Doc Hendley. He is a bartender who decided he wanted to make a much bigger impact on the world and ended up founding an organization that now brings clean water to people in over 15 countries.

The statistics he gave us on how many people in the world do not have clean, safe water to drink (over 1 billion!) were stunning, as was the fact that more people die from lack of clean water than have died in all the recent wars put together. Another startling fact is that even though dirty water is the biggest killer on the planet, dealing with it for the poor of the world gets less than 20% of the funding when compared with funding for HIV, malaria, and TB.

He has a very moving and passionate story about how he got to that place in his life where he found his real purpose, discovered these facts, and set out to do something about it. His talk (and book) tell the whole story. There were a few teary eyes by the end of his talk. Doc has shown amazing courage and perseverance in the pursuit of making a difference.

He really has proven that one very ordinary person can have a large impact on the lives of others if they really set their mind to it.

I encourage you to go over to his site and learn about his mission, his story, and his organization Wine to Water.

Maybe you can help him make a difference.

Doc Hendley, Founder of Wine to Water, gives a moving and inspirational address as our keynote speaker.

The rest of my day

So back to earlier in the day (for those still with me here)…

Started as always with my morning chi gung class. The group grew to about 14 people with a few new folks joining us. We had a few passersby stop to watch and try a few moves as well.

After a healthy breakfast and a shower I did my first talk of the event, Five Ways to Make Data Modeling Fun. There were about 20 folks in the session and we all had a good time trying out my ideas.

Then I headed over to hear my friend Jeff Smith talk about SQL Developer (my 2nd favorite Oracle product).

Oracle Senior Product Manager (and ODTUG Oracle Contributor of the Year) Jeff Smith shows us his top tips and tricks for SQL Developer

After that it was another awesome lunch (beet salad and redfish!) then on to Mark Rittman’s session about OBIEE, Endeca, and his take on the overall landscape of Oracle BI and data discovery in the new world of NoSQL and Hadoop.

In Mark Rittman’s session he talked a bit about Oracle’s strategy around business analytics

Always a good idea to get Mark’s take on things BI.

Lastly (before the general session that is), I did my second presentation along with Stewart Bryson. We introduced folks to the idea of using OBIEE on top of a Data Vault Data Warehouse and showed how it conformed to Oracle’s reference architecture while at the same time enabled an agile approach to BI.

Oracle ACE Stewart Bryson talks about how he used OBIEE to create a virtual data mart on top of a Data Vault style EDW model

I can’t thank Stewart enough for taking on the challenge to learn Data Vault and figuring out how to use it effectively in OBIEE. His approach works very well and should really enable organizations to truly leverage their data and create an agile BI/DW framework.

That’s it for today’s report. I should have another report for you tomorrow on activities today!

Cheers.

Kent

P.S. Yes there was eating and drinking around the French Quarter after hours. Even got to have a drink with Doc Hendley and his wife on Bourbon Street. That was a nice treat.
