The Data Warrior

Changing the world, one data model at a time. How can I help you?

Archive for the category “Data Vault”

Live from the 1st Annual World Wide Data Vault Consortium: Day 3

Well it was the last day of the 1st annual WWDVC. What an event is has been. (See recap of Day 1 and Day 2, here and here)

The sign outside the meeting room

The sign outside the meeting room

Don’t forget you can see all the action by searching on #WWDVC on twitter.

Agility and Data Vault

Long time data vault advocate Tom Breur opened our closing day with a talk about how we should strive to build the right product and build it right without creating more technical debt in the process.

He said agile is about taking small steps not about being faster. If we do this right, the solution we deliver should generate more legitimate business requirements.

He encouraged us all to read the Theory of Constraints and The Goal (I have), and to learn about lean delivery. Our goal should be to deliver continuously and consistently. The shorter the sprint, the better.

Next he went on to tell us his conclusion that while building a Kimball based solution may appear to deliver more value to the business, it takes too long. And in the end it is a fragile and rigid solution subject to major re-work when requirements change after deployment.

We can deliver value, quicker, using the data vault method. And what we deliver can be done incrementally and more easily added to without re-work.

Tom drew this to show how much quicker we can deliver some value sooner

Tom drew this to show how much quicker we can deliver some value sooner

Data Vault Case Studies

John Sells and Josh Bartells from Data Blueprint shared with us their experiences implementing data vault solutions for their clients.

John and Josh several successful DV projects

John and Josh share several successful DV projects

Why Data Blueprint decided to use DV as their consulting

Why Data Blueprint decided to use DV as their consulting

The gujys from DataBlueprint discuss hurdles and objections they encountered selling DV to clients

The guys from DataBlueprint discuss hurdles and objections they encountered selling DV to clients

It was great to hear about their success stories from the field and see how they addressed the challenges many of us have faced.

Data Vault Modeling Tool

MID GmbH from Germany has a pretty nice modeling tool with built in capabilities to support modeling a data vault solution from stage tables through reporting (including using the new DV icons).

Nice ability to visually show relationships across diagram types

Nice ability to visually show relationships across diagram types

MID's modeler has nice diagram to show all sorts of metadata relationships

MID’s modeler has nice diagram to show all sorts of metadata relationships

This is a tool worth checking out if you are doing a lot of data vault modeling.

Concluding Remarks

Dan closed out this inaugural event with a few remarks, some memories, thanks, and talk about plans for doing this again next year.

Dan closes out the 1st Annual WWDVC and asks about next year - where and when?

Dan closes out the 1st Annual WWDVC and asks about next year – where and when?

Guess I need to be careful what I say! :-)

Guess I need to be careful what I say! 🙂

One last time I have to say I am glad I came and can’t wait to do it again next year!

Snowy St Albans

Snowy St Albans

So long for now from Vermont.

Don’t forget to check out LearnDataVault.com and get ready to join us next year!

Kent

Live from the 1st Annual World Wide Data Vault Consortium: Day 2

Wow has is been an amazing event with many amazing people from around the globe. I am VERY glad that I took the time and came to St Albans for this inaugural event.

As with Day 1, there is just too much great information for me to adequately cover it in a blog post (or 10!), so I will give you some highlights and lots of pictures.

The good news is that sometime in the next few months you will be able to purchase access to a video of the entire event on LearnDataVault (then you will be really bummed about not having come in person). Dan and Sanjay figure with their schedules it will take a few months to produce a top quality video. The good news is we had professional videographers for all three days and they filmed every keynote and every talk.

I will let you know when the video is ready.

Dan Does a Deep Dive on Data Vault 2.0

Due to popular demand, Dan actually changed the agenda and made the first session of Day 2 a detailed look at some of the more important aspects of Data Vault 2.0. This is material that he has not really written much about and is only available otherwise in his DV 2.0 bootcamp class.

Here are some of the highlights (again to see even more details check #WWDVC on twitter).

Opposing goals between DW storage and Mart presentation have led to many failed DW/BI projects.

Opposing goals between DW storage and Mart presentation have led to many failed DW/BI projects.

BTW – Dan would like to see us stop calling the reporting side “Data Marts” and start calling them “Information Marts”. To be agile we have to stop mixing the raw data storage (EDW) with the reporting that has applied business rules.

Dan went on a bit of a rant about the need to measure things.

DV 2.0 Methodology helps us be more precise to be more successful

DV 2.0 Methodology helps us be more precise to be more successful

Then we had a discussion about DV 2.0 architecture and methodology and how it supports and fits into the knowledge pyramid.

Dan introduces us to  DV 2.0 and the Knowledge Pyramid

Dan introduces us to DV 2.0 and the Knowledge Pyramid

The DV 2.0 Architecture

The DV 2.0 Architecture

The some discussion about DV 2.0 agility backed up by a real user case study.

A DV success at Qsuper

A DV success at Qsuper

It sure was great to see real numbers on a DV 2.0 success story!

So how did they do it? Dan then showed us his recommended approach to requirements gathering that helps the process become more agile.

DV 2.0 approach to getting better requirements faster

DV 2.0 approach to getting better requirements faster

Then finally another rant about how to get better performance from our systems.

Dan's Rules of Performance

Dan’s Rules of Performance

Roelant Vos (Analytics8) talks about DV Automation

Characteristics of ETL for Data Vault

Characteristics of ETL for Data Vault

Roelant Vos discusses  generating Data Vault ETL from meta data

Roelant Vos discusses generating Data Vault ETL from meta data

Roelant uses Lego as a great analogy for the patterns in Data Vault and why it is possible to auto-create ETL code.

Model driven ETL generation is possible

Model driven ETL generation is possible

What is needed  to support generation of ETL

What is needed to support generation of ETL

With all this in mind, Roelant has built a nice little kit to actually generate DV ETL code for his clients. Nice job! You can follow Roelant on twitter here.

Doug Needham (ClearMeasure) blows our minds with Data Vault Math.

There were a ton of tweets about this session. Doug has done some incredibly innovative work on using mathematics to determine the quality/completeness of a data vault model and some slick ways to validate that it has been loaded correctly. Lots of talk about graph theory and calculating edges ensued.

Doug blows our minds with math

Doug blows our minds with math

Research on how to determine the driving key for a Link

Research on how to determine the driving key for a Link

More calculations from Doug - vector math!

More calculations from Doug – vector math!

This is the first time I have met Doug, a former marine corp DBA. We have found we have a lot of experiences and thoughts in common, including a common client! He even grew up in Texas in the town where I now live!

In this session Doug famously said “If you do not know what a hypothesis, control group or A/B test is, you are NOT a data scientist.”

One of the great things about these event is the people you meet. I am glad to have met Doug.

Using Oracle SQL Developer Data Modeler for DV

After a much needed brain break from Doug’s talk and some networking time, I got to take the floor again to show my favorite FREE data modeling tool. I did my usual top 10 type talk but with a slant towards how I leverage all those features to support a data vault modeling project.

Data Vault Diversity: The former Marine mathematician and the long haired enviromentalist

Data Vault Diversity: The former Marine mathematician and the long haired environmentalist

There were lots of good questions and interaction and lots of interest in how I have built virtual data marts on top of data vault warehouses.

Starting my SDDM intro talk

Starting my SDDM intro talk

And that was it for Day 2. We all took a break then had a happy hour and dinner (with demo) sponsored by AnalytixDS. Good food and good fun!

Data Vault geeks from around the world in snowy St Albans, Vermont

Data Vault geeks from around the world in snowy St Albans, Vermont

More to come on Day 3!

Kent

Live from the 1st Annual World Wide Data Vault Consortium: Day 1

It has been an amazing and historical day in chilly St Albans, Vermont on this first day of Spring.

In a little hotel on the edge of a little town is a gathering of some of the brightest minds in the world of data vault. They have come from around the USA, Canada, Germany, Australia, The Netherlands, and Norway.

Given the location and distance to get here, these are some dedicated folks intent on being the best they can be at their chosen profession.

Like any great user group meeting there was a keynote, user sessions, and lots of networking.

Look at #WWDVC to get the details of all the action for the day. Below I will provide a few pictures to highlight the day.

Dan Linstedt opens the 1st Annual World Wide Data Vault Consortium

Dan Linstedt opens the 1st Annual World Wide Data Vault Consortium

As part of his keynote Dan gave us some highlights from Data Vault 2.0.

DV 2.0 is a complete system for developing an enterprise DW/BI solution

DV 2.0 is a complete system for developing an enterprise DW/BI solution

Part of DV 2.0 is the methodology has officially adopted an agile approach

Part of DV 2.0 is the methodology has officially adopted an agile approach

Of course there were some announcements …

Dan had a lot of partnerships to announce

Dan had a lot of partnerships to announce

Next we heard from Dan’s business partner Sanjay about how  DV 2.0 works with “big data.”

Sanjay talks about using Hadoop and NoSQL with Data Vault 2.0

Sanjay talks about using Hadoop and NoSQL with Data Vault 2.0

In DV 2.0 it is possible to have satellites reside on Hadoop

In DV 2.0 it is possible to have satellites reside on Hadoop

In DV 2.0 using hashed keys allows linking to data in a NoSQL db

In DV 2.0 using hashed keys allows linking to data in a NoSQL db

After a great networking session and a good lunch I gave a 2 hour session on using standards for data modeling (with a few side rants and lots of Q&A).

//

Staring my talk about saving $$$ by using standards

Staring my talk about saving $$$ by using standards

Next we had a great Skype session with Bill Inmon with lots of Q&A.

And finally we had a talk from my friend Rafael about generating data vaults using WhereScape.

Raphael from WhereScape shows us Data Vault automation

Raphael from WhereScape shows us Data Vault automation

Well, that’s it for now. More tomorrow.

Join us on twitter to follow all the action live. (click the links to go right to Day 2 and Day 3 posts)

Kent

Better Data Modeling: The Oracle Data Warrior Speaks!

Looks like I will be doing a bit of speaking this year at a number of  events around the country, and later on, the globe (more on that later).

As usually all my talks will center around using SQL Developer Data Modeler, data modeling standards, Data Vault, agile, or a combination of all of the above.

If you have budget and time, please come out to at least one of these events this year, I would love to meet you in person and talk about the world of Oracle and Data Modeling.

If you aren’t planning to attend one of these – WHY NOT?

These are all great events with tons of learning opportunities. The networking alone is worth the price of admission.

Here is a list of the first three events confirmed on my calendar (and SURPRISE – they are NOT all Oracle related events):

RMOUG Training Days

In less than two weeks: The Rocky Mountain Oracle Users Group Training Days 2014 in Denver, Colorado. This runs from Feb 5-7 , will have at least 1,000 people and you cannot beat the price.I will be presenting Friday at 1:30PM on how I save my clients big $$ by applying repeatable processes and standards to my data models.

Follow it on twitter with #RMTD14.

Data Vault Consortium

Next up March 20 – 22nd , I will be participating in the 1st ever World Wide Data Vault Consortium and User Group meetup in beautiful northern Vermont near the home of my good friend, the inventor of the The Data Vault Model and Methodology, Dan Linstedt. I will be speaking about agile and data warehousing, using SDDM to do Data Vault modeling, and no doubt engaging in some lively debates with Data Vault experts from around the globe. Check out the agenda on the event page for more details on who will be speaking (hint: Bill Inmon, father of  data warehousing is participating!).

Enterprise Data World 2014

The #EDW14 event is really the annual conference put on by DAMA International and the speaker list is a veritable who’s-who of the data architecture and modeling world. This year the event is in Austin, Texas on April 27 – May 1. Since that is quite close to where I live, I figured I would submit an abstract and I was honored to be accepted. I have attended this event only once before when it was in Denver (a long time ago!) and have been a member of DAMA on and off for years, but this is the first time I have been asked to speak. I am looking forward to it for sure (not sure how I will fit my talk into a 45 minute slot!). Sign up for it here.

If you are planning to attend any of these, drop me a line over Twitter or LinkedIn so we can plan to meet up.

Later.

Kent

The Oracle Data Warrior

Better Data Modeling: Color Code Your Data Model Diagrams using #SQLDevModeler

One of the standards I recommend in my book  Check List for Doing Data Model Design Reviews is to use color in your diagrams to visually differentiate types of entities or tables.

As luck would have it, Oracle SQL Developer Data Modeler has a feature that makes this very easy. It is Classification Types.

In the latest version. 4.0, you set these up by going to the context menu on the Design level. From that menu pick Properties. Once on the property dialog go to Settings -> Diagram -> Classification Types. (In 3.x look under Tools -> Preferences)

The default install comes with a bunch already – fact, dimensions, logging, summary, and temporary. Each has a pre-set color assigned. You can change that color by clicking on the color and selecting another option from the pallet. You can also set a prefix for each type. (Note – if you are already using a classification and change the color, when you hit apply the new color will be applied in all existing diagrams within the design.)

You add new types by clicking the green plus (+) sign and then just add in whatever you want and save.

For Data Vault modeling, I add three new types: Hub, Link, and Satellite with the colors you see in the screen shot here.

Using Classification Types to Color Code Your Diagrams

Using Classification Types to Color Code Your Diagrams

To apply a classification type to an existing table, open the table property dialog and look for the classification types node in the tree (in 4.0). In 3.x, there is a simple classification type drop down on the main property page.

Once applied, the first letter of the classification type will appear in the upper left corner of the table (see screen shot).

Another way I have used this recently was in my current data warehouse project where I have source, stage, and dimensional tables all in one design. I found I often want to show all three tiers in on diagram (sub view) for a sprint (we are using a SCRUM approach) to help the ETL programmers and QA folks have one place to go where they can see how these layers are related. So for this project, I also added a source and stage classification type.

So if you have been color coding you diagrams by hand, this tip should save you a bunch of time since you won’t have to pick the colors by hand on each table. Plus the color selection will be more consistent.

If you aren’t color coding, now would be a great time to start!

Bonus Tip: If, like me, you want to be consistent across all your designs with the types and colors, I just figured out I can hack the dl_settings.xml file to copy my classification type customizations from one design to another. Just be sure to exit and then restart SDDM after you update the file for it to take effect.

Have fun coloring your diagram! (Maybe more people will read them)

Kent

The Oracle Data Warrior

 

Post Navigation