The Data Warrior

Changing the world, one data model at a time. How can I help you?

Archive for the tag “Data Warehouse”

Data Vault vs. The World (1 of 3)

Okay, maybe not “the world” but it does sometimes seem like it.

Even though the Data Vault has been around for well over 10 years now, and has multiple books, videos, and tons of success stories, I am constantly asked to compare and contrast Data Vault to approaches generally accepted in the industry.

What’s up with that?

When was the last time you got asked to justify using a star schema for your data warehouse project?

Or when was that expensive consulting firm even asked “so what data modeling technique do you recommend for our data warehouse?”

Oh…like never.

Such is the life of the “new guy.” (If you are new to Data Vault, read this first.)

So, over the next few posts, I am going to lay out some of the explanations and justifications I use when comparing Data Vault to other approaches to data warehousing.

The first contestant: Poor man’s ODS vs. Data Vault

This approach entails simply replicating the operational (OLTP) tables to another server for read only reporting. This could be used as a partial data warehouse solution using something like Oracle’s GoldenGate to support near real time operational reporting that would minimize impact on the operational system.

This solution, however, does not adequately support the need for dimensional analysis, nor would it allow for tracking historical changes to the data (beyond any temporal tracking inherent in the OLTP data model).

A big risk of this approach is that as the OLTP structures continue to morph and change over time, reports and other extracts that access the changed structures would of course break as soon as the change was replicated to the ODS.

How does Data Vault handle this?

Data Vault avoids these problems by using structures that are not tightly coupled to any one source system. So as the source systems change we simply add Satellite and Link structures as needed.  In the Data Vault methodology we do not drop any existing structures so reports will continue to work until we can properly rewrite them to take advantage of the new structure.  If there is totally new data added to a source, we would probably end up adding new Hubs as well.
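To make that a bit more concrete, here is a minimal sketch of the "add, don't drop" idea using Python and an in-memory SQLite database. Everything in it is my own illustration, not anything from the official standards: the table names (hub_customer, sat_customer_name, sat_customer_loyalty), the columns, the sample data, and the MD5 hash key are all assumptions made for the example. It shows a report query written against a Hub and one Satellite, then a new Satellite being added when the source grows a new attribute, with the original report untouched.

```python
import hashlib
import sqlite3


def hash_key(business_key: str) -> str:
    """Derive a surrogate key from the business key (MD5 is an assumption here)."""
    return hashlib.md5(business_key.upper().encode("utf-8")).hexdigest()


conn = sqlite3.connect(":memory:")

# A Hub holds the business key; a Satellite holds descriptive attributes over time.
conn.executescript("""
CREATE TABLE hub_customer (
    customer_hk   TEXT PRIMARY KEY,
    customer_bk   TEXT NOT NULL UNIQUE,
    load_dts      TEXT NOT NULL,
    record_source TEXT NOT NULL
);
CREATE TABLE sat_customer_name (
    customer_hk TEXT NOT NULL REFERENCES hub_customer(customer_hk),
    load_dts    TEXT NOT NULL,
    name        TEXT,
    PRIMARY KEY (customer_hk, load_dts)
);
""")

hk = hash_key("CUST-001")
conn.execute("INSERT INTO hub_customer VALUES (?, ?, ?, ?)",
             (hk, "CUST-001", "2013-01-01", "CRM"))
# Two satellite rows: a change is inserted as a new row, never updated in place.
conn.execute("INSERT INTO sat_customer_name VALUES (?, ?, ?)",
             (hk, "2013-01-01", "Acme Corp"))
conn.execute("INSERT INTO sat_customer_name VALUES (?, ?, ?)",
             (hk, "2013-02-01", "Acme Corporation"))

# An existing report: the most recent name for each customer.
REPORT_SQL = """
SELECT h.customer_bk, s.name
FROM hub_customer h
JOIN sat_customer_name s ON s.customer_hk = h.customer_hk
WHERE s.load_dts = (SELECT MAX(load_dts)
                    FROM sat_customer_name
                    WHERE customer_hk = h.customer_hk)
ORDER BY h.customer_bk
"""
before = conn.execute(REPORT_SQL).fetchall()

# The source system adds a loyalty tier. We add a new Satellite
# and leave the existing structures alone.
conn.executescript("""
CREATE TABLE sat_customer_loyalty (
    customer_hk  TEXT NOT NULL REFERENCES hub_customer(customer_hk),
    load_dts     TEXT NOT NULL,
    loyalty_tier TEXT,
    PRIMARY KEY (customer_hk, load_dts)
);
""")
conn.execute("INSERT INTO sat_customer_loyalty VALUES (?, ?, ?)",
             (hk, "2013-03-01", "GOLD"))

# The old report runs unchanged and returns the same rows.
after = conn.execute(REPORT_SQL).fetchall()
history_rows = conn.execute(
    "SELECT COUNT(*) FROM sat_customer_name").fetchone()[0]
```

The point of the sketch is only that the change is additive: the old name is still in the Satellite as history, and nothing the report depends on was altered. A real load would of course have more metadata and proper change detection.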

An additional advantage is that because Data Vault uses this loosely coupled approach we can load data from multiple sources. If we replicate specific OLTP structures, we would not be able to easily integrate other source system feeds – we would have to build another repository to do the integration (which would likely entail duplicating quite a bit of the data).
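As a rough illustration of that integration point, the sketch below loads customer business keys from two made-up sources ("CRM" and "BILLING") into one Hub. The table layout, the load_hub helper, the sample keys, and the MD5 hash key are all assumptions for the example. A key that both systems know about ends up as a single Hub row, which is what lets Satellites from different sources hang off the same business concept.

```python
import hashlib
import sqlite3


def hash_key(business_key: str) -> str:
    """Derive a surrogate key from the business key (MD5 is an assumption here)."""
    return hashlib.md5(business_key.upper().encode("utf-8")).hexdigest()


def load_hub(conn, business_keys, record_source, load_dts):
    """Insert only business keys the Hub has not seen yet (hypothetical helper)."""
    conn.executemany(
        "INSERT OR IGNORE INTO hub_customer VALUES (?, ?, ?, ?)",
        [(hash_key(bk), bk, load_dts, record_source) for bk in business_keys],
    )


conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE hub_customer (
    customer_hk   TEXT PRIMARY KEY,
    customer_bk   TEXT NOT NULL UNIQUE,
    load_dts      TEXT NOT NULL,
    record_source TEXT NOT NULL
)
""")

# Two independent feeds that overlap on CUST-002.
load_hub(conn, ["CUST-001", "CUST-002"], "CRM", "2013-01-01")
load_hub(conn, ["CUST-002", "CUST-003"], "BILLING", "2013-01-02")

rows = conn.execute(
    "SELECT customer_bk, record_source FROM hub_customer ORDER BY customer_bk"
).fetchall()
```

After both loads the Hub holds three customers, and CUST-002 keeps the record source of the feed that delivered it first. Each source would then typically get its own Satellite keyed on the same customer_hk.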

Don’t get me wrong, there is nothing wrong with using replication tools to build real time operational data stores.

In fact it is an excellent solution to getting your operational reporting offloaded from the main production server.

It is a tried and true solution – for a specific problem.

It is, however, not the right solution if you are building an enterprise data warehouse and need to integrate multiple sources or need to report on changes to your data over time.

So let’s use the right tool for the right job.

Data Vault is a newer, better tool.

In the next two posts I will compare Data Vault to the Kimball-style dimensional approach (part 2 of 3) and then to Inmon-style 3NF (part 3 of 3).

Stay tuned.

Kent

P.S. Be sure to sign up to follow my blog so you don’t miss the next round of Data Vault vs. The World.

 

Happy 2013! What will you do this year?

Happy New Year! Welcome to year #2 of the Oracle Data Warrior.

I hope everyone is looking forward to a bright, happy, and successful year (however you measure it).

For me it will be a year of figuring out my long term business model (maybe?), writing a few more short ebooks (stay tuned), doing my Oracle ACE Director thing, continuing to work as a Data Vault and Data Warehouse advisor and consultant,  presenting at RMOUG, KScope13, and hopefully a few other choice events, and of course writing on this blog (and practicing my martial arts).

That ought to do it, don’t you think?

But you never know what life may throw your way, so I am not tied to any of that really, but that is where my wave seems to be heading today.

One thing I have already done was to take advantage of Vizify to build a visual story about myself. I really like the look and feel of the app and the way it presents my information. Check out the animation on the location page and then the timeline on the career page (which is not quite complete yet). Very cool.

How about you? What is on your horizon for 2013?

Cowabunga!

Kent

P.S. See this cool 2012 report WordPress generated automatically. It covers the stats I put in my last post but with a much nicer presentation. 😉

The WordPress.com stats helper monkeys prepared a 2012 annual report for this blog.

Here’s an excerpt:

4,329 films were submitted to the 2012 Cannes Film Festival. This blog had 23,000 views in 2012. If each view were a film, this blog would power 5 Film Festivals

Click here to see the complete report.

2012: Year in the Life of an Oracle Data Warrior

Hard to believe it is nearly the end of the year. But…it is here.

I will be taking time off until the end of the year so I am doing my “year-end” post now.

It was a significant year for me with many new things, events, conferences, and clients. Here is a list, by month of a few of them:

January

I launched this blog – Oracle Data Warrior! At the stroke of midnight on January 1, I hit publish for this posting. So far I have had over 22,000 views on the site with the best/biggest day drawing 294 views on September 24th. People came to check out a free promotion for my new Kindle book.

So far 78 of you have subscribed to this blog and hence get notification whenever I post something new.

Thanks for your support! (For the rest – subscribe now so you don’t miss anything in 2013).

In January I also launched the Year of the Data Vault by going to Dan Linstedt’s Data Vault certification class in Montreal. It was a great class. Check the January archive for my posts about the class.

February

I posted what has turned out to be THE most popular article so far: The best FREE data modeling tool ever. So far it has had 8,213 views! Wow! (of course since a bunch of you just clicked the link that number has gone up again)

Also big in February (every year) is the RMOUG Training Days in Denver, Colorado. This year I did the first ever remote presentation via Skype as part of their pre-conference seminar on data warehousing. My presentation was, of course, on Data Vault. There were a few technical issues but with the help of my good friend Jerry Ireland we got through it fine.

(Note: For RMOUG 2013, I will actually be presenting in person).

March

Two really big things this month:

  1. I filed with the state of Texas and formed Data Warrior LLC, signed my very first 1099 (independent) contract and became an official business.
  2. The Data Vault Training Portal was launched. You can read my post about that here.

April

Business wise, I started the 1099 contract work at MD Anderson Cancer Center and got to work building a data vault for one of their internal projects.

On the blog, I made some modifications to the layout and added a War Chest page with links to some resources that cost a little money (as opposed to my White Paper page which has Free stuff).

May

After one month of being an independent contractor I bought my first smartphone – an LG Nitro. I am not really a huge gadget guy so I had put this off for some time but finally gave in so I could tweet at the upcoming ODTUG conference in San Antonio.

Of course this means I signed up for Twitter. You can find me there at https://twitter.com/KentGraziano.

June

June was a HUGE month.

  1. The Data Vault modeling book hit #1 on Kindle.
  2. I got “promoted” to Oracle ACE Director (and found out via a Facebook post!).
  3. And of course there was KScope12 in San Antonio, Texas. I taught Chi Gung every morning at 7 AM and blogged about the event every night (at about midnight). Just check my June archives for all the posts and plenty of pictures.

July

Slowed down a bit here. Recovered from KScope12 (started planning for KScope13). Wrote a bit about work/life balance and posted this cool InfoGraphic.

August

Another first for me in August: I published my first eBook on Kindle, about data model design reviews.

Then we had an excellent family vacation with my father back east. We drove through the Adirondack Mountains in New York State and then to the Green Mountains of Vermont where we stayed at the Trapp Family Lodge. It gets my highest recommendation for a family friendly, environmentally aware, upscale, outdoor vacation resort. Pay the money and go – you only live once!

While on the trip, my nine year old son came up with a great idea for a blog post: How to make data modeling fun. When we got back, I wrote and posted it here. (Soon it will be a presentation at a conference near you)

September

This was another big and fun month – all about Oracle Open World 2012 and getting to attend my first Oracle ACE Director meeting at Oracle HQ. Like at KScope, I blogged every night in the wee hours to capture what I saw and learned that day. The smart phone got a lot of use taking pictures in session and around San Francisco. It is all logged in the September archives.

October

Actually OOW 2012 bled over into October so there are even more posts and pictures in the October Archive folder.

The other biggie in October was that I finished out my contract at MD Anderson Cancer Center and started a new gig at McKesson Specialty Health (US Oncology). This has turned out to be a great project with a good team (like I had at MD Anderson), but with the added benefit of only being 9 miles from my house. This is the shortest commute I have had since college! Saves me 2.5 hours a day in driving.

Needless to say, that is a very nice aspect of the job.

November

This month was less about data (and my normal work) and more about fitness, a new habit, and being a warrior. (Though I did get accepted to present at the RMOUG Training Days in Denver.)

The highlight of the month was attending the 20th Anniversary celebration for the International Combat Hapkido Federation. I have been attending their workshops and seminars for over 15 of those years and have had the privilege to train with several of their masters as well as their founder and grand master John Pellegrini. Combat Hapkido is a very practical martial art for self-defense and a lot of fun to learn and practice.

It was a great event with back to back workshops (i.e., work outs!) with many masters and grand masters. We got training in Tai Chi, stretching, conditioning, kicking, Filipino Escrima, ground survival, and pressure points. There were actual martial arts celebs in attendance including Bill “Superfoot” Wallace, Cynthia Rothrock, and Stephen Hayes.

Since my main art is Tae Kwon Do, I was very privileged to meet and train with Grandmaster Bill Wallace (who actually has signed my last two black belt certificates along with GM Pellegrini). GM Wallace’s session was challenging and fun. He is quite entertaining.

Me (right) with GM Superfoot Wallace (center) and Master Ramon Voils

At 67 years old, GM Wallace can kick faster and higher than pretty much everyone I have ever trained with. I can only hope to be doing so well when I reach that age.

This is why he is called “Superfoot”

For more pictures from the event, you can subscribe to my newsfeed on Facebook or like my page. You might even find a picture of me in a suit!

December

And now we are up to this final month of 2012. I have been very busy with my work at McKesson so have only got one post out about the newest release of SQL Developer Data Modeler (which I use nearly every day!).

I did however recently get notification that I had several papers accepted for presentation at the ODTUG  KScope13 conference in New Orleans next June. Be sure to register for that event too!

Yes it was quite the busy year…

Stay tuned for 2013 and see what happens.

Merry Christmas and Happy New Year!

Kent

The Oracle Data Warrior

List of Top Data Vault Resources (UPDATED 2016)

As I finished out my latest contract, my team mates wanted to know where they could go to get their data vault questions answered (besides emailing me!).

So I put together this list for them and figured the readers of my blog would probably like to see the same list.

Here it is!

My Stuff

Introduction to Data Vault 1.0 (pdf):

https://kentgraziano.com/white-papers/

Book:

Intro to Agile Data Engineering Using Data Vault 2.0

Slides

Introduction to Data Vault and Why Data Vault? (ppt):

http://www.slideshare.net/kgraziano/why-data-vault?

http://www.slideshare.net/kgraziano/agile-data-warehouse-modeling-introduction-to-data-vault-data-modeling

Dan’s Data Vault Books

The NEW Data Vault 2.0 Book:

http://www.amazon.com/Building-Scalable-Data-Warehouse-Vault-ebook/dp/B015KKYFGO/

The Data Vault Modeling book (DV 1.0):

http://www.amazon.com/Super-Charge-Your-Data-Warehouse/dp/1463778686/

The Data Vault Modeling book – Kindle version:

http://www.amazon.com/Super-Charge-Your-Warehouse-ebook/dp/B00853265G/

The Data Vault Modeling book – downloadable PDF version:

http://learndatavault.com/books/super-charge-your-data-warehouse/

Data Vault Implementation using Pentaho (DV 1.0):

http://www.lulu.com/shop/peter-van-til/implementing-a-datavault-architecture-with-pentaho-data-integration/paperback/product-17580260.html

Around the Web

Dan has two online classes for Implementing Data Vault (1.0):

  1. Using Informatica. You can see that here.
  2. Using SQL. You can see that here.

Dan’s main site and blog – Subscribe to this to get email updates/announcements regarding data vault:

http://danlinstedt.com/

Best overall source of Q&A – Data Vault Discussion group on LinkedIn:

http://www.linkedin.com/groups?gid=44926&trk=hb_side_g

Martin Evers, data vault expert from Europe (just one of his articles):

http://dm-unseen.blogspot.nl/2012/10/data-vault-business-key-mutations-matter.html

On YouTube

Data Vault videos from Dan (and Sanjay):

http://www.youtube.com/user/learndatavault

Older videos (includes RapidACE demo):

http://www.youtube.com/user/dlinstedt/videos?sort=dd&flow=list&page=1&view=0

Data Vault Architecture:

http://www.youtube.com/watch?v=WmFENnqgoS0&feature=youtu.be&a  (BTW – turn the volume down first. The “theme” music is loud)

Well, those are the main ones for now.

What’s your favorite?

Enjoy!

Kent

The Data Warrior

Data Vault Master and CDVP2

Authorized DV Bootcamp Instructor

Standards? We Don’t Need No Stinking Standards!

Well, actually we do need standards.

Especially if we want to have any consistency in the systems we develop, or the models we build.

For years people in the data warehouse arena have literally begged Dan Linstedt, inventor of the Data Vault Model and Methodology, to create books and training materials on Data Vault.

They wanted to know how he got the results he was getting for his clients.

They wanted to understand how to properly build a Data Vault.

They wanted STANDARDS.

Well, ask and ye shall receive.

Recently Dan put this on his blog:

For many many years I have written and maintained Data Vault Standards v1.x.  Well, I’ve released them on Amazon for you.  These are the DV1.0 Standards, and are the same standards document I used to hand out in my certification classes.

Apparently there are folks out there who either don’t know about the standards, or who have had some confusion over the fact that there ever were standards.

I wanted to make it free – but unfortunately I was not able to do that.  So, I’ve made the price to be $0.99 USD on Amazon.  Again, these are Data Vault 1.0 Modeling Standards and Data Vault 1.0 Loading Standards.

via Data Vault Modeling & Methodology – Data Vault 1.0 Modeling and Loading Standards.

So there you have it – the official standards for Data Vault 1.0.

Go get ’em here.

Read ’em.

Use ’em.

Your data warehouse will thank you.

Kent

Oracle Data Warrior
