The Data Warrior


The Hills were Alive with the Sounds of #DataVault

Yes, folks, a few weeks back we held the 2nd Annual World Wide Data Vault Consortium (#WWDVC) at the lovely Trapp Family Lodge outside Stowe, Vermont. What a great venue! Beautiful scenery, near-perfect weather, great food, and great beer (they have their own brewery). Standing on the hillside, it is easy to see why the Von Trapp Family Singers (you know, from “The Sound of Music”) decided to settle here to build their new life in America.

What a view!

Of course the learning and networking were outstanding again. This year was even better than last year.

Why?

  1. Location, location, location
  2. It was May (so much warmer than last year in St Albans – brrrr.)
  3. Dr. Claudia Imhoff gave the keynote! I love her new concept #XDW – the Extended Data Warehouse.
  4. Scott Ambler talked about Agile DW! It takes a Disciplined approach to be agile.
  5. Dan talked about DV 2.0 and Big Data.
  6. Sanjay showed us how he built a DV 2.0 platform on Hadoop.
  7. Multiple, real world case studies of DV 2.0 working in the wild around the globe.
  8. I gave two talks and showed models and code from one of my recent adventures.
  9. Five members of the Boulder BI Brain Trust (#BBBT) in attendance.
  10. We had multiple 30-minute networking sessions between the talks (who does that?). Plenty of time to ask questions and get to know each other.
  11. Three (count ’em 3!) global software vendors with off-the-shelf tools that support the automatic generation of DV 2.0 compliant components. Wow!
  12. BBQ dinner hosted by AnalytixDS. Yum!
  13. Crazy shirt day and contest.
  14. And did I mention three days of face-to-face networking with world-renowned experts? (I got to have lunch with Claudia Imhoff AND Scott Ambler at the same time – a once-in-a-lifetime opportunity.)
  15. Fresh German-style craft beer.
  16. Bavarian pastries from the in-house bakery.
  17. Did I mention the food?
  18. The view.
  19. The hiking. (Good to get outside and exercise after all those sessions.)
  20. The mountain biking (after the conference of course).

As if that was not enough, I was privileged to attend an exclusive workshop/mentoring/Q&A session with Dan the day before the event, where he told us about new, as yet unpublished DV 2.0 additions, explained in depth the zero-key concept, the right way to use hash keys, 3 stages of managed self-service BI, and a host of other topics and issues we all wanted feedback on. My brain was tired before the conference even started.

Hint: if you want to get invited to that special session next year, you need to get DV 2.0 certified ASAP. Keep an eye on LearnDataVault for Dan’s teaching and speaking schedule and locations or contact me about setting up a class if you can’t make one of his (I am an authorized DV 2.0 Bootcamp Instructor too).

Bummed out now that you missed all this great learning? Not like I did not warn you!

Well first, you can catch a lot of the action and a bunch of pictures by mining the Twitter stream for #WWDVC. But since I know you are all too busy (or lazy?), here it is for you:

Really wish you were there? Really?

You are in luck because Dan managed to record some of the best sessions on video! The videos and all the PowerPoint presentations are now available, for sale, on the Data Vault learning site. Just check out this offering: WWDVC 2015 Videos. In addition to the videos listed, you get all the other presentation materials from the speakers (including me).

Right now the cost is $499 (yup more than the conference but hey, no travel expense). Since you are a loyal reader of my blog, you can get a 20% discount off that by using the coupon code KENT10S during checkout.

Even without the discount, it is more than worth the money. The videos of Claudia’s keynote and Scott Ambler’s talk alone are worth that much. The videos are high quality and both of them are amazing speakers. (FYI – some of the videos are very long and may take a minute or two to load depending on your internet connection.)

So that is my short review of WWDVC 2015. Glad I was able to be a part of this great event!

Keep your eyes on http://wwdvc.com/ for the announcement of the 2016 event and the call for papers (which will open soon).

See you next year? (Somewhere near Stowe again)

Kent

The Data Warrior

P.S. Dan’s newest book that covers Data Vault 2.0 is now available for pre-order on Amazon. Get a preview of Dan’s new DV 2.0 book.

Better Data Modeling: 7 Differentiating Characteristics of Data Vault 2.0

Hard to believe that the 2nd Annual World Wide Data Vault Consortium (WWDVC15) is NEXT WEEK in beautiful Stowe, Vermont. It promises to be an excellent event. The speakers include myself, Claudia Imhoff, Dan Linstedt (the inventor of Data Vault), Scott Ambler, Roelant Vos, Dirk Lerner and many more. The focus will be DV 2.0, agile data warehousing, big data, NoSQL, virtualization and automation. Check out the agenda here: http://wwdvc.com/schedule/

So in preparation (and to encourage you to attend), I thought it might be good to review some of the important basics about Data Vault 2.0 and why it is an important evolution for the data warehousing community.

The approach started out under the official name Common Foundational Warehouse Modeling Architecture. Then it became more commonly known as the “Data Vault”, a modelling method for Data Warehouses. It also had a methodology with implementation guidelines and worked very, very well on relational platforms for many, many years (over 10 years for those who did not know).

But technology evolved. NoSQL architectures came into the picture primarily as sources. The Apache Hadoop platform started offering a cheaper storage and processing MPP architecture.

Data Vault evolved into Data Vault 2.0 and already has many successful implementations. The original Data Vault is now referred to as Data Vault 1.0 (or DV 1.0) and it primarily has a modelling focus. DV 2.0 on the other hand changes some things, and adds a LOT.

Data Vault 2.0 has the following 7 differentiating characteristics:

1. DV 2.0 is a complete system of Business Intelligence. It talks about everything from concept to delivery. While DV 1.0 had a major focus on modelling and many of the modelling concepts are similar, DV 2.0 goes a step further and talks about data from source to business user facing constructs with guidelines for implementation, agile, virtualization and more.

2. DV 2.0 can adapt to changes better than pretty much ANY other data warehouse architecture or framework. It can do it even better than DV 1.0 because of the change in design to adapt to NoSQL and MPP platforms, if needed. DV 2.0 has successfully been implemented on MPP RDBMS platforms like Teradata as well (ask Dan for details).

3. DV 2.0 is both “big data” and “NoSQL” ready. In fact, there are implementations where data is sourced in real-time from NoSQL databases with phenomenal success stories. One of these was presented at WWDVC 2014, where an organization saved lots of money by using this architecture.

A near real-time case study for absorbing data from MongoDB is being presented at WWDVC2015. It’s not to be missed.

4. DV 2.0 takes advantage of MPP style platforms and is designed with MPP in mind. While DV 1.0 also did this to an extent, DV 2.0 takes it to a whole other level with a zero-dependency type architecture. Of course, there are a few caveats you will need to learn.

5. DV 2.0 lets you tie structured and multi-structured data together (logically) so you can easily join data across environments. This particular aspect lets you build your Data Warehouse on multiple platforms while using the most appropriate storage platform for each particular data set. It lets you build a truly distributed Data Warehouse.

6. DV 2.0 has a greater focus on agility with principles of Disciplined Agile Delivery (DAD) embedded in the architecture and approach. Again, being agile was certainly possible with DV 1.0, but it wasn’t a part of the methodology. DV 2.0 is not just “agile ready”, it’s completely agile.

7. DV 2.0 has a very strong focus on both automation and virtualization as much as possible. There are already a couple of automation tools in the market that have Dan’s approval (just ask). Some of them will be at WWDVC15.

It’s real-time ready, cloud ready, NoSQL ready and big data friendly. And practitioners have already had success in all these areas (on real projects not just in the lab).
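For those who like to see an idea in code, here is a minimal sketch of one common way to read the "zero-dependency" point in characteristic 4: if every loader derives its key by hashing the business key, then hubs, links and satellites can be loaded in parallel without looking anything up in each other. Everything in it is my assumption for illustration only (the `hash_key` helper, the MD5 choice, the `||` delimiter, the trim-and-uppercase normalization, and the sample table and column names). It is not Dan's official specification, so check the DV 2.0 standards before you build on it.

```python
import hashlib


def hash_key(*business_key_parts, delimiter="||"):
    """Derive a deterministic key from one or more business key parts.

    Hypothetical helper: the normalization (trim + uppercase), the delimiter
    and the MD5 algorithm are illustrative assumptions, not a DV 2.0 rule.
    """
    normalized = delimiter.join(str(part).strip().upper() for part in business_key_parts)
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


# Each "loader" below computes its keys on its own, with no lookup against
# another table, so the three loads could run at the same time.
customer_hub_row = {"customer_hk": hash_key("c-1001"), "customer_id": "C-1001"}
customer_sat_row = {"customer_hk": hash_key(" C-1001 "), "name": "Example Customer"}
order_link_row = {
    "link_hk": hash_key("C-1001", "SO-42"),
    "customer_hk": hash_key("C-1001"),
    "order_hk": hash_key("SO-42"),
}

# The hub and the satellite agree on the key even though they never talked.
print(customer_hub_row["customer_hk"] == customer_sat_row["customer_hk"])
```

The design point is simply that the key is a pure function of the source data, so no load has to wait for another one to hand out a sequence number.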

And, as you’ll notice on the agenda, the focus at WWDVC15 will be Data Vault 2.0 with examples of sourcing it from MongoDB, with examples of virtualization (from me!), with examples of design mods (also one from me), with examples of Hadoop implementations and more. It’s not something you want to miss, and there’s hardly any time or seats left.

If you are coming, I look forward to seeing you and chatting about the world of DW/BI and agile. If you want to attend, grab one of the last seats over at http://wwdvc.com/#tile_registration  (if there are still seats left by the time you get this message).

See you soon!

Kent

The Data Warrior

P.S. After the conference, the next place you’ll hear about DV 2.0 is in Berlin. There is a bootcamp and certification starting on 16th June in Berlin, Germany. The details are here: http://www.doerffler.com/en/data-vault-training/data-vault-2-0-boot-camp-and-certification-berlin/

Are you ready to learn something new in 2015?

We all know the saying:

When the student is ready, the teacher will appear

My advice is to empty your cup, daily, so that when the teacher appears you will recognize them.

Unless we are humble in our hearts and in our spirit, we are not open to new things and to learning. The teacher we need may be standing right in front of us the entire time, and yet we may miss the opportunity to learn from them because we are not open to the possibility that they may have something to teach us.

Always be ready to learn.

Happy New Year!

Kent

The Data Warrior

P.S. If your organization is ready to learn how to better organize and understand its data, then let’s talk and see if I can help.

Say “Big Data” One More Time (I dare you!)

This is quick. Saw it on Twitter this morning and it is just too funny to not share:

Have a great day!
