The Data Warrior

Changing the world, one data model at a time. How can I help you?

Archive for the tag “Data Vault”

Better Data Modeling: What is #DataVault 2.0 and Why do I care?

Have you heard?

Dan Linstedt has just had his new book published on Data Vault 2.0. It is called Building a Scalable Data Warehouse with Data Vault 2.0. If you are at all into data warehousing and agile design, you need to get this book now. So click here and be done.

For those of you not sure what this DV 2.0 stuff is all about and why you might want to learn about it, I recently did a series of guest posts for Vertabelo to introduce folks to the concepts. In the series I walk you through some of the history of Data Vault and why we need a new way to do data warehousing. Then I get into the basics of modeling the Data Vault, the Business Vault, and finally building Information Marts from a Data Vault.

So you can find the posts here:

Data Vault 2.0 Modeling Basics

The Business Data Vault

Building an Information Mart with Your Data Vault

Once you have read these, I am sure you will want to go buy the new Data Vault 2.0 book and maybe sign up for some online training on DataVaultAlliance.com.

Model on!

Kent

The Data Warrior

P.S. If you want to catch up, you can still purchase the original Data Vault (1.0) modeling book Super Charge Your Data Warehouse. It is a great reference book to have on hand (you can get it on Kindle too). Might as well have the whole set.

P.P.S. I turned this series into a Kindle ebook for easier reference. You can find it on my Author Profile or just click on the book cover in the right sidebar above.

You asked for it! #DataVault 2.0 Boot Camp with The #DataWarrior in The Woodlands, Texas

Yes, it is on! This is your one chance this year to come to Texas and learn Data Vault.

WHEN: Tuesday, September 15, 2015 at 8:30 AM – Friday, September 18, 2015 at 12:00 PM (CDT)

WHERE: The Woodlands, TX

How Much?

Early Bird Price: If you purchase before August 20, 2015, you get the discounted rate of $2,695.00.

Standard Price: $2,895.00

This is a small, intimate venue, so the total number of attendees is strictly limited.

Event Details

Data Vault 2.0 Boot Camp & Private Certification, hosted and taught by me, The Data Warrior, in the beautiful Woodlands, Texas! (This is the same class as developed and taught by Dan Linstedt)

THIS IS A 3 DAY COURSE. The morning of the 4th day (Friday, the 18th) is set aside as a 2 hour exam session for those who wish to stay and take the exam on site. Alternatively, everyone will have the option to study for a week and then take the exam online.

This class is a 3 day intensive course that covers all aspects of the Data Vault 2.0 System of Business Intelligence. It teaches you how to be a general practitioner and enables you to implement Data Vault 2.0 successfully within your organization. CDVP2 (Certified Data Vault 2.0 Practitioner) certification is available privately for those who complete this course.

Prerequisite: This class will NOT cover DV basics, so you should have Data Vault 1.0 modeling certification or prior DV implementation experience. You should also have read the main Data Vault book – Super Charge Your Data Warehouse.

Here is the agenda for the class:

Day 1:

Business Justification:

– what, why

– where it fits

– who’s using it

Architecture:

– Systems architecture, loading architecture

– managed self service BI (intro)

– where does NoSQL fit?

– integrating with Hadoop

Day 2:

DV2.0 Methodology:

– Agility (SCRUM, PMP, Six Sigma, KPA/KPI), issues, drivers

– team management, project overview

– CMMI & DV2.0 Project

DV2.0 Modeling (review section)

– Basic components, standard structures

– reference tables

– PIT & Bridge constructs

Day 3:

DV2.0 Implementation

– Set Logic, Performance

– Data Distribution

– ETL Templates to follow

– Working SQL example code

Review & Q&A

– Open session for questions, answers

– white-boarding

– deep-dive into specific topics

Logistics:

There are many great hotels near the event center. Contact me if you need a recommendation.

The event will be held in a state-of-the-art conference room overlooking Lake Woodlands. There are plenty of restaurants and a Whole Foods within easy walking distance for lunch and happy hour (after class, of course).

If you are coming from out of town, your best option is to fly into Houston Bush Intercontinental Airport (IAH). It is an easy 20-30 minute drive up to The Woodlands.

So sign up now to reserve your spot via Data Vault 2.0 Boot Camp & Private Certification with The Data Warrior Tickets, The Woodlands | Eventbrite.

See you soon!

Kent

The Data Warrior

Do you want a Data Vault 2.0 Bootcamp in The Woodlands, Texas?

Survey time, peeps!

Simple Yes/No question:

Would you like me to hold a DV 2.0 Bootcamp (and private certification) west of the Mississippi?

If I set one up in “America’s Hometown” (really, it is even trademarked), The Woodlands, Texas (20 minutes north of Houston Bush Intercontinental Airport), would you come?

The beautiful Woodlands Waterway and our concert venue, The Pavilion. You can take a water taxi to a concert!

Since I am an authorized DV 2.0 Trainer, I figure it is time I actually teach a class. And why not in my hometown in Texas?

So what is in a DV 2.0 Bootcamp class?

Three days of intense training on all things DV (followed by a chance to become a Certified DV 2.0 Practitioner).  NOTE: If you are new to Data Vault, you must read the Super Charge book before attending the class.

This class covers what you need to know as a practitioner in the world of Data Warehousing and Business Intelligence.  This is our foundational course.  This is a 3 day (in person) course that covers end-to-end best practices.  Major topics for this class are:

  • Architecture – Including NoSQL, Big Data, Hybrid Systems and Relational stores
  • Methodology – Including CMMI, Six Sigma, Optimization, Automation, and Generation
  • Implementation – Including Performance and Tuning, Set Logic, ELT vs ETL, Parallelism
  • Modeling – Including replacing of surrogates with Hash Keys, data layout, data co-location
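To give a feel for the "replacing of surrogates with Hash Keys" item, here is a minimal sketch of the idea: instead of asking the database for the next sequence number, the key is computed from the business key itself, so the same business key always yields the same key on any platform. The function name `hash_key`, the `||` delimiter, the trim and upper-case normalization, and the choice of MD5 are all assumptions made for this illustration, not the official class material.

```python
import hashlib


def hash_key(*business_key_parts: str, delimiter: str = "||") -> str:
    """Derive a deterministic key from one or more business key parts.

    Assumed normalization for this sketch: trim whitespace, upper-case,
    then join the parts with a delimiter before hashing.
    """
    normalized = delimiter.join(part.strip().upper() for part in business_key_parts)
    # MD5 is used here only as an example hash function (an assumption).
    return hashlib.md5(normalized.encode("utf-8")).hexdigest().upper()


# The same business key, however it is formatted in the source,
# produces the same hub key without any lookup or sequence generator.
customer_key_a = hash_key(" cust-001 ")
customer_key_b = hash_key("CUST-001")
```

The delimiter matters for multi-part keys: without it, the parts `("AB", "C")` and `("A", "BC")` would be concatenated to the same string and collide.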

This class takes you through the why/what/how of Data Vault 2.0. It covers the business justifications first, then follows with the technical descriptions of the architecture, implementation, methodology, and modeling. Topics include reaching agility, practicing Six Sigma, measuring and optimizing at CMMI level 5, the KPAs and KPIs of Data Warehousing, and more.

We also discuss the use of Hadoop and NoSQL platforms alongside the relational world. The objective is to enrich your understanding of how and when to apply Big Data Solutions. The course finishes with descriptions of ETL and ELT design-time paradigms, including templates, best practices and working SQL. This class is a prerequisite for anyone wishing to achieve DV2.0 Certified Practitioner status.

When:

Late summer – early fall (depending on interest) of 2015

Cost:

TBD based on how much the space costs me here!

But will definitely be less than $3,000.

Where:

Someplace nice and central in The Woodlands within walking distance to great food and drink.

Apartments overlooking the waterway that flows to Lake Woodlands. A great natural setting. Not the usual suburban wasteland.

So who is in?

Please respond in the comments so I can tell if I should start setting something up.

Thanks.

Kent

The Data Warrior

P.S. Since you know I am into fitness, we have great options to exercise here too: miles of running trails and even kayaking on the waterway.

The Riva Boathouse, where you can rent single and double kayaks by the hour (look closely when you are here and you will find a picture of my son and me on the sign from when they first opened).

Better Data Modeling: 7 Differentiating Characteristics of Data Vault 2.0

Hard to believe that the 2nd Annual World Wide Data Vault Consortium (WWDVC15) is NEXT WEEK in beautiful Stowe, Vermont. It promises to be an excellent event. The speakers include myself, Claudia Imhoff, Dan Linstedt (the inventor of Data Vault), Scott Ambler, Roelant Vos, Dirk Lerner and many more. The focus will be DV 2.0, agile data warehousing, big data, NoSQL, virtualization and automation. Check out the agenda here: http://wwdvc.com/schedule/

So in preparation (and to encourage you to attend), I thought it might be good to review some of the important basics about Data Vault 2.0 and why it is an important evolution for the data warehousing community.

The approach started out with the official name Common Foundational Warehouse Modeling Architecture. It then became more commonly known as the “Data Vault”, a modelling method for data warehouses. It also had a methodology with implementation guidelines and worked very, very well on relational platforms for many, many years (over 10 years for those who did not know).

But technology evolved. NoSQL architectures came into the picture primarily as sources. The Apache Hadoop platform started offering a cheaper storage and processing MPP architecture.

Data Vault evolved into Data Vault 2.0 and already has many successful implementations. The original Data Vault is now referred to as Data Vault 1.0 (or DV 1.0) and it primarily has a modelling focus. DV 2.0, on the other hand, changes some things and adds a LOT.

Data Vault 2.0 has the following 7 differentiating characteristics:

1. DV 2.0 is a complete system of Business Intelligence. It talks about everything from concept to delivery. While DV 1.0 had a major focus on modelling and many of the modelling concepts are similar, DV 2.0 goes a step further and talks about data from source to business user facing constructs with guidelines for implementation, agile, virtualization and more.

2. DV 2.0 can adapt to changes better than pretty much ANY other data warehouse architecture or framework. It can do it even better than DV 1.0 because of the change in design to adapt to NoSQL and MPP platforms, if needed. DV 2.0 has successfully been implemented on MPP RDBMS platforms like Teradata as well (ask Dan for details).

3. DV 2.0 is both “big data” and “NoSQL” ready. In fact, there are implementations where data is sourced in real-time from NoSQL databases with phenomenal success stories. One of these was presented at the WWDVC 2014 where an organization saved lots of money by using this architecture.

A near real-time case study for absorbing data from MongoDB is being presented at WWDVC2015. It’s not to be missed.

4. DV 2.0 takes advantage of MPP style platforms and is designed with MPP in mind. While DV 1.0 also did this to an extent, DV 2.0 takes it to a completely different level with a zero-dependency type architecture. Of course, there are a few caveats you will need to learn.
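To make the "zero-dependency" idea a bit more concrete, here is a minimal sketch, assuming it refers to deriving every key from the source row itself so that no load has to wait on another load to look up a key. The table names, the sample rows, the helper names, and the MD5-based `hash_key` function are all hypothetical and only for illustration.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def hash_key(*parts: str) -> str:
    """Hypothetical key function: normalize, join, and hash the business keys."""
    normalized = "||".join(p.strip().upper() for p in parts)
    return hashlib.md5(normalized.encode("utf-8")).hexdigest().upper()


# Made-up staging rows carrying two business keys each.
source_rows = [
    {"customer_id": "C-100", "order_id": "O-9001"},
    {"customer_id": "C-100", "order_id": "O-9002"},
    {"customer_id": "C-200", "order_id": "O-9003"},
]


def stage_hub_customer(rows):
    return {hash_key(r["customer_id"]) for r in rows}


def stage_hub_order(rows):
    return {hash_key(r["order_id"]) for r in rows}


def stage_link_customer_order(rows):
    # The link computes its own key and its parent keys straight from the
    # business keys; it never reads the hub tables to find a surrogate.
    return {
        (
            hash_key(r["customer_id"], r["order_id"]),
            hash_key(r["customer_id"]),
            hash_key(r["order_id"]),
        )
        for r in rows
    }


# Because no step depends on another step's output, all three can run at once.
with ThreadPoolExecutor() as pool:
    futures = [
        pool.submit(fn, source_rows)
        for fn in (stage_hub_customer, stage_hub_order, stage_link_customer_order)
    ]
    hub_customer, hub_order, link = [f.result() for f in futures]
```

With sequence-generated surrogate keys, the link step would have to wait for both hub loads to finish and then look up their keys. Here the three steps only share the read-only source rows.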

5. DV 2.0 lets you easily tie structured and multi-structured data together (logically) so you can join data across environments. This lets you build your Data Warehouse on multiple platforms while using the most appropriate storage platform for each particular data set. It lets you build a truly distributed Data Warehouse.

6. DV 2.0 has a greater focus on agility with principles of Disciplined Agile Delivery (DAD) embedded in the architecture and approach. Again, being agile was certainly possible with DV 1.0, but it wasn’t a part of the methodology. DV 2.0 is not just “agile ready”, it’s completely agile.

7. DV 2.0 has a very strong focus on both automation and virtualization as much as possible. There are already a couple of automation tools on the market that have Dan’s approval (just ask). Some of them will be at WWDVC15.

It’s real-time ready, cloud ready, NoSQL ready and big data friendly. And practitioners have already had success in all these areas (on real projects not just in the lab).

And, as you’ll notice on the agenda, the focus at WWDVC15 will be Data Vault 2.0 with examples of sourcing it from MongoDB, with examples of virtualization (from me!), with examples of design mods (also one from me), with examples of Hadoop implementations and more. It’s not something you want to miss, and there’s hardly any time or seats left.

If you are coming, I look forward to seeing you and chatting about the world of DW/BI and agile. If you want to attend, grab one of the last seats over at http://wwdvc.com/#tile_registration  (if there are still seats left by the time you get this message).

See you soon!

Kent

The Data Warrior

P.S. After the conference, the next place you’ll hear about DV 2.0 is in Berlin. There is a bootcamp and certification starting on 16th June in Berlin, Germany. The details are here: http://www.doerffler.com/en/data-vault-training/data-vault-2-0-boot-camp-and-certification-berlin/

Quote of the Day: We cannot direct the wind…

We cannot direct the wind…

But we can adjust the sails!

Are you heading where you want to, or letting the wind blow you about?

Take the helm of your life and set a course.

Happy Monday!

Kent
The Data Warrior

P.S. It is almost time for the 2nd Annual World Wide Data Vault Consortium (WWDVC) in lovely Stowe, Vermont! If you do DV or want to learn more about DV, this is THE place to do it. Cost is CRAZY LOW for the content and contacts you will make. There are only a few seats left so sail on over and register at http://wwdvc.com/ before they sell out.
