The Data Warrior

Changing the world, one data model at a time. How can I help you?

Archive for the tag “@KentGraziano”

#DataWarrior 2015 in review

Happy New Year again everybody!

Shoutout to Jeff Smith for again being the #1 real person that I actually know who referred people to my blog via his blog. If you don’t already, please add ThatJeffSmith to your reading list.

The WordPress.com stats helper monkeys prepared a 2015 annual report for me.

Here’s an excerpt:

Madison Square Garden can seat 20,000 people for a concert. This blog was viewed about 69,000 times in 2015. If it were a concert at Madison Square Garden, it would take about 3 sold-out performances for that many people to see it.

Click here to see the complete report.

Cheers!

Kent

The Data Warrior and Snowflake Evangelist

Data Warrior Agenda for 2016

Hard to believe 2015 is almost over.

It was a very busy year for me:

All of that has entailed a lot of air miles! This year I have visited:

  1. Denver (several times!)
  2. Salida, Colorado
  3. Hollywood, Florida
  4. Raleigh, NC
  5. Charlotte, NC (thanks to Lynn Winterboer for that one!)
  6. San Francisco
  7. Redwood City, California (Oracle HQ)
  8. Austin (drove this one)
  9. Minneapolis/St. Paul (thanks to Redpill Analytics mostly)
  10. Kansas City, Missouri
  11. Portland, Maine
  12. St Albans, Vermont
  13. Stowe, Vermont
  14. San Mateo, California (HQ for Snowflake Computing)

And that was just work related! Family trips took me to:

  1. Galveston (beach!)
  2. South Padre Island, TX (more beach!)
  3. Road trip to Central NY:
    1. Joplin, Missouri
    2. Hannibal, Missouri (Mark Twain museum)
    3. Chicago (to see robots at the Museum of Science and Industry)
    4. Sandusky, Ohio (just to sleep)
    5. Fulton, NY (to see my dad)
    6. Old Forge, NY (summer vacation in the mountains!)
    7. Huntsville, Alabama (NASA Rocket Center!)
  4. Who knows – the year is not over yet!

Speaking in 2016

2016 will be very busy with the new job for sure. I am already booked for a bunch of events. Here they are so far:

Data Day Texas – January 16 in Austin, TX

TDWI Webinar – Demystifying Elastic Data Warehousing (with Philip Russom) – January 26th

BIWA Summit – January 26-28 at Oracle HQ

RMOUG Training Days 2016 – February 9-11 in Denver, CO (I have a 2-hour deep dive on Feb 9th). Register early for discounts.

Enterprise Data World – April 17-22 in San Diego. Register early for discounts (by the end of the year for the best rate).

ODTUG KScope16 – June 26-30 in Chicago, IL. Register early and be sure to book the hotel!

Also likely speaking at World Wide Data Vault Consortium (WWDVC) – May 25-28 in Stowe, Vermont (TBD)

And many more to come! (Watch my Twitter feed for updates.)

Hopefully I will see you at one or more of these events!

Wishing you a safe and joyous holiday season!

Merry Christmas & Happy New Year!

Kent

The Data Warrior


 

Tech Tip: Connect to Snowflake db using #SQLDevModeler

So, some of you may have noticed that I took a “real” job this week. I am now the Senior Technical Evangelist for a cool startup company called Snowflake Computing.

Basically we provide a data warehouse database as a service in the cloud.

Pretty cool stuff. (If you want to know more, check out our site at snowflake.net)

I will talk more about the coolness of Snowflake (pun intended) in the future, but for now I just want to show you how easy it is to connect to.

Of course, the first thing I want to do when I meet a new database is see if I can connect my favorite data modeling tool, Oracle SQL Developer Data Modeler (SDDM), to it and reverse engineer some tables.

The folks here told me that tools like Informatica, MicroStrategy, and Tableau connect just fine using either JDBC or ODBC, and that since we are ANSI SQL compliant, there should be no problem.

And they were right. It was almost as easy as connecting to Oracle but it was WAY easier than connecting to SQL Server.

First you need a login to a Snowflake database. No problem here. Since I am an employee, I do get a login. Check.

We have both a web UI and a desktop command line tool. Turned out I needed the command line tool, which incidentally needed our Snowflake JDBC connector to work. I followed the Snowflake documentation and downloaded the JDBC driver (to my new Mac!). Piece of cake.

So connecting from SDDM is really easy. First add the 3rd party JDBC driver in preferences: Preferences -> Data Modeler -> Third Party JDBC Driver (press the green + sign, then browse to the driver).

Add JDBC Driver

As you can see our JDBC driver is conveniently named snowflake_jdbc.jar.

Next step is to configure the database connection. To do this you go to File -> Import -> Data Dictionary, then add a new connection in the wizard.

Configure Connection

Give it a name and login information, then go to the JDBC tab.

So getting the URL was the trick (for me anyway). Luckily the command line tool displayed the URL when I launched it in a terminal window, so I just copied it from there (totally wild guess on my part).

So the URL (for future reference) is:

jdbc:snowflake://sfcsandbox.snowflakecomputing.com:443/?account=<service name>&user=<account>&ssl=on

Where account is whatever you named your account in Snowflake (once you have one of your very own that is).
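If you would rather assemble that URL than copy it by hand, here is a minimal sketch in Python. It only does string building in the shape shown above. The function name and the sample account and user values are made up for illustration; the host, port, and `ssl=on` setting are taken straight from the URL in this post.

```python
from urllib.parse import urlencode


def build_snowflake_jdbc_url(host, account, user, port=443, ssl="on"):
    """Assemble a JDBC URL in the shape shown above.

    Every value is supplied by the caller; nothing is looked up
    from Snowflake itself.
    """
    # urlencode escapes any characters that are not safe in a query string.
    query = urlencode({"account": account, "user": user, "ssl": ssl})
    return f"jdbc:snowflake://{host}:{port}/?{query}"


# Hypothetical account and user names, for illustration only.
url = build_snowflake_jdbc_url(
    host="sfcsandbox.snowflakecomputing.com",
    account="my_account",
    user="my_user",
)
print(url)
```

Paste the printed string into the JDBC URL field on the JDBC tab, swapping in your own values.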

The driver class was a little trickier – I had to read our documentation! Thankfully it is very good and has an entire section on how to connect using JDBC. In there I found the driver class name:

com.snowflake.client.jdbc.SnowflakeDriver

That was it.

I pushed the Test button and success!
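The same three pieces (driver jar, URL, and driver class) should be all that any generic JDBC client needs, not just SDDM. As a rough sketch, here is how they might fit together from Python. This assumes the third-party JayDeBeApi package and a Java runtime are installed, neither of which is part of my setup above. The jar location, account, and user names are placeholders, and `try_connect` is a helper name I made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JdbcSettings:
    """The three pieces SDDM asks for, plus the login name."""
    driver_jar: str    # path to the downloaded driver jar
    driver_class: str  # class name from the Snowflake JDBC documentation
    url: str           # JDBC URL, as copied from the command line tool
    user: str


def try_connect(settings, password):
    """Open and close a connection; raises if the driver cannot connect.

    Assumption: the third-party JayDeBeApi package is installed.
    """
    import jaydebeapi  # imported lazily so the settings can be built without it

    conn = jaydebeapi.connect(
        settings.driver_class,
        settings.url,
        [settings.user, password],
        settings.driver_jar,
    )
    conn.close()
    return True


settings = JdbcSettings(
    driver_jar="snowflake_jdbc.jar",  # hypothetical location: current directory
    driver_class="com.snowflake.client.jdbc.SnowflakeDriver",
    url=(
        "jdbc:snowflake://sfcsandbox.snowflakecomputing.com:443/"
        "?account=my_account&user=my_user&ssl=on"  # placeholder names
    ),
    user="my_user",  # hypothetical login
)
print(settings.driver_class)
```

Calling `try_connect(settings, password)` with a real password is the scripted equivalent of pushing the Test button. I have only done this through SDDM, so treat the Python route as untested against Snowflake.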

Now to really test it, I did the typical reverse engineer and was able to see the demo schema and tables and brought them all in.

Snowflake Schema

Demo schema in Snowflake (no, not a snowflake schema!)

So I call that a win.

Not a bad week’s work really:

  1. New job orientation
  2. Start learning a new tech and the “cloud”
  3. Got logged in
  4. Installed SDDM on a Mac for the 1st time ever!
  5. Configured to speak to an “alien” database
  6. Successfully reverse engineered a schema
  7. Blog about it.

So that was my 1st week as a Senior Technical Evangelist.

TGIF!

Kent

still, The Data Warrior

P.S. If you want to see more about my week, just check my Twitter stream and start following @SnowflakeDB too.

 

 

The Hills were Alive with the Sounds of #DataVault

Yes folks, a few weeks back we held the 2nd Annual World Wide Data Vault Consortium (#WWDVC) at the lovely Trapp Family Lodge outside Stowe, Vermont. What a great venue! Beautiful scenery, near perfect weather, great food, and great beer (they have their own brewery). Standing on the hillside, it is easy to see why the Von Trapp Family Singers (you know, from “The Sound of Music”) decided to settle here to build their new life in America.

What a view!


Of course the learning and networking were outstanding again. This year was even better than last year.

Why?

  1. Location, location, location
  2. It was May (so much warmer than last year in St Albans – brrrr.)
  3. Dr. Claudia Imhoff gave the keynote! I love her new concept #XDW – the Extended Data Warehouse.
  4. Scott Ambler talked about Agile DW! It takes a Disciplined approach to be agile.
  5. Dan talked about DV 2.0 and Big Data.
  6. Sanjay showed us how he built a DV 2.0 platform on Hadoop.
  7. Multiple, real world case studies of DV 2.0 working in the wild around the globe.
  8. I gave two talks and showed models and code from one of my recent adventures.
  9. Five members of the Boulder BI Brain Trust (#BBBT) in attendance.
  10. We had multiple 30-minute networking sessions between the talks (who does that?). Plenty of time to ask questions and get to know each other.
  11. Three (count ’em 3!) global software vendors with off the shelf tools that support the automatic generation of DV 2.0 compliant components. Wow!
  12. BBQ dinner hosted by AnalytixDS. Yum!
  13. Crazy shirt day and contest.
  14. And did I mention three days of face-to-face networking with world-renowned experts? (I got to have lunch with Claudia Imhoff AND Scott Ambler at the same time – a once in a lifetime opportunity)
  15. Fresh German-style craft beer.
  16. Bavarian pastries from the in house bakery.
  17. Did I mention the food?
  18. The view.
  19. The hiking. (Good to get outside and exercise after all those sessions.)
  20. The mountain biking (after the conference of course).

As if that was not enough, I was privileged to attend an exclusive workshop/mentoring/Q&A session with Dan the day before the event, where he told us about new, as yet unpublished DV 2.0 additions, explained in depth the zero-key concept, the right way to use hash keys, 3 stages of managed self-service BI, and a host of other topics and issues we all wanted feedback on. My brain was tired before the conference even started.

Hint: if you want to get invited to that special session next year, you need to get DV 2.0 certified ASAP. Keep an eye on LearnDataVault for Dan’s teaching and speaking schedule and locations or contact me about setting up a class if you can’t make one of his (I am an authorized DV 2.0 Bootcamp Instructor too).

Bummed out now that you missed all this great learning? Not like I did not warn you!

Well first, you can catch a lot of the action and a bunch of pictures by mining the Twitter stream for #WWDVC. But since I know you are all too busy (or lazy?), here it is for you:

Really wish you were there? Really?

You are in luck because Dan managed to record some of the best sessions on video! The videos and all the PowerPoint presentations are now available, for sale, on the Data Vault learning site. Just check out this offering: WWDVC 2015 Videos. In addition to the videos listed, you get all the other presentation materials from the speakers (including me).

Right now the cost is $499 (yup more than the conference but hey, no travel expense). Since you are a loyal reader of my blog, you can get a 20% discount off that by using the coupon code KENT10S during checkout.

Even without the discount, it is more than worth the money. The videos of Claudia’s keynote and Scott Ambler’s talk are worth that much alone. The videos are high quality and both of them are amazing speakers. (FYI – some of the videos are very long and may take a minute or two to load depending on your internet connection.)

So that is my short review of WWDVC 2015. Glad I was able to be a part of this great event!

Keep your eyes on http://wwdvc.com/ for the announcement of the 2016 event and the call for papers (which will open soon).

See you next year? (Somewhere near Stowe again)

Kent

The Data Warrior

P.S. Dan’s newest book that covers Data Vault 2.0 is now available for pre-order on Amazon. Get a preview of Dan’s new DV 2.0 book.

Better Data Modeling: 7 Differentiating Characteristics of Data Vault 2.0

Hard to believe that the 2nd Annual World Wide Data Vault Consortium (WWDVC15) is NEXT WEEK in beautiful Stowe, Vermont. It promises to be an excellent event. The speakers include myself, Claudia Imhoff, Dan Linstedt (the inventor of Data Vault), Scott Ambler, Roelant Vos, Dirk Lerner and many more. The focus will be DV 2.0, agile data warehousing, big data, NoSQL, virtualization and automation. Check out the agenda here: http://wwdvc.com/schedule/

So in preparation (and to encourage you to attend), I thought it might be good to review some of the important basics about Data Vault 2.0 and why it is an important evolution for the data warehousing community.

The approach started out with the official name Common Foundational Warehouse Modeling Architecture. Then it became more commonly known as the “Data Vault” and became a modelling method for Data Warehouses. It also had a methodology with implementation guidelines and worked very, very well on relational platforms for many, many years (over 10 years for those who did not know).

But technology evolved. NoSQL architectures came into the picture primarily as sources. The Apache Hadoop platform started offering a cheaper storage and processing MPP architecture.

Data Vault evolved into Data Vault 2.0 and already has many successful implementations. The original Data Vault is now referred to as Data Vault 1.0 (or DV 1.0) and it primarily has a modelling focus. DV 2.0 on the other hand changes some things, and adds a LOT.

Data Vault 2.0 has the following 7 differing characteristics:

1. DV 2.0 is a complete system of Business Intelligence. It talks about everything from concept to delivery. While DV 1.0 had a major focus on modelling and many of the modelling concepts are similar, DV 2.0 goes a step further and talks about data from source to business user facing constructs with guidelines for implementation, agile, virtualization and more.

2. DV 2.0 can adapt to changes better than pretty much ANY other data warehouse architecture or framework. It can do it even better than DV 1.0 because of the change in design to adapt to NoSQL and MPP platforms, if needed. DV 2.0 has successfully been implemented on MPP RDBMS platforms like Teradata as well (ask Dan for details).

3. DV 2.0 is both “big data” and “NoSQL” ready. In fact, there are implementations where data is sourced in real-time from NoSQL databases with phenomenal success stories. One of these was presented at the WWDVC 2014 where an organization saved lots of money by using this architecture.

A near real-time case study for absorbing data from MongoDB is being presented at WWDVC2015. It’s not to be missed.

4. DV 2.0 takes advantage of MPP style platforms and is designed with MPP in mind. While DV 1.0 also did this to an extent, DV 2.0 takes it to a completely other level with a zero-dependency type architecture. Of course, there are a few caveats you will need to learn.

5. DV 2.0 lets you easily tie structured and multi-structured data together (logically) where you can join data across environments easily. This particular aspect lets you build your Data Warehouse on multiple platforms while using the most appropriate storage platform to the particular data set. It lets you build a truly distributed Data Warehouse.

6. DV 2.0 has a greater focus on agility with principles of Disciplined Agile Delivery (DAD) embedded in the architecture and approach. Again, being agile was certainly possible with DV 1.0, but it wasn’t a part of the methodology. DV 2.0 is not just “agile ready”, it’s completely agile.

7. DV 2.0 has a very strong focus on both automation and virtualization as much as possible. There are already a couple of automation tools in the market that have Dan’s approval (just ask). Some of them will be at WWDVC15.

It’s real-time ready, cloud ready, NoSQL ready and big data friendly. And practitioners have already had success in all these areas (on real projects not just in the lab).

And, as you’ll notice on the agenda, the focus at WWDVC15 will be Data Vault 2.0 with examples of sourcing it from MongoDB, with examples of virtualization (from me!), with examples of design mods (also one from me), with examples of Hadoop implementations and more. It’s not something you want to miss, and there’s hardly any time or seats left.

If you are coming, I look forward to seeing you and chatting about the world of DW/BI and agile. If you want to attend, grab one of the last seats over at http://wwdvc.com/#tile_registration (if there are still seats left by the time you get this message).

See you soon!

Kent

The Data Warrior

P.S. After the conference, the next place you’ll hear about DV 2.0 is in Berlin. There is a bootcamp and certification starting on 16th June in Berlin, Germany. The details are here: http://www.doerffler.com/en/data-vault-training/data-vault-2-0-boot-camp-and-certification-berlin/
