The Data Warrior

Changing the world, one data model at a time. How can I help you?

Archive for the tag “data warehousing”

Best #DataVault Event Ever!

Hard to believe it is less than 25 days to WWDVC 2016. I’m going. Are you?

If not, why not?

The 3rd Annual World Wide Data Vault Consortium is going to be epic!

What is the WWDVC?

It’s the only conference that focuses on everything happening in the Data Vault world, and this time, the keynote speaker is the father of Data Warehousing himself, Bill Inmon. He created the industry in which we earn our daily bread.

The sessions start prior to the conference with a closed-room brainstorming meeting for the Data Vault certified. I will be there for sure to talk about Data Vault in the Cloud and, of course, experiences in virtualizing layers of the Data Vault architecture. I can hardly wait to discuss what I have learned this year with the other Data Vault brainiacs.

This is followed by a day of hands-on workshops:

  • How to build a DV, a Data Mart, and End User Analytics from Scratch in 3 hours

  • Extend the power of the Data Vault to a real-time use case using a Spark based Lambda architecture

  • Generate code using the award winning Mapping Manager tool

And, these are all before the conference even starts.

There’s a lot more during the conference including the famous networking sessions which this event is known for. If you’ve already booked your seat, see you there. If not, what are you waiting for?

For something so valuable, you would expect the kind of hefty price tag the vendors charge.

Nope, it’s dirt cheap.

Check it out for yourself here:

WWDVC Registration

It’s a conference for Data Warehouse people with a Data Vault focus. Folks like you. It also has plenty for the business person, such as selling the Data Vault concept to the business owner (Peter Aiken), case studies including an implementation of a Near Real Time DV 2.0 in the cloud, a customer-centric analytics case study, and more.

Give Your Brain a Treat

Some of the smartest people I know in Data Warehousing have been in attendance in the past. This is a once-a-year chance to meet them in person, increase your network, and have a great time.

This time it appears to have been kicked up a notch with the likes of John Giles (Universal Data Vault), Peter Aiken (In one instance their DV 2.0 helped save a client a whopping $25M/year), Bill Inmon (The father of the DW), Dale Anderson (Used the DV to build one of the world’s first DaaS).

There are also the repeat attendees like Mary Mink and Sam Bendayan (Used DV 2.0 to implement an NRT DW sourced from a NoSQL platform serving 3000+ customers), Michael Olschimke (Co-author of the new DV 2.0 book), Roelant Vos (Not only did he build his own automation, he likes to experiment … with architecture).

And … vendors of automation tools. Several of the world’s leading Data Warehouse Automation tools will not only be there, they are sponsoring (and buying dinner and drinks!).

And as if that is not enough, you will get to rub elbows with Dan Linstedt, the inventor of Data Vault, Sanjay (co-founder of LearnDataVault.com), and of course me, Kent, The Data Warrior.

So what are you waiting for? Go buy your ticket to the conference, book your room, and figure out how you will get there (hitchhike if you have to).

See you in Vermont!

Kent

The Data Warrior

P.S. Don’t forget Saturday at WWDVC is crazy shirt day.

P.P.S. The company I work for, Snowflake Computing, is a sponsor and is giving me a GoPro! to raffle off, so, don’t miss my talk on Thursday if you want in on the raffle.

 

 


A Snow Storm of Snowflake Webinars

Good Monday Morning!

Been itching to learn more about the Snowflake Elastic Data Warehouse? Well, now is your chance.

Over the next two weeks we have a bunch of great webinars coming up, so I figured I should just give you an easy list to review with links to sign up. Here it is:

WEBINAR #1

Wednesday, 04/27/2016 10am PT 

CapSpecialty: Leveraging data to deliver faster business results linked to Key Performance Indicators 

Abstract:

CapSpecialty is upping its game to become the preferred provider of specialty insurance products using MicroStrategy Analytics and Snowflake Cloud Data Warehousing.

Featured partner: MicroStrategy

Hosted by: MicroStrategy

Featured Customer: CapSpecialty

Register here! 

WEBINAR #2

Wednesday, 04/27/2016 11am PT

4 Big Data Strategies You Can’t Go Without 

Abstract:

You’ve got questions about big data, our panel has answers.

When it comes to customer relationships, big data can usher in big opportunities or big problems. That’s why it’s vital for organizations to take a strategic approach to big data. 

They must clean their data, integrate it, maximize data value, comply with security and governance requirements, and make sure the right people have the right access to the data at the right times.

Media partner: CRM Magazine

Hosted by: CRM Magazine 

Featured partner: Informatica + Looker

Featured use case: Pitney Bowes

Featured presenter: Kent Graziano, The Data Warrior

Register here! 

WEBINAR #3

Thursday, 04/28/2016 10am PT

Using the Cloud For Speed-of-Thought Analytics on All Your Data

1.5 TB of data per day? No problem! Learn how Ask.com turned to Snowflake’s cloud-native data warehouse combined with Tableau’s data visualization solution to address their challenges.

Featured partner: Tableau

Hosted by: Snowflake

Featured Customer: Ask.com

Featured presenter: Jon Bock, VP Product and Marketing, Snowflake

Register here!

WEBINAR #4

Thursday, 05/05/2016  10am PT 

The Right Choice: Why Spark + a Cloud Data Warehouse = Success 

The first rule of data analytics for fast-growing companies? Measure all things. When putting in place a robust data analytics strategy to go from measurement to insight, you’ve got lots of options for tools — from databases and data warehouse options to new “big data” tools such as Hadoop, Spark, and their related components. But tools are nothing if you don’t know how to put them to use. 

Media partner: VentureBeat

Hosted by: VentureBeat

Featured Customer: Celtra

Featured presenter: Jon Bock, VP Product and Marketing, Snowflake

Register here!

The Data Warrior Live in Chicago!

Later this week on Thursday April 28th, I will be speaking about Data Vault and Agile Data Engineering at a special Snowflake half-day workshop in downtown Chicago. You can sign up for that here.

So, no excuse for not learning more about Snowflake in the coming weeks. Sign up for one or more of these events today.

Have a good week!

Kent

The Data Warrior

4 Keys to Succeeding with Agile Data Warehousing in 2016

I have been out giving talks again on using agile methods for data warehouse and business intelligence projects, so I thought it was time for me to share my thoughts about the 4 key elements you need to be successful with an Agile DW project in 2016.

Adopt an Agile Methodology

By this I am talking about SCRUM, Kanban, ScrumBan, or DAD (Disciplined Agile Delivery), among others.

Go read the blogs, read the books, study these methods. Attend a conference (like Agile Tech in April). Figure out what will work for your organization’s culture and leverage the skills of your staff. One size does not fit all.

In past engagements I have used approaches primarily based on SCRUM and Kanban. Both have been very effective once we got our processes down.

If you need/want help, find a good agile coach.

Use an Agile Data Engineering Approach

If you want to develop your data warehouse in an agile, iterative manner, then you need a way to design your EDW repository that lends itself to this approach without causing huge re-engineering pains (known as refactoring) in future iterations.

The best way I have found is using the Data Vault modeling approach. It was designed specifically for building data warehouses in this manner. I have written much about this approach and give many talks showing examples of successful agile projects using Data Vault. And there is plenty of material available to help you learn how to do it (see the books on the sidebar of this blog).

Also keep an eye on Dan Linstedt’s twitter feed and blog for his training classes.

Use Data Warehouse Automation Software

There is no better way to get agile and deliver results fast than to automate as much of your development work as possible. If you use repeatable patterns (like Data Vault) in your design methodology, then it is even easier to automate and greatly reduce your time to market.

There are two vendors in the market that I like a lot and have had some experience with. They are WhereScape and AnalytixDS. And both support not only “traditional” approaches to data warehousing (like automating the ETL for a Type 2 Slowly Changing Dimension) but they both also support Data Vault (and both will be at WWDVC 2016).

Which of these tools you might use depends on your approach, your current tools, and your skills.

If you are coming from a more traditional DW paradigm and use ETL tools like Informatica, Talend, or DataStage, then I would recommend you look at AnalytixDS Mapping Manager which allows you to generate your ETL code from source to target mappings.

If you are just getting started or are committed to more of a database-centric approach and want your ETL or ELT code to run in the database, then look at WhereScape’s products.

Both are great companies with knowledgeable people and happy customers.

Your third option is to write your own automation routines. There are many shops doing that as well. Just be sure you have the appropriate skills in house and can allocate the upfront time to get going (a month or so at least).
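To give a feel for what a home-grown automation routine can look like, here is a minimal sketch in Python that generates hub table DDL from a small piece of metadata. It is only an illustration of the repeatable-pattern idea, not what any of the tools above do: the `HubSpec` class, the `hub_ddl` function, the column names (`_hk`, `load_dts`, `record_source`), and the data types are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HubSpec:
    """Hypothetical metadata describing one Data Vault hub."""
    name: str                       # business concept, e.g. "customer"
    business_key: str               # column holding the natural key
    key_type: str = "VARCHAR(50)"   # assumed type for the business key


def hub_ddl(spec: HubSpec) -> str:
    """Render CREATE TABLE DDL for one hub from the shared pattern."""
    table = f"hub_{spec.name}"
    columns = [
        f"{spec.name}_hk CHAR(32) NOT NULL",              # hash of the business key
        f"{spec.business_key} {spec.key_type} NOT NULL",  # the business key itself
        "load_dts TIMESTAMP NOT NULL",                    # when the row was loaded
        "record_source VARCHAR(100) NOT NULL",            # where the row came from
    ]
    body = ",\n    ".join(columns)
    return (
        f"CREATE TABLE {table} (\n    {body},\n"
        f"    PRIMARY KEY ({spec.name}_hk)\n);"
    )


# Every hub follows the same pattern, so adding one is a one-line change.
specs = [HubSpec("customer", "customer_number"), HubSpec("product", "sku")]
ddl_statements = [hub_ddl(s) for s in specs]
print("\n\n".join(ddl_statements))
```

Because every hub has the same shape, the metadata list is the only thing that changes from one iteration to the next. A real routine would also need to cover links, satellites, and the load code, which is where the upfront time goes.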

Deploy on an Agile Data Warehouse Platform

So now that I have learned about Elastic Data Warehousing in the cloud, I can’t imagine trying to do an agile DW project any other way.

Of course I am referring to Snowflake Computing’s DWaaS (data warehouse as a service) offering. Yes, I might be a bit biased since I do work for them now, but…this tech is really good!

From a features perspective, what I am talking about is having a high powered, easily scalable database that supports BI and analytic workloads and does not require a ton of time to configure and tweak.

Why do I think that is a success criterion? Because I have spent way too many months on way too many “agile” projects waiting to get access to the hardware! Or I get access and we either run out of space (e.g., “we had no idea you need THAT much storage”) or we can’t properly test production level loads and queries because the development box does not have enough horsepower.

Taking advantage of the elasticity of the cloud solves both of these problems and the folks at Snowflake have successfully built an RDBMS in the cloud that specifically harnesses these features and leverages them for data warehouse and analytic workloads by providing the ability to scale up and scale down both storage and compute resources on demand.

That, and its many other features, gives me the infrastructure I need to get an agile data warehouse project off the ground almost instantly. And I can do a Data Vault on Snowflake too.

Very cool.

So what do you think? Are you ready to accelerate your team’s performance and adopt an agile approach to data warehousing?

I hope this post gives you a few ideas on how to make that happen.

Model on!

Kent

The Data Warrior

 

Better Data Modeling: Customizing Oracle SQL Developer Data Modeler (#SQLDevModeler) to Support Custom Data Types

On a recent customer call (for Snowflake), the data architects were asking if Snowflake provided a data model diagramming tool to design and generate data warehouse tables or to view a data model of an existing Snowflake data warehouse. Or if we knew of any that would work with Snowflake.

Well, we do not provide one of our own – our service is the Snowflake Elastic Data Warehouse (#ElasticDW).

The good news is that there are data modeling tools in the broader ecosystem that you can of course use (since we are ANSI SQL compliant).

If you have read my previous posts on using JSON within Snowflake, you also know that we have a new data type called VARIANT for storing semi-structured data like JSON, Avro, and XML.

In this post I will bring it together and show you the steps to customize SDDM to allow you to model and generate table DDL that contains columns that use the VARIANT data type.

Read the details of how I did it here on my Snowflake blog:

Snowflake SQL: Customizing Oracle Sql Developer Data Modeler (SDDM) to Support Snowflake VARIANT – Snowflake

Enjoy!

Kent

The Data Warrior

P.S. If you are in Austin, Texas this weekend, I will be speaking at Data Day Texas (#DDTX16). Snowflake will have a booth there too, so come on by and say howdy!

Snowflake SQL: Making Schema-on-Read a Reality (Part 1) 

This is my 1st official post on the Snowflake blog in my new role as their Technical Evangelist. It discusses getting results from semi-structured JSON data using our extensions to ANSI SQL.

Schema? I don’t need no stinking schema!

Over the last several years, I have heard the phrase “schema-on-read” used to explain the benefit of loading semi-structured data into a Big Data platform like Hadoop. The idea being you could delay data modeling and schema design until long after the data was loaded (so as to not slow down getting your data while waiting for those darn data modelers).

Every time I heard it, I thought (and sometimes said) – “but that implies there is a knowable schema.”  So really you are just delaying the inevitable need to understand the structure in order to derive some business value from that data. Pay me now or pay me later.

Why delay the pain?
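A tiny illustration of that point, sketched in Python rather than Snowflake SQL: the records below load with no schema declared, but the code that reads them still has to know which fields exist and where they are nested. The sample records and the `read_event` function are hypothetical, made up for this example.

```python
import json

# Hypothetical semi-structured records, loaded as-is with no schema declared.
raw_events = [
    '{"user": {"id": 1, "name": "Ann"}, "action": "login"}',
    '{"user": {"id": 2}, "action": "purchase", "amount": 19.99}',
]


def read_event(raw: str) -> dict:
    """Apply the schema at read time: pick expected fields, default the optional ones."""
    doc = json.loads(raw)
    return {
        "user_id": doc["user"]["id"],         # required: a missing key raises KeyError
        "user_name": doc["user"].get("name"), # optional: None when absent
        "action": doc["action"],
        "amount": doc.get("amount", 0.0),     # optional: defaults to 0.0
    }


rows = [read_event(r) for r in raw_events]
for row in rows:
    print(row)
```

Nothing had to be modeled before loading, yet `read_event` encodes exactly the structure a data modeler would have written down up front. The schema work was moved, not removed.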

Check out the rest of the post here:

Snowflake SQL: Making Schema-on-Read a Reality (Part 1) – Snowflake

Enjoy!

Kent

The Data Warrior
