The Data Warrior

Changing the world, one data model at a time. How can I help you?

Archive for the tag “Data Warehouse”

Cloud Data Warehousing: Is it for real?

 

Our industry is full of hype and hyped terms.

Big Data. NoSQL. The Cloud. Self-service <whatever>.

And Cloud Data Warehousing.

Some of the offerings and solutions are real. Some less so.

Newest on the scene is cloud data warehousing (or data warehousing in the cloud). As with all new tech, there are a variety of offerings out there with different characteristics. To help folks understand the space a bit more, the company I work for (Snowflake Computing) put together a (hopefully) hype-free, vendor-agnostic book on the topic called Cloud Data Warehousing for Dummies, which I blogged about last month. If you have not already gotten a copy and read it, I encourage you to do so soon. I think you will find it very helpful in the coming months as this topic heats up.

It is where data warehousing is going. Period.

But is Cloud Data Warehousing really for real?

I may be biased here (okay, likely), but based on my experience working with Snowflake for over a year now, I have to say yes. Emphatically, yes!

Cloud Data Warehousing is real. It can handle real data and real workloads. To the tune of hundreds of terabytes and even petabytes of structured, and semi-structured, data, all for a fraction of the cost of traditional on-premises data warehouse solutions, and with the ease of administration you expect from a cloud-based SaaS solution.

But, as they say, the proof is in the pudding!

So here are a few proof-points for you from real, live customers, who have been using Snowflake to improve their business outcomes.

AthenaHealth

AthenaHealth is a leading healthcare services provider (with a network of 85,000 providers and 83 million patients nationwide). So yes, it is possible to have a cloud data warehouse that is secure enough to meet HIPAA requirements for holding PHI (Protected Health Information).

In this video, Adam Weinstein, Executive Director of Analytics & Data Science explains how AthenaHealth leverages the Snowflake Cloud Data Warehouse service to radically accelerate their reporting with real-time updates, more advanced analytics, and machine-learning, while minimizing overhead and maintenance.

Some of the key benefits AthenaHealth experienced using Snowflake:

  • Ability to work with petabytes of healthcare data
  • Ability to scale to meet analytic needs both internally and externally
  • Lower total cost of ownership (TCO) than other options
  • Ability to support machine learning-based products
  • Reduction in overhead maintenance thanks to the Snowflake service offering

Says Adam:

What I see Snowflake enabling us to deliver to our clients, internal stakeholders and paying customers will be pretty freaking cool!

Iovation

Iovation is the leading SaaS provider of fraud prevention and multifactor authentication solutions. So needless to say, they know security and they feel very secure with their data in the cloud.

In this video, Kurk Spendlove, Director of Engineering, shares why they switched from Vertica to the Snowflake Cloud Data Warehouse service in order to load semi-structured data directly into the cloud data warehouse and analyze years of data in a matter of minutes.

Some of the key benefits Iovation experienced using Snowflake:

  • Ability to load semi-structured data directly into Snowflake
  • Loading schema-less data – not having to modify schema every time data is changing in new weekly releases
  • Ability to scan through years’ worth of data and have the report back in minutes
  • Powerful support for new machine learning-based products
  • Minimal data warehouse management and overhead

Kurk says:

I’m a big fan of Snowflake and the people behind it.

Rue La La

Rue La La is a flash sale site with over 18 million members looking for great deals on designer fashion and accessories.

Director of BI and Data Warehousing at Rue La La, Erick Roesch says:

Snowflake’s separation of compute and storage is just revolutionary!

In this video, he explains how they replaced their legacy data warehouse and Hadoop data lake with a Snowflake Cloud Data Warehouse to merge data sources for fast, data-driven business decisions.

Key benefits Rue La La saw from switching to Snowflake:

  • Merge different data sources for data-driven insights – a 360-degree view of their customers!
  • Better targeted marketing and promotions to Rue La La members based on their personalized preferences
  • Better purchasing decisions for the Merchandising and Planning departments – they can learn more about the context of a product and avoid residual inventory of things that don’t sell
  • All data in one place in real time – internal and external data feeds (demographic, census, geo-location data)
  • No admin and infrastructure costs
  • Streamlined development cycles – traditional development activities and processes become very simple

Sharethrough

Sharethrough is the leading global native advertising (adtech) platform. In this short video listen to the Head of Analytics, Joseph Bates, explain how they were able to drastically reduce query times, streamline complex processes, and build new data pipelines by switching from MySQL to the Snowflake Cloud Data Warehouse.

Some key benefits Sharethrough saw from using Snowflake:

  • Reduced query times from hours to seconds (before, basic queries took an hour to return)
  • Streamlined complex processes with minimal cost
  • “Query that used to take an entire weekend & $1,200 of compute time to run, now in Snowflake runs with bare minimum ETL, 4 lines of SQL in 30 seconds.”
  • Minimal database administration

Joseph’s conclusion:

The next step will be to see how we can build new data pipelines and meet the demands of our business, and I think Snowflake is unparalleled in this regard.

Cloud Data Warehousing is not just hype

Hopefully you can see from the passion and excitement of these customers that it is not all hype. The promise of the cloud combined with a next-generation SQL-based data warehouse engine is in fact delivering the goods.

I am even more excited about the possibilities now than when I joined a year ago. It is awesome to see what these and other companies are doing to transform their businesses and really challenge the status quo, not only in the data warehousing arena but in big data as well.

Cloud data warehousing is a game changer.

Maybe we can have it all?

If this tech excites you too, please share on social media with any and all who love data and want to change the story for enterprise data warehousing! And don’t forget to follow Snowflake on Twitter @snowflakedb for more customer success stories, upcoming webinars, and product announcements.

 

Kent

The Data Warrior

Cloud Data Warehousing for Dummies

As we all know, cloud is the big thing these days. Getting bigger every day, it seems.

It may get even bigger than Big Data!

If you, like me, are a data warehousing or BI professional, you have probably been wondering how this all fits in the cloud world. You may have even heard of data warehousing “in the cloud”.

But what does that really mean? What is a cloud data warehouse?

Well, thanks to Snowflake Computing, it just got a little easier to answer this question.

They sponsored the development of a new book called Cloud Data Warehousing for Dummies. Yup, an actual Dummies guide for this. And yes, yours truly got to have a hand in editing and writing the book.

And the best part – it is FREE!

Researching and helping to write the book was very educational for me. I learned a lot in the process about what constitutes a cloud data warehouse, the difference between a platform in the cloud and a real service in the cloud, and what characteristics folks should look for when choosing one.

I also learned to say “on-premises” instead of “on-premise.” 🙂

Content

The chapters of the book cover:

  • An introduction to cloud data warehousing
  • Why the modern data warehouse emerged
  • The criteria for selecting a modern data warehouse
  • On-premises vs cloud data warehousing
  • Comparing cloud data warehousing solutions
  • A six-step guide to choosing a cloud data warehouse

It also includes several real-world customer case studies.

Even though Snowflake sponsored the book, it is vendor agnostic. It really is a book designed to get you introduced to the concepts and to get you thinking about what you might want in a cloud-based data warehousing system.

It is ideal for anyone who is considering making that transition to the cloud.

So head on over to this site and download your FREE copy today!

To infinity and beyond!

Kent

The Data Warrior (with his head in the clouds)

P.S. Forward this to a friend so they can download a copy too!

 

 

Top 3 Tips for Staying Current in the Evolving World of Data Warehousing

The world of data warehousing and analytics has changed! With the advent of Big Data, Streaming Data, IoT, and The Cloud, what is a modern data warehousing professional to do? It may seem to be a very different world with different concepts, terms, and techniques. Or is it?

This is a question I ask myself all the time. So how do you keep up?

Here is what I do:

1 – Follow the Leaders

Yes, social media! Mostly, I use Twitter. I follow the industry thought leaders and analysts like Claudia Imhoff, Tamara Dull, Howard Dresner, Philip Russom, Cindi Howson, and many others. Not only do I see what they are thinking (and speaking) about, but I get to see what they are reading.

2 – Meet the Leaders

While reading books and online articles is great, there is nothing that replaces face to face communication. And the best way to do that is attend educational events where they are speaking. These days that could mean everything from local meet-ups, to regional conferences (like RMOUG), vendor roadshows, and larger annual events (like the recent Oracle OpenWorld).

For meet-ups, simply go to https://www.meetup.com/ and sign up (for free). You can search for meet-ups in your local area by topic. You may be surprised how many there are nearby and how often they have events. This is a great way to network with other professionals in your local community.

To learn from the industry leaders, look to larger national and international events. In the data warehousing and analytics world that means groups like The Data Warehousing Institute (TDWI). They have local chapters and run larger national events on a regular basis (the next one is in October in San Diego). Another group I am associated with is DAMA International, which also sponsors local chapters and national and international events.

And of course your vendors and solution providers may run their own events, like the Snowflake Cloud Analytics city tour.

3 – Be a Leader

Volunteer! Yes, by getting involved with these meet-ups, associations, and user groups, whether locally or nationally, you not only get to give back to the community, but you will often benefit by getting to know and speak with leaders one on one in a less formal environment.

Start off small by helping organize a meeting, or getting the refreshments. Help with the web site or the mail list. If the group you choose runs a conference, help with the paper selection process (you will learn a ton reading the abstracts). And then, when you are ready, become a speaker yourself. There is no better way to learn than to try to teach what you know to someone else.

I have been helping with user group conferences and events for nearly 30 years now and have never regretted a minute of the time spent.

 

So those are my top 3 tips for how you can stay fresh and informed and ahead of the game in this crazy world of data warehousing, big data, and the cloud.

Seems to be working for me.

Keep Learning!

Kent

The Data Warrior

P.S. One of our Snowflake customers, IAC Publishing Labs (owners of Ask.com), won the TDWI Best Practice award for the Emerging Technologies and Methods category and Keith Lavery will be speaking about the project at the TDWI Executive Summit in San Diego on Monday, October 3rd.

P.P.S. And don’t forget to follow some of the leaders at Snowflake like @bob_muglia and @jonb_snowflake.

 

Tech Tip: Quick Start for Getting Your Data into Snowflake

From my most recent blog about @SnowflakeDB:

If you are like me and fairly new to this whole cloud thing, then one of your main questions is likely:

“How do I get data from my desktop (or server) into Snowflake so I can query it?”

Which, in reality, translates to:

“How do I load data in the cloud?”

Read the rest of the post to see how: Tech Tip: Quick Start for Getting Your Data into Snowflake

Happy Data Loading!

Kent

The Data Warrior

Maintaining disabled FK’s, wisdom or farce?

A while back, I wrote a post about having FKs (foreign keys) in your data warehouse.

Well, a similar question came up recently on an Oracle forum with the above title. It is a fair question and it does surface fairly regularly in a variety of contexts (not just data warehousing).

Of course, as The Data Warrior, I felt it was my duty to respond.

The Question

Is there any reason to maintain a permanently disabled FK in the data model?  I’m not envisioning a reason to do it.  If it is not going to be enabled, then from my perspective, it would not make any sense to have it defined.  If anything, provide the definition of the relationship in the comment of the child column.

My Answer

Yes, by all means keep the FK please!

I see three good reasons for doing so:

  1. It is valuable metadata (& documentation). If somebody reverse engineers the database (say with ERWin or Oracle Data Modeler), the FK shows up in the diagram (way better than having to read a column comment to find out). A picture is worth a thousand words!
     [Figure: Data Vault 2.0 Example]

  2. BI Metadata – If you want to use any sort of reporting or BI tool against the database, most tools will import the FK definition with the tables and build the proper join conditions. Way better than having someone guess what the join will be and then manually adding it to the metadata layer in the reporting tool. Examples that can read the Oracle data dictionary include OBIEE, Business Objects, Cognos, Looker, and many others. (Note here that since the FK is not enforced on the remote databases, you might want to make sure these are treated as outer joins, lest you lose some transactions in the reports.)
  3. The Oracle optimizer will use disabled constraints (for example, ones declared with RELY) to improve query performance of joins. Again, this is metadata in the data dictionary which the optimizer can read. This is documented in the Oracle Database Data Warehousing Guide and I have validated it on multiple occasions with Oracle product management.

While #3 applies specifically to Oracle, for other databases like MS SQL Server and Snowflake, #1 and #2 still apply.

Even if only one of the above is true for a given database, that, in my opinion, still justifies keeping the disabled constraint around.
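Reasons #1 and #2 can be sketched in a few lines. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in (not Oracle), relies on SQLite leaving foreign keys unenforced when PRAGMA foreign_keys is OFF to mimic a disabled constraint, and the customer and orders tables and the join_conditions helper are hypothetical names made up for the example. It shows that an unenforced FK is still readable as metadata, that a tool could derive the join from it, and why an outer join is the safer choice when orphan rows may exist.

```python
import sqlite3

# SQLite stands in for the warehouse (assumption, not Oracle).
# With foreign_keys OFF the FK is declared but never enforced,
# loosely mimicking a permanently disabled constraint.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = OFF")
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer (customer_id),
        amount      REAL
    );
    INSERT INTO customer VALUES (1, 'Acme');
    INSERT INTO orders VALUES (100, 1, 25.0);
    INSERT INTO orders VALUES (101, 99, 40.0);  -- orphan row: no customer 99
""")


def join_conditions(conn, child):
    """Hypothetical helper: derive join predicates from FK metadata,
    roughly what a modeling or BI tool does when it imports a schema."""
    rows = conn.execute(f"PRAGMA foreign_key_list({child})").fetchall()
    # Row layout: (id, seq, parent_table, from_column, to_column, ...)
    return [f"{child}.{r[3]} = {r[2]}.{r[4]}" for r in rows]


# The unenforced FK is still visible in the catalog.
on_clause = join_conditions(conn, "orders")[0]
print(on_clause)

# An inner join silently drops the orphan order; a left outer join keeps it.
inner = conn.execute(
    f"SELECT COUNT(*) FROM orders JOIN customer ON {on_clause}"
).fetchone()[0]
outer = conn.execute(
    f"SELECT COUNT(*) FROM orders LEFT JOIN customer ON {on_clause}"
).fetchone()[0]
print(inner, outer)
```

The orphan order loads without complaint because nothing enforces the FK, which is exactly why the outer-join caution in #2 matters. Reason #3 (optimizer use of the constraint) is Oracle-specific and is not demonstrated by this sketch.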

Final Answer = Wisdom

What do you think? Feel free to comment below.

And please share on your favorite social media platform!

Model on!

Kent

The Data Warrior

 
