Snowflake Computing is making great strides in the evolution of our Elastic DWaaS in the cloud. Here is a recent update from engineering and product management on our integration with Spark:
This is the first post in an ongoing series describing Snowflake’s integration with Spark. In this post, we introduce the Snowflake Connector for Spark (package available from Maven Central or Spark Packages, source code on GitHub) and make the case for using it to bring Spark and Snowflake together to power your data-driven solutions.
Next up on the Data Warrior speaking schedule is the St. Louis SilverLinings event on May 2nd. It will be held at the St. Charles Convention Center, St. Louis, MO.
This promises to be a very exciting event boasting “edgy” and forward-looking technical topics. It’s going to be a very busy day for me, with three talks in total on some of my favorite topics.
Hey fellow data warriors! Here is a new joint blog post I just did with fellow data warrior Dale Anderson from Talend! Check it out. I hope you find the concept compelling!
So you want to build a Data Lake? Ok, sure let’s talk about that. Perhaps you think a Data Lake will eliminate the need for a Data Warehouse and all your business users will merely lure business analytics from it easily. Maybe you think putting everything into Big Data technologies like Hadoop will resolve all your data challenges and deliver fast data processing with Spark delivering cool Machine Learning insights that magically give you a competitive edge. And really, with NoSQL, nobody needs a data model anymore, right?
Avoid the data swamp! Use a modern cloud-based DWaaS (Snowflake) and a leading-edge data integration tool (Talend) to build a Governed Data Lake.
Big Data. NoSQL. The Cloud. Self-service<whatever>.
And Cloud Data Warehousing.
Some of the offerings and solutions are real. Some less so.
Newest on the scene is cloud data warehousing (or data warehousing in the cloud). As with all new tech, there are a variety of offerings out there with different characteristics. To help folks understand the space a bit more, the company I work for (Snowflake Computing) put together a (hopefully) hype-free, vendor-agnostic book on the topic called Cloud Data Warehousing for Dummies, which I blogged about last month. If you have not already gotten a copy and read it, I encourage you to do so soon. I think you will find it very helpful in the coming months as this topic heats up.
It is where data warehousing is going. Period.
But is Cloud Data Warehousing really for real?
I may be biased here (okay, likely), but based on my experience working with Snowflake for over a year now, I have to say yes. Emphatically, yes!
Cloud Data Warehousing is real. It can handle real data and real workloads, to the tune of hundreds of terabytes and even petabytes of structured and semi-structured data, all for a fraction of the cost of traditional on-premises data warehouse solutions, and with the ease of administration you expect from a cloud-based SaaS solution.
But, as they say, the proof is in the pudding!
So here are a few proof points for you from real, live customers who have been using Snowflake to improve their business outcomes.
AthenaHealth
AthenaHealth is a leading healthcare services provider (with a network of 85,000 providers and 83 million patients nationwide). So yes, it is possible to have a cloud data warehouse that is secure enough to meet HIPAA requirements for holding PHI (Protected Health Information).
In this video, Adam Weinstein, Executive Director of Analytics & Data Science, explains how AthenaHealth leverages the Snowflake Cloud Data Warehouse service to radically accelerate their reporting with real-time updates, more advanced analytics, and machine learning, while minimizing overhead and maintenance.
Some of the key benefits AthenaHealth experienced using Snowflake:
Ability to work with petabytes of healthcare data
Ability to scale to meet analytic needs both internally and externally
Lower total cost of ownership (TCO) than other options
Ability to support machine learning-based products
Reduction in overhead maintenance thanks to the Snowflake service offering
Says Adam:
What I see Snowflake enabling us to deliver to our clients, internal stakeholders and paying customers will be pretty freaking cool!
Iovation
Iovation is the leading SaaS provider of fraud prevention and multifactor authentication solutions. So, needless to say, they know security, and they feel very secure with their data in the cloud.
In this video, Kurk Spendlove, Director of Engineering, shares why they switched from Vertica to the Snowflake Cloud Data Warehouse service in order to load semi-structured data directly into the cloud data warehouse and analyze years of data in a matter of minutes.
Some of the key benefits Iovation experienced using Snowflake:
Ability to load semi-structured data directly into Snowflake
Loading schema-less data without having to modify the schema every time the data changes in new weekly releases
Ability to scan through years’ worth of data and have the report back in minutes
Powerful support for new machine learning-based products
Minimal data warehouse management and overhead
Kurk says:
I’m a big fan of Snowflake and the people behind it.
Rue La La
Rue La La is a flash sale site with over 18 million members looking for great deals on designer fashion and accessories.
Erick Roesch, Director of BI and Data Warehousing at Rue La La, says:
Snowflake’s separation of compute and storage is just revolutionary!
In this video, he explains how they replaced their legacy data warehouse and Hadoop data lake with a Snowflake Cloud Data Warehouse to merge data sources for fast, data-driven business decisions.
Key benefits Rue La La saw from switching to Snowflake:
Merged data sources for data-driven insights: a 360-degree view of their customers!
Better targeted marketing and promotions to Rue La La members based on their personalized preferences
Better purchasing decisions for the merchandising and planning departments: they can learn more about the context of a product and avoid residual inventory of things that don’t sell
All data in one place in real time: internal and external data feeds (demographic, census, geo-location data)
No admin and infrastructure costs
Streamlined development cycles: traditional development activities and processes become very simple
Sharethrough
Sharethrough is the leading global native advertising (adtech) platform. In this short video, the Head of Analytics, Joseph Bates, explains how they were able to drastically reduce query times, streamline complex processes, and build new data pipelines by switching from MySQL to the Snowflake Cloud Data Warehouse.
Some key benefits Sharethrough saw from using Snowflake:
Reduced query times from hours to seconds (before, basic queries took an hour to return)
Streamlined complex processes at minimal cost
“Query that used to take an entire weekend & $1,200 of compute time to run, now in Snowflake runs with bare minimum ETL, 4 lines of SQL in 30 seconds.”
Minimal database administration
Joseph’s conclusion:
The next step will be to see how we can build new data pipelines and meet the demands of our business, and I think Snowflake is unparalleled in this regard.
Cloud Data Warehousing is not just hype
Hopefully you can see from the passion and excitement of these customers that it is not all hype. The promise of the cloud combined with a next-generation SQL-based data warehouse engine is in fact delivering the goods.
I am even more excited about the possibilities now than when I joined a year ago. It is awesome to see what these and other companies are doing to transform their businesses and challenge the status quo, not only in the data warehousing arena but in big data as well.
Cloud data warehousing is a game changer.
Maybe we can have it all?
For even more exciting customer stories check out the Snowflake channel on YouTube.
If this tech excites you too, please share on social media with any and all who love data and want to change the story for enterprise data warehousing! And don’t forget to follow Snowflake on Twitter @snowflakedb for more customer success stories, upcoming webinars, and product announcements.
It is my one-year anniversary of becoming the tech evangelist for Snowflake Computing!
Hard to believe that a year ago I gave up independent consulting and joined this amazing team in San Mateo. While there has been a lot of travel recently with my speaking schedule, I have gotten to learn a ton about big data, the cloud, and modern issues in the world of BI and analytics. Things I would have missed out on had I kept doing what I had been doing.
And the more I speak at events and talk to prospects, customers, analysts, and data architects like me, the more convinced I am that Snowflake has NAILED it! What we have built here is truly unique in the industry and is led by a truly genius team of database engineers. You should come work here too! Tell them the Data Warrior sent you!
So, happy anniversary to me and here is yet another post I did about one of our great features.
Snowflake Information Schema
Snowflake has a data dictionary that we expose to users. We call it the Information Schema. This post will give you some examples of how to use it. Enjoy!