The Data Warrior

Changing the world, one data model at a time. How can I help you?

Archive for the tag “advanced analytics”

Snowflake and Spark, Part 2: Pushing Spark Query Processing to Snowflake

Here is the latest post on using Spark and the Snowflake cloud-native data warehouse.

Welcome to the second post in our ongoing blog series describing Snowflake’s integration with Spark. In Part 1, we discussed the value of using Spark and Snowflake together to power an integrated data processing platform, with a particular focus on ETL scenarios.

In this post, we change perspective and focus on performing some of the more resource-intensive processing in Snowflake instead of Spark, which results in significant performance improvements. As part of this, we walk you through the details of Snowflake’s ability to push query processing down from Spark into Snowflake. We also touch on how this pushdown can help you transition from a traditional ETL process to a more flexible and powerful ELT model.

Read the rest: Snowflake and Spark, Part 2: Pushing Spark Query Processing to Snowflake

Enjoy!

Kent

The Data Warrior

#Kscope16 Blog Hop: #BigData and #AdvancedAnalytics Sessions Not to Miss

You are attending #KScope16, right?

Me too.

But with so many sessions to choose from (mine included), which do you pick? And how do you pick?

Well, my fellow bloggers and I are here to help you out with a Blog Hop. We are going to give you our top picks for each track. In this post, I will give you my picks for the Big Data and Advanced Analytics track.

Big Data and Advanced Analytics Sessions

Why did I pick this track? Really because it is a necessary adjunct to BI and Data Warehousing. In fact, I find it hard to imagine that these two won’t merge over the next few years (at my company, Snowflake, they really already have). Every company that is investing in BI/DW is also finding that it needs to deal with Big Data too. And Advanced Analytics is, to me, the logical extension of BI.

So after looking at the agenda, really most of the sessions are of interest to me (sigh). But in reality I am sure I will not be able to attend them all, so here are my top 5 picks to see at KScope16:

  1. How to Build an Internet of Things Data Pipeline presented by Rex Eng
  2. Oracle Big Data Discovery: Extending into Machine Learning and Advanced Visualizations presented by Mark Rittman
  3. Introduction to Apache Kafka and Real-Time ETL presented by Gwen Shapira
  4. Getting Started with a Data Discovery Lab: You Don’t Have to Go Big to Gain Big presented by Kathryn Watson
  5. Getting Started with Oracle R and OBIEE presented by Kevin McGinley
Why those? Simply because they hit on all the top issues and topics that I see being discussed (or written about) in the field, and I need to get a better grip on these things:
  • IoT – it is here already
  • Machine Learning – I am pretty clueless about this one so far
  • Kafka – ETL/ELT in the cloud
  • Data Discovery – the next step beyond BI
  • R – the language of choice for data scientists

And I actually know all but one of the presenters, so I am sure they will be very informative and lively talks.

The rest of the blog hop:

Thanks for attending this ODTUG blog hop!

Looking for some other juicy cross-track sessions to make your Kscope16 experience more educational? Check out the following session recommendations from fellow experts!

I hope this gives you some great ideas on what to see at KScope16!

See you in Chicago.

Kent

The Data Warrior

P.S. Don’t forget to make time to attend my Morning Chi Gung sessions down by the river to get each day started right with a clear mind and strong heart. Look for signs at the hotel.

 
