The Data Warrior

Changing the world, one data model at a time. How can I help you?

Archive for the tag “Spark”

Snowflake and Spark, Part 2: Pushing Spark Query Processing to Snowflake

Here is the latest post on using Spark and the Snowflake cloud-native data warehouse.

Welcome to the second post in our ongoing blog series describing Snowflake’s integration with Spark. In Part 1, we discussed the value of using Spark and Snowflake together to power an integrated data processing platform, with a particular focus on ETL scenarios.

In this post, we change perspective and focus on performing some of the more resource-intensive processing in Snowflake instead of Spark, which results in significant performance improvements. As part of this, we walk you through the details of Snowflake’s ability to push query processing down from Spark into Snowflake. We also touch on how this pushdown can help you transition from a traditional ETL process to a more flexible and powerful ELT model.

Read the rest: Snowflake and Spark, Part 2: Pushing Spark Query Processing to Snowflake

Enjoy!

Kent

The Data Warrior

Snowflake and Spark, Part 1: Why Spark? 

Snowflake Computing is making great strides in the evolution of our Elastic DWaaS in the cloud. Here is a recent update from engineering and product management on our integration with Spark:

Spark

This is the first post in an ongoing series describing Snowflake’s integration with Spark. In this post, we introduce the Snowflake Connector for Spark (package available from Maven Central or Spark Packages, source code on GitHub) and make the case for using it to bring Spark and Snowflake together to power your data-driven solutions.

Read the rest of the post: Snowflake and Spark, Part 1: Why Spark?

Enjoy!

Kent

The Data W
