The Data Warrior

Changing the world, one data model at a time. How can I help you?


Okta SSO with Snowflake 

Ever wonder how to secure a cloud data warehouse? Well, Vlad from EA (Electronic Arts) has produced an entire blog just about using Snowflake. This is the first in a series he wrote, with detailed instructions on how to set up SSO on Snowflake for various tools, including Tableau. Check it out:

Once you made a decision (smart one!) to place your data warehousing and analytics activity onto the Snowflake platform, the next question would be how to make your data secure. Snowflake is really good …

Read the rest here: Okta SSO with Snowflake – Data Warehousing and Business Intelligence

Thanks Vlad!

Enjoy all!

Kent

The Data Warrior

Chief Technical Evangelist, Snowflake Computing

New Snowflake features released in Q2’17 

I have been busy lately preparing and delivering quite a few talks, so I got a bit behind on my blogging and reporting. In an effort to catch up, here are some details on recent developments at Snowflake:

Q2 2017 Features

It has been an incredible few months at Snowflake. Along with the introduction of self-service and numerous other features added in the last quarter, we have witnessed:

  • Exponential growth in our customer base, with large numbers of applications in full production.
  • Billions of analytical jobs successfully executed this year alone, petabytes of data stored in Snowflake today, and not a single failed deployment to date.
  • Strong interest in pushing the boundaries of data warehousing even further by allowing everyone in an organization to share, access, and analyze data.

Continuing to engage closely with our customers during this rapid growth period, we rolled out key new product capabilities throughout the second quarter.

Get the rest of the details here: New Snowflake features released in Q2’17

Cheers

Kent

The Snowflake Data Sharehouse. Wow!

Data Sharing for All Your Data

They say the Internet changed everything…

Then Big Data changed everything…

Then the Cloud changed everything…

Well, my friends, Snowflake's announcement of its new data sharing feature has changed the game again! Your data warehouse in the cloud can now be a data sharehouse.

Building on all these technology evolutions, Snowflake has taken what we can now do with big data in a cloud-native data warehouse to a whole new level by introducing what I like to think of as Data Sharing as a Service (DSaaS).

This may be my new #1 favorite feature of Snowflake.

What is Snowflake Data Sharing?

Snowflake Data Sharing is a new feature that lets you easily, seamlessly, and securely share tables, views, and even entire databases with anyone inside the Snowflake ecosystem, in read-only mode. They can then query the data from within their own Snowflake account and even join it to their own internal data as if it were all in their own database.

Snowflake Data Sharing architecture

That means no more needing to reformat and export data to flat files so they can be transmitted (via secure FTP or some other transfer protocol) and then loaded into your customer's or partner's database.

All that time and effort – gone!

Data extraction process – gone!

Data movement – gone!

Data latency – gone!

Extra storage – gone!

You create your database, load the data, then share the data. And once the data object is shared, as you add more data or update the data set, those changes are immediately available for the data consumers to query. No more wasted time waiting for an incremental update file to be built and transmitted.

And you have complete control over who sees what data. In fact, you can revoke anyone's access instantly with a single command.

Oh, did I mention that the new feature is FREE to all Snowflake customers? It is built into the Standard edition! (That's just crazy!)

How does it work?

The reason that only Snowflake can do this is its unique multi-cluster, shared-data architecture, which completely separates compute resources from storage. That is why the data can be stored once (by the data provider) and then shared with an unlimited number of data consumers. The global metadata and security services in Snowflake's cloud services layer are the key components that make sharing not only fast but also secure. With independent compute clusters (i.e., virtual warehouses), data consumers can use whatever amount of compute they require to query and use the shared data without impacting either the data provider or other data consumers.

So the basic process for data sharing is simple:

  1. Data Provider creates a share container with the objects (databases, schemas, tables, or views) to be shared.
  2. Data Provider then grants a Data Consumer account access to the share.
  3. Data Consumer creates a new database that maps to the shared object(s).
  4. Data Consumer then grants access privileges to a role in their account.
  5. Data Consumer starts querying, using the privileged role and their virtual warehouse.

Snowflake Data Sharing setup

Code examples:

Data Provider code:

Here is a scenario where the data provider wants to share just a single table in a database with several accounts. This approach allows the provider to verify the configuration and contents of the share before making it visible to other accounts (this is the recommended approach).

CREATE SHARE sales_s1; -- create an empty share

GRANT USAGE on DATABASE sales to SHARE sales_s1; -- add database

GRANT USAGE on SCHEMA sales.east to SHARE sales_s1; -- add schema

GRANT SELECT on TABLE sales.east.new_orders 
             to SHARE sales_s1; -- add table

SHOW SHARES; -- verify the share and its settings before adding accounts

ALTER SHARE sales_s1 ADD ACCOUNTS=a1, a2, a3; -- add accounts
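Before adding accounts, it can also be worth confirming exactly which objects made it into the share. Here is a quick sketch, assuming the share name above and that DESC SHARE is available in your account:

DESC SHARE sales_s1; -- list the database, schema, and table granted to the share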

Data Consumer code:

On the consumer side, each account would create a database from the share sales_s1 and then grant a role access to the new database in order to query the table NEW_ORDERS.

CREATE DATABASE External_SalesData from SHARE ProviderAcct1.sales_s1;

GRANT IMPORTED PRIVILEGES on DATABASE External_SalesData to MyRole;
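To round out step 5 from the list above, here is a minimal sketch of what the consumer side might look like. The warehouse name consumer_wh is purely illustrative; the shared schema and table keep the names the provider gave them, just under the new database name:

CREATE WAREHOUSE IF NOT EXISTS consumer_wh WAREHOUSE_SIZE = 'XSMALL'; -- the consumer's own compute

USE ROLE MyRole; -- the role that received IMPORTED PRIVILEGES

USE WAREHOUSE consumer_wh;

SELECT COUNT(*) FROM External_SalesData.east.new_orders; -- read-only query against the shared table

Because the consumer runs on their own virtual warehouse, this query consumes none of the provider's compute and has no impact on other consumers of the same share.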

Security – Revoking a Share

If for some reason a Data Provider needs to stop sharing their data, either with a single account or with everyone, that is also easy to do. They can either REVOKE the privileges granted or completely DROP the share.

REVOKE SELECT ON TABLE sales.east.new_orders
  FROM SHARE sales_s1;

or just

DROP SHARE sales_s1;
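And when the goal is only to cut off a single account rather than everyone, the account can simply be removed from the share's account list. Here is a sketch using the same names as above (a2 is one of the accounts added earlier):

ALTER SHARE sales_s1 REMOVE ACCOUNTS=a2; -- only account a2 loses access; a1 and a3 keep it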

Unlimited Possibilities for the New Data Economy

So, how can your business change and grow with this capability (that costs you nothing)? Do you have partners that have wanted access to your data but found it too difficult to engineer that data pipeline? Is there a market for your data, and the insights it provides, that you have not even explored?

This feature redefines the old Data Warehouse into a modern Data Sharehouse that lets you derive even more value from all your data – with no limits.

With Snowflake Data Sharing, you can now transform your data into a valuable, strategic business asset.

For More Information

For more details on Snowflake Data Sharing, check out these posts:

https://www.snowflake.net/data-sharehouse-brings-forth-new-market/

https://www.snowflake.net/data-sharehouse/

Then download the free ebook “From Data Warehouse to Data Sharehouse” for an even more in-depth look at Snowflake Data Sharing.

And sign up for the live webinar “A Deeper Look at Data Sharing” coming next week.

So what do you think? How could this change your business?

Cheers.

Kent

The Data Warrior

Snowflake and Spark, Part 2: Pushing Spark Query Processing to Snowflake

Here is the latest post on using Spark and the Snowflake cloud-native data warehouse.

Welcome to the second post in our ongoing blog series describing Snowflake’s integration with Spark. In Part 1, we discussed the value of using Spark and Snowflake together to power an integrated data processing platform, with a particular focus on ETL scenarios.

In this post, we change perspective and focus on performing some of the more resource-intensive processing in Snowflake instead of Spark, which results in significant performance improvements. As part of this, we walk you through the details of Snowflake’s ability to push query processing down from Spark into Snowflake. We also touch on how this pushdown can help you transition from a traditional ETL process to a more flexible and powerful ELT model.

Read the rest: Snowflake and Spark, Part 2: Pushing Spark Query Processing to Snowflake

Enjoy!

Kent

The Data Warrior

Cloud Analytics Conference – London!

Next up on The Data Warrior speaking tour 2017 is the Snowflake Cloud Analytics Conference in London on June 1st!


Snowflake is kicking off this year's Cloud Analytics City Tour with a blowout event in London, England. This will be a full-day, workshop-style event where you get to hear and learn from industry veterans and thought leaders like myself and the CEO of Snowflake Computing, Bob Muglia (to name just a few). In addition, we will have a Practitioner Panel discussion that includes several of our customers along with other industry thought leaders.

The unique value proposition for this event is that in the afternoon you can choose from two tracks of in-depth sessions related to implementing your BI solutions and your data warehouse in the cloud.

I will be presenting my talk “Agile Methods and Data Warehousing: How to Deliver Faster.” My highly seasoned colleagues from Snowflake (all industry experts) will teach you about loading data in the cloud, deploying BI in the cloud, and how to best use Snowflake to be successful with your cloud analytics program.

And of course there will be food, drinks, and networking.

You can find all the agenda details here, along with the registration form. Use discount code DATAWARRIOR for 50% off the registration fee. Sign up today!

This will be my first time ever in London, so if you are in the area, please come by, say “hi” and learn about the new world of Cloud Analytics.

Until then, cheers!

Kent

The Data Warrior

P.S. I will be in London the day before and after the event, so if you want to have a more detailed or personalized discussion of the benefits of cloud-native data warehousing, please reach out to me at kent.graziano@snowflake.net.
