The Data Warrior

Changing the world, one data model at a time. How can I help you?

Archive for the tag “#bigdata”

What is Modern Data Virtualization?

If you have been wondering about #DataVirtualization in this age of #cloud and AI, wonder no more!

Join my friends from Zetaris for a Virtualization 101 workshop – where knowledge meets excitement! Register here!

When you register, you will be entered to win one of three amazing prizes, including the newly released Gen 3 AirPods as the main prize. The winners will be announced during the event.

Data Virtualization is the future of a multi-cloud and distributed data world. AI needs data from different sources, in real-time, to take it to the next level towards General AI. Join this webinar to get insights into what data virtualization can do for your organization today.

Date: March 15th, 2023

Time: 12pm-1pm PST

Cost: FREE

Topics: 

  • What is Data Virtualization?
  • What is the difference between Analytical Data Virtualization and Operational Data Virtualization?
  • How does the traditional approach to building data platforms (data layers that support the business) change to a modern, decentralized data approach (a Networked Data Platform)?
  • What are the use cases?

Sign up now so you don’t miss out.

Ciao!

Kent

The Data Warrior

The Elephant in the Data Lake and Snowflake

So is Hadoop finally dead? For many use cases, I think it really is. The cloud and the continued evolution of technology have created newer, better ways of working with data at scale. Check out what Jeff has to say about it!

Jeffrey Jacobs, Consulting Data Architect, Snowflake SnowPro Core Certified

Let’s talk about the elephant in the data lake, Hadoop, and the constant evolution of technology.

Hadoop (symbolized by an elephant) was created to handle massive amounts of raw data that were beyond the capabilities of existing database technologies. At its core, Hadoop is simply a distributed file system. There are no restrictions on the types of data files that can be stored, but the primary file contents are structured and semi-structured text. “Data lake” and Hadoop have been largely synonymous, but, as we’ll discuss, it’s time to break that connection with Snowflake’s cloud data warehouse technology.

Hadoop’s infrastructure requires a great deal of system administration, even in cloud managed systems. Administration tasks include replication, adding nodes, creating directories and partitions, performance, workload management, data (re-)distribution, etc. Core security tools are minimal, often requiring add-ons. Disaster recovery is another major headache. Although Hadoop is considered a “shared nothing” architecture, all…

Snowflake and Spark, Part 1: Why Spark? 

Snowflake Computing is making great strides in the evolution of our Elastic DWaaS in the cloud. Here is a recent update from engineering and product management on our integration with Spark:

This is the first post in an ongoing series describing Snowflake’s integration with Spark. In this post, we introduce the Snowflake Connector for Spark (package available from Maven Central or Spark Packages, source code on GitHub) and make the case for using it to bring Spark and Snowflake together to power your data-driven solutions.

Read the rest of the post: Snowflake and Spark, Part 1: Why Spark?

Enjoy!

Kent

The Data Warrior

Cloud Data Warehousing for Dummies

As we all know, cloud is the big thing these days. Getting bigger every day, it seems.

It may get even bigger than Big Data!

If you, like me, are a data warehousing or BI professional, you have probably been wondering how this all fits in the cloud world. You may have even heard of data warehousing “in the cloud”.

But what does that really mean? What is a cloud data warehouse?

Well, thanks to Snowflake Computing, it just got a little easier to answer this question.

They sponsored the development of a new book called Cloud Data Warehousing for Dummies. Yup, an actual Dummies guide for this. And yes, yours truly got to have a hand in editing and writing the book.

And the best part – it is FREE!

Researching and helping to write the book was very educational for me. I learned a lot in the process about what constitutes a cloud data warehouse, the difference between a platform in the cloud and a real service in the cloud, and what characteristics folks should look for when choosing one.

I also learned to say “on-premises” instead of “on-premise.” 🙂

Content

The chapters of the book cover:

  • An introduction to cloud data warehousing
  • Why the modern data warehouse emerged
  • The criteria for selecting a modern data warehouse
  • On-premises vs cloud data warehousing
  • Comparing cloud data warehousing solutions
  • A six-step guide to choosing a cloud data warehouse

It also includes several real-world customer case studies.

Even though Snowflake sponsored the book, it is vendor-agnostic. It is designed to introduce you to the concepts and to get you thinking about what you might want in a cloud-based data warehousing system.

It is ideal for anyone who is considering making that transition to the cloud.

So head on over to this site and download your FREE copy today!

To infinity and beyond!

Kent

The Data Warrior (with his head in the clouds)

P.S. Forward this to a friend so they can download a copy too!

Drill to Detail Podcast: Data Modeling, Data Vault, and Snowflake!

My good friend Mark Rittman has embarked on a new adventure as an independent analyst and consultant. As part of his new venture, Mark started a new podcast series on iTunes that he calls Drill to Detail, where he will feature interviews discussing a range of topics related to data warehousing, business intelligence, analytics, and big data.

I was honored to be asked to take part in this new venture and got to spend an hour with Mark a few weeks back recording what is now Episode 5 of the series. In this interview we talk about data modeling, Data Vault, and Snowflake.

The podcast is about 60 minutes with each topic being about 20 minutes (so feel free to skip ahead if you are short on time). Please have a listen and let us know what you think in the comments below.

Cheers!

Kent

The Data Warrior

P.S. I will be speaking on these and related topics at a bunch of events over the next few weeks. Check out my speaking schedule and join me in person if you can!
