What Is Snowflake Cloud Data Platform – A Beginner’s Guide to Key Concepts and Architecture

Snowflake, founded in 2012, is a cloud computing company that provides a cloud-based data storage and analytics service. It is commonly referred to as “data-as-a-service,” and its consumption-based (usage-based) pricing model is attracting many CIOs. In fact, around half of the Fortune 500 companies now use Snowflake.

Snowflake is becoming very popular due to the following reasons:

  • Consumption-based (usage-based) pricing model

  • Cloud agnostic: it runs on Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP)

  • Supports both structured and semi-structured data

  • Provides data sharing with both internal and external users, including non-Snowflake users (through Reader accounts)

  • Improved concurrency and accessibility compared with older data warehouse architectures

  • Improved performance and speed

The purpose of this article is to provide a high-level architecture understanding and to introduce you to key Snowflake concepts.

High-Level Architecture

The high-level architecture (see diagram above) consists of the following:

  • Data Sources. These can include enterprise business or legacy applications, third-party data sources, web data, existing online transaction processing (OLTP) databases, etc.

  • Snowflake Cloud Data platform. Snowflake is a cloud-based data platform that’s provided as a fully managed service. It can be used for data warehousing, data lakes, data engineering, data analytics, data science, data application development, and for securely sharing and consuming shared data.

  • Amazon, Google, or Microsoft Cloud Layer. Snowflake runs on Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP). Snowflake cannot run on a private cloud infrastructure, either on-premises or hosted.

  • Data Application Layer. Various types of data applications, such as a data warehouse, data lake, data marts, data science apps, or other data engineering apps, can be built on the Snowflake platform.

  • Business or Data Consumers. Data consumers are typically functional business user groups within an organization, ad-hoc users, operational reporting users, accounting users, or real-time analytics users.

  • Data Analytics or Visualization Layer. Several types of analytics and visualization tools are often used with Snowflake, including SQL Server Reporting Services (SSRS), Excel, and Tableau.

Key Snowflake Concepts

  • What is Snowflake Cloud Data Platform? Snowflake is a cloud data platform that’s provided as a fully managed service. It can be used for data warehousing, data lakes, data marts, data engineering, data analytics, data science, data application development, and for securely sharing and consuming shared data. It has been architected in a way that supports a near-unlimited number of concurrent workloads, so all users within an organization can be supported.

  • What is the Snowflake Cloud Data Platform Architecture? See the high-level architecture diagram and description above for an overview. Snowflake has a unique architecture. The platform includes storage, computing, and cloud services layers that are physically separated but logically integrated. This allows virtually all users and data workloads to access a single copy of your data without impacting performance. With only a single copy of your data, there are no data silos or data integrity issues. Everyone has the same source of truth.

  • What is a consumption-based (usage-based) pricing model? Snowflake charges by credits. A Snowflake credit is a unit of measure used to pay for the consumption of resources on Snowflake. Credits are consumed only when a customer is using resources, such as when a virtual warehouse is running, the cloud services layer is being used, or other Snowflake features are being used.

  • What is meant by scaling elastically? Snowflake provides automatic cloud elasticity, so when you need more capacity, Snowflake can add it automatically and release it again when demand drops. You only pay for what you use.

  • What cloud service providers does Snowflake run on and what is meant by “cloud agnostic”? Snowflake has been running on Amazon Web Services (AWS) since 2014, Microsoft Azure since 2018, and Google Cloud Platform (GCP) since 2019. Unlike other cloud data warehouses, Snowflake doesn’t run on its own cloud. In fact, Snowflake cannot run on a private cloud infrastructure, either on-premises or hosted. Because it has a common, interchangeable code base, you can move your data to any of the three cloud providers, in any region/geography, without having to rewrite application code.

  • What is provided by the cloud services layer? The cloud services layer provides services such as authentication, infrastructure management, and access control. It also provides metadata management. As mentioned previously, it is written in a way that allows you to move your data from one cloud provider to another.

  • What is unique about Snowflake’s storage and computing model? Snowflake’s architecture separates storage from computing, so users are not competing for resources. Also, there are no limits on the number of queries or workloads that can be run simultaneously, and no limits on the number of users accessing data. All workloads can simultaneously leverage the computing power they need, when they need it.

  • What is meant by Snowflake’s hybrid architecture? Snowflake’s architecture is a hybrid of shared-disk and shared-nothing architectures. Shared-nothing architecture is a distributed architecture, where each node is independent and self-sufficient. In shared-disk architectures, all data is accessible from all cluster nodes. Snowflake combines these two architectures, using a central data repository for persisted data that is accessible from all compute nodes. When processing queries, Snowflake uses massively parallel processing (MPP) compute clusters, and each node in the cluster stores a portion of the data set locally. With this hybrid model, Snowflake has the data management simplicity of a shared-disk architecture plus the performance benefits of a shared-nothing architecture.
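
To make the hybrid architecture idea more concrete, here is a minimal Python sketch. It is a toy model and not Snowflake code: CENTRAL_STORAGE, ComputeNode, and parallel_sum are hypothetical names invented for this illustration. A single central repository holds all of the data (the shared-disk part), while each compute node copies only its own portion locally and works on it independently (the shared-nothing part).

```python
from concurrent.futures import ThreadPoolExecutor

# Shared-disk side: one central repository that every compute node can read.
# The partition names and values are made up for this illustration.
CENTRAL_STORAGE = {
    "part-0": [3, 1, 4],
    "part-1": [1, 5, 9],
    "part-2": [2, 6, 5],
    "part-3": [3, 5, 8],
}


class ComputeNode:
    """Hypothetical compute node that works only on its own local portion."""

    def __init__(self, name):
        self.name = name
        self.local_cache = {}  # shared-nothing side: data held by this node only

    def load(self, partition_ids):
        # Copy the assigned partitions from central storage into the local cache.
        for pid in partition_ids:
            self.local_cache[pid] = list(CENTRAL_STORAGE[pid])

    def partial_sum(self):
        # Each node aggregates its own portion without talking to other nodes.
        return sum(sum(rows) for rows in self.local_cache.values())


def parallel_sum(node_count):
    """Split the partitions across nodes, aggregate in parallel, then combine."""
    nodes = [ComputeNode(f"node-{i}") for i in range(node_count)]
    partition_ids = sorted(CENTRAL_STORAGE)
    for i, node in enumerate(nodes):
        # Round-robin assignment: node i gets every node_count-th partition.
        node.load(partition_ids[i::node_count])
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        partials = list(pool.map(ComputeNode.partial_sum, nodes))
    return sum(partials)
```

Calling parallel_sum(2) or parallel_sum(4) returns the same total, because every node reads from the same single copy of the data. Only the amount of work done per node changes.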

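The credit-based pricing model described above can be sketched as a small calculation. This is a rough estimate only, under stated assumptions: the credits-per-hour table reflects standard virtual warehouse sizes, where each size step doubles the rate, and per-second billing with a 60-second minimum each time a warehouse starts. The price per credit is a hypothetical input, since it varies by edition, region, and contract. Check current Snowflake pricing before relying on any of these numbers.

```python
# Assumed credits per hour for standard virtual warehouse sizes.
# Each size step doubles the hourly credit rate.
CREDITS_PER_HOUR = {
    "XSMALL": 1,
    "SMALL": 2,
    "MEDIUM": 4,
    "LARGE": 8,
    "XLARGE": 16,
}

# Assumption: billing is per second, with a 60-second minimum per start.
MIN_BILLED_SECONDS = 60


def credits_used(size, seconds_running):
    """Estimate the credits consumed by one run of a warehouse."""
    billed_seconds = max(seconds_running, MIN_BILLED_SECONDS)
    return CREDITS_PER_HOUR[size] * billed_seconds / 3600


def estimated_cost(size, seconds_running, price_per_credit):
    """Convert credits to currency; price_per_credit is a hypothetical rate."""
    return credits_used(size, seconds_running) * price_per_credit
```

For example, a MEDIUM warehouse that runs for 30 minutes consumes credits_used("MEDIUM", 1800) = 2.0 credits. A warehouse that is suspended consumes no warehouse credits at all, which is the heart of the usage-based model.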

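Elastic scaling, as described above, can also be pictured with a simple rule of thumb. The sketch below is a toy model and not Snowflake's actual scaling policy, which this article does not describe. The function name and all parameters (queries_per_cluster, min_clusters, max_clusters) are hypothetical and chosen only to show capacity growing with demand and shrinking again within configured bounds.

```python
import math


def clusters_needed(concurrent_queries, queries_per_cluster,
                    min_clusters=1, max_clusters=4):
    """Toy scale-out rule: enough clusters for the load, within fixed bounds."""
    if queries_per_cluster <= 0:
        raise ValueError("queries_per_cluster must be positive")
    # Clusters required to serve the current load without queuing.
    wanted = math.ceil(concurrent_queries / queries_per_cluster)
    # Never go below the configured minimum or above the configured maximum.
    return max(min_clusters, min(wanted, max_clusters))
```

With 8 queries per cluster, 20 concurrent queries would call for 3 clusters, while 100 concurrent queries would be capped at the maximum of 4. When the load falls back to zero, the model drops to the minimum of 1, so you stop paying for the extra capacity.
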
Learn More

Snowflake is gaining popularity for the reasons described in this article. If you would like to discuss migrating to Snowflake or the benefits it could deliver for your company, please click the “Learn More” button below.

DRIFT IQ LLC DBA T EXPONENTS™ PRIVATE, PROPRIETARY AND CONFIDENTIAL. COPYRIGHT © 2023 T EXPONENTS™. ALL RIGHTS RESERVED.