© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2022
A. Morton, Mastering Snowflake Solutions, https://doi.org/10.1007/978-1-4842-8029-4_4

4. Managing Security and Access Control

Adam Morton, Sydney, NSW, Australia

A fundamental pillar of good data management is having a robust approach to data access and security controls. In today’s world, securing, managing, and auditing access to sensitive data is paramount to keeping data safe.

As regulatory and IT security teams struggle to keep up with the rate of change, technology providers continue to offer new and innovative ways of supporting these efforts. Snowflake is no different. It offers a range of interesting capabilities in this space.

In this chapter, you will explore security and access management in Snowflake. You’ll start by understanding the role hierarchy along with how privileges are granted and inherited. You’ll also cover how to extend the out-of-the-box functionality by creating your own custom roles.

You’ll build upon these fundamental principles and investigate how Snowflake supports authentication and single sign-on (SSO) use cases. You’ll consider this from an end user perspective. You’ll also consider programmatic access for applications using service accounts.

You’ll move on from authentication to touch upon the network controls Snowflake provides and how to uplift this functionality if your organization dictates it.

I’ll introduce you to a neat approach to handling personally identifiable information (PII). Finally, I’ll wrap things up by discussing some advanced Snowflake security features, providing you with the comprehensive knowledge you need to ensure your Snowflake implementation is secure.

Roles

A role is a collection of privileges on one or more objects within your Snowflake account. There are five predefined, system-defined roles, as detailed in Table 4-1.
Table 4-1

Predefined System Roles

Role

Description

ACCOUNTADMIN

The most powerful role available, it administers the Snowflake account. This role can view all credit and billing information. The other admin roles are children of this role: SYSADMIN and SECURITYADMIN.

Snowflake recommends limiting the use of this role and restricting access to a minimum set of users. You should avoid setting this role as a default for any user. Additionally, I strongly recommend configuring multi-factor authentication (MFA) for any user granted this role, which I’ll cover later in this chapter.

SYSADMIN

This role can create warehouses and databases and all objects within a database (schemas, tables, views, etc.)

SECURITYADMIN

This role is designed for the administration of security. This includes the management of granting or revoking privileges to roles.

USERADMIN

This role is used for creating roles and users and managing the privileges assigned to them.

The USERADMIN role is a child of the SECURITYADMIN role.

PUBLIC

The PUBLIC role is a default role that is automatically granted to all users. It provides the privileges to log into Snowflake along with some basic object access.

These system-defined roles provide a starting point for you to build upon to fit the needs of your own organization. Custom roles can be added to the hierarchy (using the SECURITYADMIN role) and rolled up to the SYSADMIN role. Customizing and extending the hierarchy is something I’ll cover in more detail later in this chapter.

As a user, you can be a member of more than one role, which is typical across a range of applications you might have used previously. Some systems evaluate all the roles your user is assigned in order to determine your permissions before performing a requested action. Snowflake is different: it requires you to select a single current role before you run any queries.

For example, let’s say you create a new warehouse using the SYSADMIN role and now you need to create a new role for a user. If you attempt to perform this operation using the SYSADMIN role, you’ll get an error similar to
SQL access control error: Insufficient privileges to operate on account 'XXXXXX'

You need to switch to either the SECURITYADMIN or USERADMIN role to perform this action. This behavior forces users to follow role-based access control tightly.
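A minimal sketch of this role switch follows; the role name REPORTING_ANALYST is purely illustrative:

```sql
-- Warehouses and databases can be created under SYSADMIN,
-- but creating roles requires USERADMIN (or SECURITYADMIN)
USE ROLE USERADMIN;

-- Hypothetical role name for illustration
CREATE ROLE REPORTING_ANALYST
  COMMENT = 'Read-only reporting users';

-- Switch back to SYSADMIN for further object creation work
USE ROLE SYSADMIN;
```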

Role Hierarchy

Figure 4-1 illustrates the role hierarchy in Snowflake. Snowflake follows a role-based access control (RBAC) framework, in which privileges on objects are granted to roles, and roles in turn are granted to users. The objective of the RBAC approach is to simplify the maintenance of access control.

Note

Roles can also be assigned to other roles, resulting in a role hierarchy. This creates a level of flexibility, but also introduces the risk of additional complexity if your approach isn’t well thought out.

Figure 4-1

Standard role hierarchy

RBAC is a widely used concept and, if you have had experience using it, you know it can get complicated very quickly. A poor design also leads to a lot of unnecessary effort when maintaining a hierarchy.

Inheritance

Not only can privileges on objects be assigned directly to roles, but they can also be inherited from other roles. To illustrate this, let’s use an example business requirement that needs to be in place to support a new project.

In this scenario, the HR_SCHEMA_READ_ONLY role has read-only access to HR data. The Marketing Insight team requires read-only access to this data for the duration of the project. This can be achieved using the GRANT ROLE...TO command, as follows:
GRANT ROLE HR_SCHEMA_READ_ONLY TO ROLE MARKETING_INSIGHT;

This single command grants the Marketing Insight team read-only privileges on the HR schema and data contained within it.

At the end of the project, when you need to remove the privileges, you can simply run the REVOKE ROLE...FROM command:
REVOKE ROLE HR_SCHEMA_READ_ONLY FROM ROLE MARKETING_INSIGHT;
In this instance, you’re using HR_SCHEMA_READ_ONLY as a parent role to handle the privileges assigned directly to objects, while the MARKETING_INSIGHT role contains the users relevant to that business department. This separation, illustrated in Figure 4-2, is a key principle that you’ll continue to explore in this chapter.
Figure 4-2

Role inheritance
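To make the example concrete, here is a hedged sketch of how HR_SCHEMA_READ_ONLY might have been given its read-only privileges in the first place. The database and schema names (HR_DB, HR) are assumptions for illustration:

```sql
USE ROLE SECURITYADMIN;

-- Object privileges live on the parent role (illustrative object names)
GRANT USAGE ON DATABASE HR_DB TO ROLE HR_SCHEMA_READ_ONLY;
GRANT USAGE ON SCHEMA HR_DB.HR TO ROLE HR_SCHEMA_READ_ONLY;
GRANT SELECT ON ALL TABLES IN SCHEMA HR_DB.HR TO ROLE HR_SCHEMA_READ_ONLY;

-- The departmental role then inherits everything in a single grant
GRANT ROLE HR_SCHEMA_READ_ONLY TO ROLE MARKETING_INSIGHT;
```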

I am sure you’re starting to see that this provides not only a large amount of flexibility, but that it’s easy to create layers upon layers of roles resulting in something resembling a spider web! Over time, this can grow to be a highly complex beast. Unpicking it would require a lot of time and effort and is often avoided by IT teams due to the inherent risks involved.

Handled with a deliberate approach like the one I present in this chapter, however, this flexibility provides an opportunity. Instead of creating each custom role from scratch, you can create a layer of abstraction where one layer of roles inherits the privileges of another layer.

Objects

When a user creates an object in Snowflake, they effectively own that object and can grant access to it. This is known as discretionary access control (DAC). RBAC then comes into play as the owner grants privileges on the objects to specific roles.

Figure 4-3 shows what a simplified role hierarchy might look like in Snowflake. Here, you have the overall Snowflake account at the top with a number of child roles.
Figure 4-3

Object hierarchy

Extending the Role Hierarchy

One important principle to follow here is to separate the roles that are used to grant access to objects from those roles that are used to contain users. Adhering to this principle provides greater flexibility, leverages inheritance, and streamlines the overall implementation effort.

Logically we can break the roles down into specific levels, as shown in Table 4-2.
Table 4-2

Logically Separating Out Roles

Role Level

Role Type

Purpose

Level 0

System Roles

The predefined system roles that all custom roles should roll up to.

Level 1

Domain Roles

Used when you require groups of Level 2 and 3 roles separated for different environments, such as Dev, Test, and Production.

Level 2

Functional Roles

Used to contain users. This may be mapped to roles within an identity provider (IdP) which contains groups of users.

Level 3

Access Roles

Used to assign privileges on the actual objects within Snowflake.

Beginning with this framework in mind (as Figure 4-4 illustrates) makes it easier to capture and organize requirements within your own environment.
Figure 4-4

An example role hierarchy mapped to logic levels

I prefer this simple, clean approach to adopting the RBAC framework Snowflake provides. It’s important to note that although inheritance has been leveraged, it has been carried out in a controlled manner. Object access is controlled in one layer and user access is controlled in another. This promotes reuse of roles while easing and simplifying ongoing operational management.
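As a sketch of how these levels fit together in SQL (all object, role, and user names here are illustrative, not prescribed):

```sql
USE ROLE SECURITYADMIN;

-- Level 3 access role: holds privileges on objects
CREATE ROLE SALES_DB_READ;
GRANT USAGE ON DATABASE SALES_DB TO ROLE SALES_DB_READ;
GRANT USAGE ON SCHEMA SALES_DB.REPORTING TO ROLE SALES_DB_READ;
GRANT SELECT ON ALL TABLES IN SCHEMA SALES_DB.REPORTING TO ROLE SALES_DB_READ;

-- Level 2 functional role: contains users and inherits the access role
CREATE ROLE SALES_ANALYST;
GRANT ROLE SALES_DB_READ TO ROLE SALES_ANALYST;
GRANT ROLE SALES_ANALYST TO USER JSMITH;

-- Roll the custom hierarchy up to the Level 0 system roles
GRANT ROLE SALES_ANALYST TO ROLE SYSADMIN;
```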

User and Application Authentication

Snowflake provides a range of capabilities to support authentication for both users and applications.

In this section, you’ll take a closer look at the two primary use cases you’ll come across:
  • Interactive authentication for individual users

  • Authentication for applications using system accounts

Before I get into the specifics for these two methods, I need to provide you with a primer in multi-factor authentication, Security Assertion Markup Language (SAML), and OAuth.

Multi-Factor Authentication

Multi-factor authentication provides an additional layer of security by requiring a user to provide a secondary form of authentication over and above the standard Snowflake login. This secondary factor is typically provided via a mobile device.

MFA is being rapidly rolled out across a variety of applications that store sensitive data, such as banking apps. It aims to secure access for any user and their device, regardless of location.

MFA is a built-in feature for all Snowflake editions. It’s supported by an external company called Duo (owned by Cisco) although the actual MFA process is managed by Snowflake. As a user, you are required to download and install the Duo Mobile application to use this feature.

Note

Due to the level of access it has over your account, Snowflake recommends always setting MFA for the ACCOUNTADMIN role.

In line with Snowflake’s principle of keeping administrative tasks to a minimum, you cannot enforce MFA for all your Snowflake users. Instead, users are required to self-enroll by changing their user preferences, as Figure 4-5 shows.
Figure 4-5

MFA in the Snowflake UI

As part of the enrollment process you’ll be asked to select the device you are adding, as Figure 4-6 shows. Amazingly, there is support for a landline. However, a mobile phone is recommended and is of course the most common option.
Figure 4-6

MFA device options

To complete the enrollment, you’ll be presented with a QR code to scan with your mobile phone or tablet from within the Duo Mobile app (Figure 4-7). This step registers your device to your account.
Figure 4-7

QR code to link your device with your MFA registration

Once enrolled, when you log into Snowflake, you’ll be presented with the screen shown in Figure 4-8, which prompts you to select how you wish to authenticate using your secondary device. Figure 4-9 brings this all together by describing the overall MFA login flow.
Figure 4-8

MFA authentication prompt screen

Figure 4-9

MFA login flow

Although you cannot enroll users in MFA, as an account administrator you can provide support if they don’t have access to their mobile device, by either disabling MFA or granting a period during which they can bypass MFA.

Bypassing MFA for a user for 60 minutes:
ALTER USER MY_USER_NAME
SET MINS_TO_BYPASS_MFA = 60;
Disabling MFA for a user (requires the user to re-enroll):
ALTER USER MY_USER_NAME
SET DISABLE_MFA = TRUE;
To verify that MFA is disabled for a given user, check the value for the EXT_AUTHN_DUO property:
DESCRIBE USER MY_USER_NAME;

MFA Caching

To reduce the number of times users receive the MFA prompt, you can enable MFA caching. This works by storing a version of the token on the client machine, which is valid for up to four hours. This parameter is available at the account level only, so it is enabled globally for all users enrolled in MFA.
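Enabling it is a single account-level parameter change; a sketch, assuming you hold the ACCOUNTADMIN role:

```sql
USE ROLE ACCOUNTADMIN;

-- Allow clients to cache the MFA token for up to four hours
ALTER ACCOUNT SET ALLOW_CLIENT_MFA_CACHING = TRUE;

-- Revert to the default behavior if compliance requires it
ALTER ACCOUNT UNSET ALLOW_CLIENT_MFA_CACHING;
```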

Note

Before enabling MFA caching, consult with your own IT Security and Compliance teams to check policies and standards.

Security Assertion Markup Language

In this section on SAML, I’ll give you enough of a primer so you can make sense of it and relate it to the authentication methods I’ll discuss later in the chapter. Even though I won’t be taking a deep dive into everything SAML-related, be prepared for a lot of acronyms; unfortunately, they’re unavoidable!

SAML is an open standard that identity providers can use to provide a single sign-on experience. In practice, this means a user provides one set of credentials up front to gain access to several applications. Fewer usernames and passwords to remember must be a good thing, right? Popular examples of IdPs are Okta and Microsoft Active Directory Federation Services (ADFS). In fact, these two particular IdPs provide native support for federated authentication using SSO. Other vendors may require a level of customization, which is outside the scope of our discussion in this book.

Note

SAML 2.0 was introduced in 2005 and is the de facto standard for maintaining security within a federated authentication environment.

In the background, SAML uses XML (Extensible Markup Language) to handle the sharing of credentials between the IdP and a service provider (SP) which, in your case, is Snowflake. (I did warn you about the number of acronyms!)

OAuth

OAuth is an open standard for delegated authorization. I mentioned that SAML 2.0 was introduced in 2005; the world has obviously moved on somewhat since then. SAML was (and still is) primarily focused on enterprise-scale security, whereas OAuth is more suited to the mobile experience. OAuth uses JSON behind the scenes rather than XML; this is another clue to the point in time when each was introduced!

Note

OAuth 2.0 is the latest version and what Snowflake supports. Whenever I refer to OAuth, I am referring to version 2.0 in all cases.

OAuth bridges some shortcomings of SAML in this space. You’re probably more familiar with OAuth than you think. You can tell Facebook that it’s OK for Twitter to post to your feed without having to provide Twitter with your Facebook password, for example. This limits the number of companies that need to store your password, reducing risk for you as the user. This is OAuth.

Note

There’s a risk of getting into lots of detail with all the different ways this process can work across a variety of scenarios. As we’re focusing on Snowflake, I am not going to subject you to all the information, just the most relevant areas.

There are four main players in an OAuth transaction:
  • Resource owner: The person who owns the data and can grant access to it

  • Resource server : The API that stores the requested data

  • Client application : The application requesting access to the data

  • Authorization server : Runs the OAuth engine, verifies the client, and generates the token

If you don’t provide your password, how can applications verify you? Through a token: at the authorization stage, the authorization server verifies the user and returns a token permitting access to the data.

You need to use OAuth when making a connection from another client application, such as Tableau, and you want an SSO-style experience. Snowflake provides OAuth support for several partner applications. In these cases, Snowflake acts as both the resource server and the authorization server.

Key Pair Authentication

Another alternative to password authentication is key pair authentication. This reduces the risk of someone discovering your password and gaining access to your account.

It works on the basis of private and public keys. A pair of these keys are generated. Anyone can know the public key, as on its own it is useless. The private key should be secured, however, as this key generates signatures. These signatures can only be generated by the person who has the private key, while the public key can verify the signature as genuine.

Once you’ve generated a key pair, the public key is registered with Snowflake, while the private key should be stored in a secrets management platform such as AWS Secrets Manager or Azure Key Vault.
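As a sketch, the keys are generated outside Snowflake (for example with OpenSSL) and the public key is then attached to a user. The user name SVC_ETL_LOADER and the truncated key value are placeholders:

```sql
-- Generate the pair outside Snowflake, e.g.:
--   openssl genrsa 2048 | openssl pkcs8 -topk8 -inform PEM -out rsa_key.p8 -nocrypt
--   openssl rsa -in rsa_key.p8 -pubout -out rsa_key.pub
-- Then register the public key (PEM body only, header/footer lines removed):
ALTER USER SVC_ETL_LOADER SET RSA_PUBLIC_KEY = 'MIIBIjANBgkqh...';

-- Confirm the registration by inspecting the user's properties
DESCRIBE USER SVC_ETL_LOADER;
```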

With that grounding out of the way, let’s look at the five authentication options for Snowflake and when you would use them:
  1. Username and password through the web UI
  2. Username and password with the addition of MFA
  3. SSO using SAML
  4. OAuth process allowing access to Snowflake data
  5. Key pair authentication
Options 1 and 2: Snowflake advises against relying on a username and password alone (option 1) because it is the least secure mechanism available. You should instead introduce MFA and follow option 2 as the preferred method of authenticating.

Option 3: This is used for human access in a federated authentication environment, in situations when your organization uses an IdP to manage access control centrally and provides an SSO experience for employees. In this scenario, when a user attempts to log into Snowflake, it requests authorization from the IdP for the user trying to log in. The IdP checks the privileges for that user and sends a response back to Snowflake, either allowing or denying access.

Option 4: OAuth is used when a user wants to access data through a client such as Tableau.

Option 5: Designed for programmatic or service account access. A key pair authentication approach combined with a secrets manager is the way to go here. The client signs each authentication request with its private key, and Snowflake uses the registered public key to verify the signature and authenticate the client.

Storage Integration

In Chapter 2, you enabled public access to your S3 bucket so you could see how data ingestion worked from an external stage. In the real world, you need to lock down access to any external stage without exposing the security credentials to users in the process.

The way to achieve this is by creating something called a storage integration. This object generates an entity that links to an IAM entity, which is used to access your external cloud storage location.

There are a few major benefits to this. Firstly, you don’t need to share your credentials with the users who need to access your external stage to load or unload data. Secondly, you can create a single storage integration object that points to more than one external stage. Finally, you can grant access just to this single object within your cloud provider’s account, simplifying the administration.

For more details on how to create a storage integration object, you can visit the official Snowflake documentation page here: https://docs.snowflake.com/en/sql-reference/sql/create-storage-integration.html.
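As a hedged sketch of what this looks like for S3 (the role ARN and bucket path are placeholders, and creating the integration typically requires the ACCOUNTADMIN role):

```sql
CREATE STORAGE INTEGRATION S3_INT
  TYPE = EXTERNAL_STAGE
  STORAGE_PROVIDER = 'S3'
  ENABLED = TRUE
  STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::123456789012:role/snowflake-access'
  STORAGE_ALLOWED_LOCATIONS = ('s3://my-bucket/landing/');

-- DESC reveals the IAM user and external ID Snowflake will use,
-- which you add to the trust policy of the IAM role in AWS
DESC STORAGE INTEGRATION S3_INT;

-- Stages then reference the integration instead of embedding credentials
CREATE STAGE LANDING_STAGE
  URL = 's3://my-bucket/landing/'
  STORAGE_INTEGRATION = S3_INT;
```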

Network Policies

The migration of data workloads to the cloud brings new considerations when deciding how to secure access to the data from a networking perspective. Previously, traditional data platforms were protected by virtue of being buried deep within layers of private network perimeters.

Within Snowflake, there are three primary ways to apply network security. I’ll discuss each of these options in detail next.
  1. Native out-of-the-box network security
  2. Network policies
  3. Cloud service provider capabilities
Note

It is worth pointing out that option 1 is mandatory and always in play. You cannot turn the native features off. Options 2 and 3 can be used independently or together depending on your own security requirements.

Option 1: Native Network Security

For the vast majority of customers, the out-of-the-box network security Snowflake provides should meet the requirements. Therefore, this is the most common configuration you should expect to see.

Snowflake uses Transport Layer Security (TLS 1.2) encryption for all customer communication. It also uses something called OCSP (Online Certificate Status Protocol), which checks the validity and integrity of the certificates used for establishing the TLS tunnels.

Option 2: Network Policies

By default, Snowflake allows users or applications from any IP address to access the service. To provide an additional layer of security and allow or deny specific IP addresses access to Snowflake, you can use network policies.

These network policies can be applied at the account level, which applies them globally, or at a user level. As with many features of Snowflake, you can apply these policies using SQL as the following syntax shows:
CREATE [ OR REPLACE ] NETWORK POLICY <name>
   ALLOWED_IP_LIST = ( '<ip_address>' [ , '<ip_address>' , ... ] )
   [ BLOCKED_IP_LIST = ( '<ip_address>' [ , '<ip_address>' , ... ] ) ]
   [ COMMENT = '<string_literal>' ]
Once a policy has been created, you need to activate it using the ALTER ACCOUNT command in tandem with the NETWORK_POLICY parameter.
ALTER ACCOUNT SET NETWORK_POLICY = <policy name>
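A concrete sketch, with illustrative CIDR ranges and user name:

```sql
-- Allow the corporate range but block one address within it
CREATE NETWORK POLICY CORP_NETWORK_ONLY
  ALLOWED_IP_LIST = ('192.168.1.0/24')
  BLOCKED_IP_LIST = ('192.168.1.99')
  COMMENT = 'Restrict access to the corporate network range';

-- Activate globally at the account level...
ALTER ACCOUNT SET NETWORK_POLICY = CORP_NETWORK_ONLY;

-- ...or for a single user, which takes precedence over the account-level policy
ALTER USER SVC_ETL_LOADER SET NETWORK_POLICY = CORP_NETWORK_ONLY;
```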

Option 3: Cloud Service Provider Capabilities

This approach involves an additional layer of security by leveraging the cloud service provider’s (CSP) private networking features. Here you can use AWS PrivateLink or Azure PrivateLink, which establish a private, point-to-point connection for all client-side initiated connections.

Additionally, you can layer the network policies and the CSP capabilities options on top of each other. This offers a comprehensive approach to network security management when using Snowflake.

Handling PII Data

The European Union’s General Data Protection Regulation (GDPR) was adopted in 2016 and came into force in 2018. During this time, I was working as Head of Data for a FTSE 100 insurer.

This quickly became a hot topic within the organization, as we stored a lot of data relating to our customers. Suddenly we were in a position where we had to interpret what this legislation meant, along with the impact it had on our business processes and technical architecture.

Of course, we were not alone in this process. It impacted a huge range of sectors. We’re now in the position where a lot of the legislation is clearer and better understood. This understanding has led to approaches and architectural designs that are GDPR compliant. It’s less a function of the data warehouse and more a function of the design you adopt. Therefore, spending some thinking time up front can save a lot of headaches in the future.

One of the major issues we found when working to bring our data practices in line with GDPR was the right to erasure (more commonly known as the right to be forgotten) in GDPR Article 17. Once an individual requests an organization to delete their PII from its database, the organization has a short period of time (between 30 and 90 days) to remove this personal data from databases, copies, and backups.

This raises some questions:
  1. How do you ensure all PII data is removed from the database?
  2. How do you cleanly remove this PII data without breaking the integrity of your data model?
  3. How do you make this process as efficient as possible for users to administer?
Separately Storing PII Data

The first recommendation is to physically separate PII data from non-PII data; for example, the PII might live in one table or a group of tables in its own schema. This allows you to pinpoint all the PII data you hold in your data warehouse and wrap appropriate controls around it.

Imagine the issues you could run into if you needed to remove values from certain columns of PII within an existing table. Deleting the entire row of data should be avoided, as this may remove non-PII data that is valuable for reporting or analytical needs. By cleanly separating the PII data away, you reduce the risk of breaking your data model or removing data unnecessarily in the future.

You will need to consider how your users will access this data, especially if a group of users requires access to both PII and non-PII data. Therefore, you must have a way of joining both sensitive and non-sensitive data together again. One way to surface this data to your users is to use secure views, which check what role the user is in before deciding what data to return.

This solution works elegantly with the RBAC hierarchy, as you can create and grant (or deny) privileges on roles to be able to access the data in the PII schema and tables.

Removing Data in Bulk

As part of the separate PII schema described in the section above, I like to add a couple of columns that allow records to be flagged for removal, along with the date your organization received that request.

This allows you to batch up the removal of records and execute the process periodically within the time frames set out by the GDPR legislation. It’s far easier to manage and cater for an operational process that runs periodically rather than each and every time you get a request from an individual to remove their data.
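A sketch of what this flag-and-batch pattern might look like, reusing the CUSTOMER_PII.CUSTOMER table from later in this chapter (the column names and the seven-day batch window are illustrative):

```sql
-- Add flag and request-date columns to the PII table
ALTER TABLE CUSTOMER_PII.CUSTOMER ADD COLUMN FLAGGED_FOR_ERASURE BOOLEAN DEFAULT FALSE;
ALTER TABLE CUSTOMER_PII.CUSTOMER ADD COLUMN ERASURE_REQUEST_DATE DATE;

-- When a request arrives, flag the record rather than deleting immediately
UPDATE CUSTOMER_PII.CUSTOMER
SET FLAGGED_FOR_ERASURE = TRUE,
    ERASURE_REQUEST_DATE = CURRENT_DATE()
WHERE CUSTOMER_ID = 1;

-- A periodic batch job then removes flagged records well inside the deadline
DELETE FROM CUSTOMER_PII.CUSTOMER
WHERE FLAGGED_FOR_ERASURE = TRUE
  AND ERASURE_REQUEST_DATE <= DATEADD(DAY, -7, CURRENT_DATE());
```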

Note

You should also consider your Time Travel settings in your PII data in Snowflake. Depending on your settings, Snowflake may have a version of this record stored beyond 30 days, which could allow this data to be mistakenly restored after it’s been removed.
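One way to guard against this is to reduce the Time Travel retention period on the PII objects; a sketch, assuming the CUSTOMER_PII.CUSTOMER table used elsewhere in this chapter:

```sql
-- Check the current retention setting for the table
SHOW PARAMETERS LIKE 'DATA_RETENTION_TIME_IN_DAYS' IN TABLE CUSTOMER_PII.CUSTOMER;

-- Shorten Time Travel so deleted PII ages out of history quickly
ALTER TABLE CUSTOMER_PII.CUSTOMER SET DATA_RETENTION_TIME_IN_DAYS = 1;
```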

Auditing

I like having visibility into my processes. It provides greater control and transparency into what is happening, while also supporting troubleshooting activities should anything go wrong. I recommend introducing a metadata table to support the monitoring of this process. Having a centralized table that contains the following as a minimum is strongly advised:
  • A unique identifier for the individual (obviously something that doesn’t identify the customer!)

  • The date the delete request was received

  • The date it was flagged for removal in the database

  • The date it was deleted

The metadata this table stores allows you to periodically audit your PII data to ensure no data has been inadvertently restored or the wrong data has been removed. In the latter case, Time Travel should be able to help rather than hinder!
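A minimal version of such a metadata table might look like this (the table and column names are illustrative):

```sql
CREATE OR REPLACE TABLE CUSTOMER_PII.ERASURE_AUDIT
(
  INDIVIDUAL_KEY        VARCHAR,  -- surrogate key; must not itself identify the customer
  REQUEST_RECEIVED_DATE DATE,     -- when the delete request arrived
  FLAGGED_DATE          DATE,     -- when the record was flagged in the database
  DELETED_DATE          DATE      -- when the record was physically removed
);
```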

Controlling Access to PII Data

At this stage in the process you’ve separated your PII data from your non-PII data by placing the data in separate schemas. Now you need to consider how to manage user access to this data.

For this, you’ll need to create a minimum of two access roles: one that can access sensitive data and one that cannot.

You can then combine data from the two schemas using a view and grant access to query the view to both roles.

Within the view definition you can make use of data obfuscation and the CURRENT_ROLE function to determine how data should be returned to the user executing the query.

Furthermore, you can make use of secure views to prevent users from viewing the DDL and inferring how they might access sensitive data. Simply adding the SECURE keyword to the view definition (CREATE OR REPLACE SECURE VIEW) achieves this. The following code illustrates how all of this can be brought together:
//ENSURE SYSADMIN IS USED
USE ROLE SYSADMIN;
//CREATE DATABASE
CREATE OR REPLACE DATABASE PII_DEMO;
//CREATE SCHEMAS
CREATE OR REPLACE SCHEMA SALES;
CREATE OR REPLACE SCHEMA CUSTOMER_PII;
CREATE OR REPLACE SCHEMA PRESENTATION;
//CREATE SALES TABLE TO STORE NON PII DATA
CREATE OR REPLACE TABLE SALES.ORDERS
(ORDER_ID INT,
 CUSTOMER_ID INT,
 ORDER_DATE TIMESTAMP_TZ);
//CREATE CUSTOMER TABLE TO STORE PII (CUSTOMER_EMAIL)
CREATE OR REPLACE TABLE CUSTOMER_PII.CUSTOMER
(CUSTOMER_ID INT,
 CUSTOMER_EMAIL VARCHAR(50));
//INSERT SOME SAMPLE RECORDS TO BOTH TABLES
INSERT INTO SALES.ORDERS
SELECT 1, 1, '2021-06-07 12:21:10.00'
UNION
SELECT 2, 1, '2021-06-10 14:21:10.00';
INSERT INTO CUSTOMER_PII.CUSTOMER
SELECT 1, '[email protected]';
//CREATE NORMAL UNSECURED VIEW
CREATE OR REPLACE VIEW PRESENTATION.CUSTOMER_LAST_ORDER_DATE
AS
SELECT C.CUSTOMER_ID,
        C.CUSTOMER_EMAIL,
        MAX(O.ORDER_DATE) AS MOST_RECENT_ORDER_DATE
FROM CUSTOMER_PII.CUSTOMER C
INNER JOIN SALES.ORDERS O on C.CUSTOMER_ID = O.CUSTOMER_ID
GROUP BY C.CUSTOMER_ID, C.CUSTOMER_EMAIL;
//INTRODUCE OBFUSCATION
CREATE OR REPLACE VIEW PRESENTATION.CUSTOMER_LAST_ORDER_DATE_PII
AS
SELECT C.CUSTOMER_ID,
        CASE WHEN CURRENT_ROLE() <> 'SENSITIVE_ALLOWED_ROLE' --NOTE THE USE OF CURRENT ROLE HERE
                THEN 'XXX-XX-XXXX' --THIS IS WHAT USERS OUTSIDE OF THE SENSITIVE_ALLOWED_ROLE WILL SEE RETURNED
                 ELSE CUSTOMER_EMAIL
            END AS CUSTOMER_EMAIL,
        MAX(O.ORDER_DATE) AS MOST_RECENT_ORDER_DATE
FROM CUSTOMER_PII.CUSTOMER C
INNER JOIN SALES.ORDERS O on C.CUSTOMER_ID = O.CUSTOMER_ID
GROUP BY C.CUSTOMER_ID, C.CUSTOMER_EMAIL;
//FINALLY TO PREVENT USERS BEING ABLE TO INFER THE UNDERLYING DATA STRUCTURES
//CREATE THE VIEW AS A SECURE VIEW
CREATE OR REPLACE SECURE VIEW PRESENTATION.CUSTOMER_LAST_ORDER_DATE_PII_SECURE
AS
SELECT C.CUSTOMER_ID,
        CASE WHEN CURRENT_ROLE() <> 'SENSITIVE_ALLOWED_ROLE' --NOTE THE USE OF CURRENT ROLE HERE
                THEN 'XXX-XX-XXXX' --THIS IS WHAT USERS OUTSIDE OF THE SENSITIVE_ALLOWED_ROLE WILL SEE RETURNED
                 ELSE CUSTOMER_EMAIL
            END AS CUSTOMER_EMAIL,
        MAX(O.ORDER_DATE) AS MOST_RECENT_ORDER_DATE
FROM CUSTOMER_PII.CUSTOMER C
INNER JOIN SALES.ORDERS O on C.CUSTOMER_ID = O.CUSTOMER_ID
GROUP BY C.CUSTOMER_ID, C.CUSTOMER_EMAIL;
//GRANT PERMISSIONS ON WAREHOUSE AND DATABASE TO SECURITYADMIN
GRANT USAGE ON WAREHOUSE COMPUTE_WH TO ROLE SECURITYADMIN;
GRANT USAGE ON DATABASE PII_DEMO TO ROLE SECURITYADMIN;
//SWITCH TO USE SECURITYADMIN ROLE TO CARRY OUT GRANTS AND CREATE ROLES
USE ROLE SECURITYADMIN;
//CREATE NEW ROLES
CREATE OR REPLACE ROLE SENSITIVE_ALLOWED_ROLE; --PII DATA ALLOWED
CREATE OR REPLACE ROLE SENSITIVE_DENIED_ROLE; --NO ACCESS TO PII DATA
//GRANT PERMISSIONS ON OBJECTS TO BOTH ROLES
GRANT USAGE ON WAREHOUSE COMPUTE_WH TO ROLE SENSITIVE_ALLOWED_ROLE;
GRANT USAGE ON WAREHOUSE COMPUTE_WH TO ROLE SENSITIVE_DENIED_ROLE;
GRANT USAGE ON DATABASE PII_DEMO TO ROLE SENSITIVE_ALLOWED_ROLE;
GRANT USAGE ON DATABASE PII_DEMO TO ROLE SENSITIVE_DENIED_ROLE;
GRANT USAGE ON SCHEMA CUSTOMER_PII TO ROLE SENSITIVE_ALLOWED_ROLE;
GRANT USAGE ON SCHEMA SALES TO ROLE SENSITIVE_ALLOWED_ROLE;
GRANT USAGE ON SCHEMA PRESENTATION TO ROLE SENSITIVE_ALLOWED_ROLE;
GRANT USAGE ON SCHEMA CUSTOMER_PII TO ROLE SENSITIVE_DENIED_ROLE;
GRANT USAGE ON SCHEMA SALES TO ROLE SENSITIVE_DENIED_ROLE;
GRANT USAGE ON SCHEMA PRESENTATION TO ROLE SENSITIVE_DENIED_ROLE;
GRANT SELECT ON VIEW PRESENTATION.CUSTOMER_LAST_ORDER_DATE TO ROLE SENSITIVE_ALLOWED_ROLE;
GRANT SELECT ON VIEW PRESENTATION.CUSTOMER_LAST_ORDER_DATE TO ROLE SENSITIVE_DENIED_ROLE;
GRANT SELECT ON VIEW PRESENTATION.CUSTOMER_LAST_ORDER_DATE_PII TO ROLE SENSITIVE_ALLOWED_ROLE;
GRANT SELECT ON VIEW PRESENTATION.CUSTOMER_LAST_ORDER_DATE_PII TO ROLE SENSITIVE_DENIED_ROLE;
GRANT SELECT ON VIEW PRESENTATION.CUSTOMER_LAST_ORDER_DATE_PII_SECURE TO ROLE SENSITIVE_ALLOWED_ROLE;
GRANT SELECT ON VIEW PRESENTATION.CUSTOMER_LAST_ORDER_DATE_PII_SECURE TO ROLE SENSITIVE_DENIED_ROLE;
//ADD BOTH THESE ROLES TO YOUR OWN USER TO MAKE IT EASY TO TEST OUT FOR THIS DEMO
GRANT ROLE SENSITIVE_ALLOWED_ROLE TO USER AMORTS121;
GRANT ROLE SENSITIVE_DENIED_ROLE TO USER AMORTS121;
//SWITCH CONTEXT TO SYSADMIN
USE ROLE SYSADMIN;
//OBSERVE RESULTS FROM NORMAL VIEW
SELECT * FROM PRESENTATION.CUSTOMER_LAST_ORDER_DATE;
//CHANGE THE CONTEXT OF THE ROLE
USE ROLE SENSITIVE_ALLOWED_ROLE;
SELECT CURRENT_ROLE();
//OBSERVE THE FACT ALL VALUES ARE RETURNED INCLUDING CUSTOMER EMAIL
SELECT * FROM PRESENTATION.CUSTOMER_LAST_ORDER_DATE_PII;
//SWITCH THE CONTEXT OF THE ROLE TO USERS WHO CANNOT VIEW PII
USE ROLE SENSITIVE_DENIED_ROLE;
SELECT CURRENT_ROLE();
//SELECT FROM THE VIEW AGAIN AND NOTE THE VALUE OF THE CUSTOMER EMAIL IS NOW MASKED
SELECT * FROM PRESENTATION.CUSTOMER_LAST_ORDER_DATE_PII;
//VIEW THE DDL FOR THE VIEW
SELECT GET_DDL('VIEW', 'CUSTOMER_LAST_ORDER_DATE_PII_SECURE', TRUE);
//CHANGE THE CONTEXT OF THE ROLE
USE ROLE SENSITIVE_ALLOWED_ROLE;
//OBSERVE THE FACT ALL VALUES ARE RETURNED INCLUDING CUSTOMER EMAIL
SELECT * FROM PRESENTATION.CUSTOMER_LAST_ORDER_DATE_PII_SECURE;
//ADDITIONALLY TRY AND VIEW THE DDL OF THE SECURE VIEW
SELECT GET_DDL('VIEW', 'CUSTOMER_LAST_ORDER_DATE_PII_SECURE', TRUE);
//SWITCH THE CONTEXT OF THE ROLE TO USERS WHO CANNOT VIEW PII
USE ROLE SENSITIVE_DENIED_ROLE;
//SELECT FROM THE VIEW AGAIN AND NOTE THE VALUE OF THE CUSTOMER EMAIL
SELECT * FROM PRESENTATION.CUSTOMER_LAST_ORDER_DATE_PII_SECURE;
//ADDITIONALLY TRY AND VIEW THE DDL OF THE SECURE VIEW
SELECT GET_DDL('VIEW', 'CUSTOMER_LAST_ORDER_DATE_PII_SECURE', TRUE);

It’s possible to improve on this approach further using Dynamic Data Masking. Available in Enterprise Edition and above, this feature allows you to create a masking policy as a schema-level object in the database. This helps you control who can create (and modify) masking policies while centralizing the rules in one place. The policy is applied dynamically to any query that references a column with a policy attached, so it’s very powerful. This approach promotes reuse and makes it easier to adjust the logic for a PII field in the future.

You can then attach the masking policy to individual table or view columns. Building on the code example just covered, a masking policy would look like this:
//SWITCH ROLE
USE ROLE SYSADMIN;
//CREATE MASKING POLICY
CREATE OR REPLACE MASKING POLICY EMAIL_MASK AS (VAL STRING) RETURNS STRING ->
  CASE
    WHEN CURRENT_ROLE() <> 'SENSITIVE_ALLOWED_ROLE' THEN 'XXX-XX-XXXX'
    ELSE VAL
  END;
//ATTACH THE MASKING POLICY TO THE CUSTOMER_EMAIL COLUMN ON THE NORMAL VIEW WE CREATED EARLIER
ALTER VIEW PRESENTATION.CUSTOMER_LAST_ORDER_DATE MODIFY COLUMN CUSTOMER_EMAIL SET MASKING POLICY EMAIL_MASK;
//CHANGE THE CONTEXT OF THE ROLE AND OBSERVE RESULTS
USE ROLE SENSITIVE_ALLOWED_ROLE;
SELECT * FROM PRESENTATION.CUSTOMER_LAST_ORDER_DATE;
USE ROLE SENSITIVE_DENIED_ROLE;
SELECT * FROM PRESENTATION.CUSTOMER_LAST_ORDER_DATE;
//NOTE THE DDL IS UNCHANGED AS THE MASKING IS APPLIED AT EXECUTION TIME
SELECT GET_DDL('VIEW', 'PRESENTATION.CUSTOMER_LAST_ORDER_DATE', TRUE);
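Note that a masking policy cannot be replaced or dropped while it is still attached to a column. If you later need to change the policy, detach it first; a minimal sketch using the view from this example:

//DETACH THE MASKING POLICY FROM THE COLUMN BEFORE REPLACING OR DROPPING IT
USE ROLE SYSADMIN;
ALTER VIEW PRESENTATION.CUSTOMER_LAST_ORDER_DATE MODIFY COLUMN CUSTOMER_EMAIL UNSET MASKING POLICY;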

Row Access Policies

Row access policies allow you to secure data at the individual row level based on the role executing the query. They are an excellent addition to Snowflake, centralizing business logic and simplifying the management of fine-grained access control.

You only need to perform three simple steps to use row access policies to accomplish row-level security:
  1. Define a policy and, optionally, a mapping table.
  2. Apply the policy to one or more tables.
  3. Query the data.

Example Scenario

Imagine you have a sales table containing sales for all territories. You are working on a requirement to restrict data access so that sales managers responsible for a territory can see data only for their own territory. In this instance, you have a SALES_MANAGER role who needs to see all data, a SALES_EMEA role who can see only EMEA data, and a SALES_APAC role who can see only APAC data. This is an ideal use case for a row access policy.

Steps

Step 1: Create the required database objects to store the data.
//ENSURE SYSADMIN IS USED
USE ROLE SYSADMIN;
//CREATE DATABASE
CREATE OR REPLACE DATABASE ROW_ACCESS;
USE DATABASE ROW_ACCESS;
//CREATE TABLE
CREATE OR REPLACE TABLE SALES
(ORDERNUMBER INTEGER,
QUANTITYORDERED INTEGER,
PRICEEACH INTEGER,
ORDERLINENUMBER INTEGER,
SALES INTEGER,
STATUS VARCHAR(100),
QTR_ID INTEGER,
MONTH_ID INTEGER,
YEAR_ID INTEGER,
PRODUCTLINE VARCHAR(100),
MSRP INTEGER,
PRODUCTCODE VARCHAR(100),
CUSTOMERNAME VARCHAR(100),
ADDRESSLINE1 VARCHAR(100),
ADDRESSLINE2 VARCHAR(100),
CITY VARCHAR(100),
STATE VARCHAR(100),
POSTALCODE VARCHAR(100),
COUNTRY VARCHAR(100),
TERRITORY VARCHAR(100),
CONTACTLASTNAME VARCHAR(100),
CONTACTFIRSTNAME VARCHAR(100),
DEALSIZE VARCHAR(100));
Step 2: Create the FF_CSV file format required to load the sample sales data.
CREATE OR REPLACE FILE FORMAT "ROW_ACCESS"."PUBLIC".FF_CSV TYPE = 'CSV' COMPRESSION = 'AUTO' FIELD_DELIMITER = ',' RECORD_DELIMITER = '\n' SKIP_HEADER = 1;

Step 3: Load the data from the sales_data_sample.csv file (provided with the book) using the web UI into the Sales table you created in step 1. Ensure you use the FF_CSV file format you created in the previous step.

Step 4: Create the row access policy. Here you return a Boolean value based on the current role and the value of the Territory column. Note that you haven’t attached this row access policy to a table at this stage.
//CREATE ROW ACCESS POLICY
CREATE OR REPLACE ROW ACCESS POLICY SALES_TERRITORY
    AS (TERRITORY STRING) RETURNS BOOLEAN ->
    CASE
        WHEN 'SALES_MANAGER' = CURRENT_ROLE() THEN TRUE
        WHEN 'SALES_EMEA' = CURRENT_ROLE() AND TERRITORY = 'EMEA' THEN TRUE
        WHEN 'SALES_APAC' = CURRENT_ROLE() AND TERRITORY = 'APAC' THEN TRUE
        ELSE FALSE
    END;
Step 5: Now that you’ve created the row access policy, you can attach it to your Territory column in your Sales table.
//APPLY THE ROW ACCESS POLICY TO THE TABLE
ALTER TABLE SALES
ADD ROW ACCESS POLICY SALES_TERRITORY
    ON (TERRITORY);
Step 6: Create the required roles and grant privileges to allow you to conduct some testing.
//CREATE THE ROLES USING SECURITYADMIN
USE ROLE SECURITYADMIN;
CREATE OR REPLACE ROLE SALES_MANAGER;
CREATE OR REPLACE ROLE SALES_EMEA;
CREATE OR REPLACE ROLE SALES_APAC;
//GRANT PERMISSIONS ON OBJECTS TO ALL ROLES
GRANT USAGE ON WAREHOUSE COMPUTE_WH TO SALES_MANAGER;
GRANT USAGE ON WAREHOUSE COMPUTE_WH TO SALES_EMEA;
GRANT USAGE ON WAREHOUSE COMPUTE_WH TO SALES_APAC;
GRANT USAGE ON DATABASE ROW_ACCESS TO SALES_MANAGER;
GRANT USAGE ON DATABASE ROW_ACCESS TO SALES_EMEA;
GRANT USAGE ON DATABASE ROW_ACCESS TO SALES_APAC;
USE ROLE SYSADMIN;
GRANT USAGE ON SCHEMA PUBLIC TO SALES_MANAGER;
GRANT USAGE ON SCHEMA PUBLIC TO SALES_EMEA;
GRANT USAGE ON SCHEMA PUBLIC TO SALES_APAC;
GRANT SELECT ON TABLE SALES TO SALES_MANAGER;
GRANT SELECT ON TABLE SALES TO SALES_EMEA;
GRANT SELECT ON TABLE SALES TO SALES_APAC;
//ADD THESE ROLES TO YOUR OWN USER TO MAKE IT EASY TO TEST OUT FOR THIS DEMO
//REPLACE MYUSERNAME WITH YOUR OWN USER NAME
USE ROLE SECURITYADMIN;
GRANT ROLE SALES_MANAGER TO USER MYUSERNAME;
GRANT ROLE SALES_EMEA TO USER MYUSERNAME;
GRANT ROLE SALES_APAC TO USER MYUSERNAME;
Step 7: You can now switch between the SALES_ roles and observe the different results. By running a simple count(*) against the table, you can see that the SALES_MANAGER role returns all records, whereas the SALES_EMEA and SALES_APAC roles return only records related to their respective territories.
//TEST OUT THE DIFFERENT ROLES AND OBSERVE THE RESULTS
USE ROLE SALES_MANAGER;
SELECT TERRITORY, COUNT(*)
FROM SALES
GROUP BY TERRITORY;
USE ROLE SALES_EMEA;
SELECT TERRITORY, COUNT(*)
FROM SALES
GROUP BY TERRITORY;
USE ROLE SALES_APAC;
SELECT TERRITORY, COUNT(*)
FROM SALES
GROUP BY TERRITORY;

After running through this example, you can hopefully see how powerful this could be within your own environment. I really like the simplicity and the ease of use of this feature. You can extend this approach by adding a simple mapping table between the role and allowed values. In this case, you would recreate the row access policy and look up the allowed values in the mapping table.
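As a sketch of that extension (the TERRITORY_MAPPING table name and its contents are hypothetical; note that the policy must be detached from the Sales table before it can be replaced):

//CREATE A MAPPING TABLE BETWEEN ROLES AND THE TERRITORIES THEY MAY SEE
USE ROLE SYSADMIN;
CREATE OR REPLACE TABLE TERRITORY_MAPPING
(ROLE_NAME VARCHAR(100),
TERRITORY VARCHAR(100));
INSERT INTO TERRITORY_MAPPING VALUES ('SALES_EMEA', 'EMEA'), ('SALES_APAC', 'APAC');
//DETACH THE EXISTING POLICY SO IT CAN BE REPLACED
ALTER TABLE SALES DROP ROW ACCESS POLICY SALES_TERRITORY;
//RECREATE THE POLICY TO LOOK UP ALLOWED VALUES IN THE MAPPING TABLE
CREATE OR REPLACE ROW ACCESS POLICY SALES_TERRITORY
    AS (TERRITORY STRING) RETURNS BOOLEAN ->
    'SALES_MANAGER' = CURRENT_ROLE()
    OR EXISTS (SELECT 1 FROM TERRITORY_MAPPING M
               WHERE M.ROLE_NAME = CURRENT_ROLE()
               AND M.TERRITORY = TERRITORY);
//REATTACH THE POLICY TO THE TABLE
ALTER TABLE SALES ADD ROW ACCESS POLICY SALES_TERRITORY ON (TERRITORY);

With this in place, granting a new role access to a territory becomes a simple insert into the mapping table rather than a change to the policy logic itself.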

Advanced Snowflake Security Features

Future Grants

Future grants allow you to put in place rules that tell Snowflake how to manage privileges for certain types of objects before they are created. The aim of this option is to promote consistency across who can access what, as well as to simplify this process so you don’t have to apply privileges against each new object as you create them.

Applying a future grant doesn’t prevent you from adding additional finer-grain access controls around objects. Think of it as providing a general, standardized layer of controls across common objects.

The following example shows how you can set up a future grant in the Sales database for all new schemas to the Marketing role:
GRANT USAGE ON FUTURE SCHEMAS IN DATABASE SALES TO ROLE MARKETING;
This example creates a future grant for all tables in the Books schema to the Marketing role:
GRANT SELECT ON FUTURE TABLES IN SCHEMA SALES.BOOKS TO ROLE MARKETING;

Furthermore, you can grant all privileges on future objects in a schema to a role. This means you can be safe in the knowledge that users granted that role will always have access to objects created in that schema in the future. It also means you can create or replace objects within the schema without needing the COPY GRANTS clause (which I cover in Chapter 8), since all of the grants will already be in place.
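A sketch of that pattern, reusing the schema and role from the previous example:

//GRANT ALL PRIVILEGES ON ALL FUTURE TABLES IN THE SCHEMA TO THE ROLE
GRANT ALL ON FUTURE TABLES IN SCHEMA SALES.BOOKS TO ROLE MARKETING;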

Managed Access Schemas

By default, when users create objects in a schema, they become the object owner for that object. This means they can grant access on this object to other roles or grant ownership privileges to other roles.

In certain cases, you may want to centralize the management of privileges with certain roles, such as the schema owner or any role with the MANAGE GRANTS privilege, while preventing other users from deciding who is granted access. Managed access schemas allow you to control this behavior. To create one, you include the WITH MANAGED ACCESS keywords when executing the CREATE SCHEMA statement.

The key takeaway here is that by applying managed access schemas, the ownership of objects is moved away from the object owner to the schema owner (or anyone with the MANAGE GRANTS privilege).
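As a minimal sketch (the FINANCE schema name is hypothetical):

//CREATE A NEW MANAGED ACCESS SCHEMA
CREATE SCHEMA SALES.FINANCE WITH MANAGED ACCESS;
//AN EXISTING SCHEMA CAN ALSO BE CONVERTED TO MANAGED ACCESS
ALTER SCHEMA SALES.BOOKS ENABLE MANAGED ACCESS;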

Summary

Stricter regulations and increasing scrutiny have forced organizations into taking full responsibility for the data they hold. As a result, database architecture and data management practices need to be well thought out, comprehensive, robust, and transparent to the consumer and the regulator.

Some may argue that this new focus on treating data as an asset has been a long time coming. Evidence of this was the scramble to patch up holes in numerous data solutions in time for legislation such as GDPR. Ultimately this makes companies better at safeguarding data. The companies that do this best win their customers’ trust and avoid reputational damage.

You covered a lot of ground in this chapter: the standard roles Snowflake provides and how to extend them elegantly with custom roles, authentication for both users and applications, and how to secure and manage PII data in your data warehouse.

The aim of this chapter was to provide you with in-depth knowledge of the security features Snowflake offers and to arm you with best practice approaches to allow you to design the best possible foundation from the outset.

In this chapter, you looked at how to protect and control access to your data in Snowflake. The following chapter builds nicely upon this one. In it you’ll look at how you can protect the data itself.
