Security and Data Sensitivity in Data Sharing | Yext Hitchhikers Platform

Overview

This document explains how Yext Data Sharing handles security and sensitive data.

First, as a reminder, Yext uses Snowflake’s Data Sharing feature, which implements several measures to ensure a high level of security. Here are some of the highlights:

  1. Read-Only Access: Database objects shared between accounts are read-only. Consumers cannot modify or delete the shared objects, and cannot add or modify table data.
  2. No Data Movement: With Secure Data Sharing, no actual data is copied or transferred between accounts. All sharing uses Snowflake’s services layer and metadata store. This approach eliminates synchronization issues because the data remains at rest and encrypted at its original location. It also removes the security risk of copying or moving data from one place to another.
  3. Role-Based Access Control: On the consumer side, a read-only database is created from the share. Access to this database is configurable using the same standard role-based access control that Snowflake provides for all objects in the system.
  4. Industry Compliance: Snowflake’s processes and procedures meet industry certifications like PCI DSS and HIPAA.
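
As a rough illustration of the consumer-side setup described in item 3, the sketch below creates a read-only database from a share and grants a role access to it. The names provider_account, yext_share, yext_data, and analyst are hypothetical placeholders, not values supplied by Yext.

```sql
-- Hypothetical names throughout: substitute the share and role that apply to your account.
-- Create a read-only database on top of the inbound share.
create database yext_data from share provider_account.yext_share;

-- Let a role query the shared objects using standard role-based access control.
grant imported privileges on database yext_data to role analyst;
```

Any role that has not been granted these privileges cannot see the shared database.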

For more detail, check out Snowflake’s Introduction to Secure Data Sharing documentation.

On top of all the above, Yext applies additional mechanisms to provide even greater levels of security and data sensitivity. There are two main ways we do this – removing archived data in the secure view definition, and applying row-level security.

Removing Archived Data

Before we add a secure view to the Snowflake share, we filter out any archived data. These archived items could be entities, folders, etc. Let’s walk through a quick example of how this is done.

Say you want to create a secure view on top of the table database.schema.my_table, seen below:

Business ID   Entity ID   Entity Display Name           Archived   Archived Timestamp
123456        12345       Pizza Planet (Chelsea)        FALSE      NULL
123456        23456       Pizza Emporium                FALSE      NULL
123456        91210       Pizza Planet (West Village)   TRUE       2023-05-24 20:03:39.412
123456        94025       Pizza Palace                  FALSE      NULL

Pizza Planet (West Village) is an archived entity that we want to filter out from our secure view. We know it is archived because its archived value is TRUE and it has a non-null archived_timestamp.

When we define the secure view to be added to the share, we include a filter condition for the archived and archived_timestamp columns to ensure that any entities that are added to the view are active entities.

Below is an example of how we might define this secure view:

create secure view my_secure_view as (
  select
    business_id,
    entity_id,
    entity_display_name
  from database.schema.my_table
  -- Filtering out archived entities: keep only rows that are not flagged
  -- as archived and have no archived timestamp
  where not archived
    and archived_timestamp is null
);

If you queried all rows of my_secure_view:

select
  business_id,
  entity_id,
  entity_display_name
from my_secure_view;

The following data would be returned:

Business ID   Entity ID   Entity Display Name
123456        12345       Pizza Planet (Chelsea)
123456        23456       Pizza Emporium
123456        94025       Pizza Palace

Notice that the Pizza Planet (West Village) entity is not returned because it was filtered out during the view definition.

Additionally, we apply a one-year lookback period on most of the secure views. The exceptions to this rule are:

  • API Requests and Consumer API Requests – We apply a 30-day lookback to be consistent with what we show in the platform.
  • Any dimension views – Any views that are primarily used as dimensions or for joining purposes, such as entities, businesses, folders, etc. do not require a defined lookback period. Instead, we show all active (unarchived) records (as detailed above).
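
A lookback period like this can be expressed as a date filter in the view definition. The following is a minimal sketch, assuming a hypothetical table database.schema.my_events with an event_timestamp column; the table, column, and view names are illustrative and do not come from Yext's actual definitions.

```sql
-- Hypothetical table and column names, for illustration only.
create secure view my_events_secure_view as (
  select
    business_id,
    entity_id,
    event_timestamp
  from database.schema.my_events
  -- One-year lookback: keep only rows from the past year
  where event_timestamp >= dateadd(year, -1, current_timestamp())
);
```

For a 30-day lookback, the same pattern would use dateadd(day, -30, current_timestamp()) instead.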

Finally, for a view to be added to the data share, it must go through our code review process.

This ensures that views are not uploaded without the approval of at least one engineer who is familiar with the underlying tables and what data should and should not be shared externally.

Row Level Security

Yext Data Sharing uses one global sharing database, to which all of the secure views are added.

As a result, Yext Data Sharing must ensure that you only see the data that belongs to your business.

To accomplish this, Yext uses row-level security, more specifically Snowflake’s row access policy feature, to limit the data your Snowflake account can access to only the rows that belong to your business. How is this done?

First, we’ve created an internal mapping table which maps your Snowflake account locator to a set of Yext business IDs. This table tells us which Yext business IDs your Snowflake account should have access to.

Next, before we add each secure view to the global sharing database, we use this mapping table to define a row access policy for that view. The policy tells Snowflake that, for a given secure view, a Snowflake account may only access the rows whose Yext business ID is associated with that account in the mapping table.

Once this row access policy is in place, so that it covers each Data Sharing consumer, we add the secure view to the global sharing database.
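
The steps above might look roughly like the following sketch. It is not Yext's actual implementation: the mapping table account_business_map, its columns, and the policy name business_rap are hypothetical, and the sketch assumes the view exposes a business_id column, as in the earlier example.

```sql
-- Hypothetical mapping table: one row per (account locator, business ID) pair.
create table account_business_map (
  account_locator  varchar,
  yext_business_id number
);

-- Hypothetical policy: a row is visible only if the querying account
-- is mapped to that row's business ID.
create row access policy business_rap
  as (row_business_id number) returns boolean ->
    exists (
      select 1
      from account_business_map m
      where m.account_locator = current_account()
        and m.yext_business_id = row_business_id
    );

-- Attach the policy to the secure view, keyed on its business_id column.
alter view my_secure_view
  add row access policy business_rap on (business_id);
```

With a policy like this attached, the same query against my_secure_view returns different rows depending on which Snowflake account runs it.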

The graphic below summarizes how row-level security works in Yext Data Sharing:

[Diagram: row-level security in Yext Data Sharing]