Security and Data Sensitivity in Data Sharing | Yext Hitchhikers Platform
This document explains how Yext Data Sharing handles security and sensitive data.
First, as a reminder, Yext uses Snowflake’s Data Sharing feature, which implements several measures to ensure a high level of security. Here are some of the highlights:
- Read-Only Access: Database objects shared between accounts are read-only. A consumer cannot modify or delete the shared objects, and cannot add or modify table data.
- No Data Movement: With Secure Data Sharing, no actual data is copied or transferred between accounts. All sharing uses Snowflake’s services layer and metadata store. This approach eliminates synchronization issues because the data remains at rest and encrypted at its original location. It also removes the security risk brought by copying or moving data from one place to another.
- Role-Based Access Control: On the consumer side, a read-only database is created from the share. Access to this database is configurable using the same, standard role-based access control that Snowflake provides for all objects in the system.
- Industry Compliance: Snowflake’s processes and procedures meet industry certifications like PCI DSS and HIPAA.
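As a rough sketch of the consumer-side setup described in the Role-Based Access Control point above, the statements below create a read-only database from a share and grant a role access to it. The share identifier `yext_provider.yext_share`, the database name `yext_data`, and the role `analyst` are hypothetical placeholders, not actual Yext names.

```sql
-- Hypothetical names throughout: substitute the share and role for your account.
-- Create a read-only database from the inbound share.
create database yext_data from share yext_provider.yext_share;

-- Let a role query the shared objects using standard Snowflake RBAC.
grant imported privileges on database yext_data to role analyst;
```

Only roles that have been granted imported privileges on the database can query the shared views.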
On top of all the above, Yext applies additional mechanisms to provide even greater levels of security and data sensitivity. There are two main ways we do this – removing archived data in the secure view definition, and applying row-level security.
Removing Archived Data
Before we add a secure view to the Snowflake share, we filter out any archived data. These archived items could be entities, folders, etc. Let’s walk through a quick example of how this is done.
Say you want to create a secure view on top of the table database.schema.my_table, shown below:
| Business ID | Entity ID | Entity Display Name | Archived | Archived Timestamp |
|---|---|---|---|---|
| 123456 | 12345 | Pizza Planet (Chelsea) | FALSE | NULL |
| 123456 | 91210 | Pizza Planet (West Village) | TRUE | 2023-05-24 20:03:39.412 |
Pizza Planet (West Village) is an archived entity that we want to filter out from our secure view. We know it’s an archived entity because archived is TRUE and it has a non-null archived_timestamp.
When we define the secure view to be added to the share, we include a filter condition on the archived and archived_timestamp columns to ensure that only active entities are included in the view.
Below is an example of how we might define this secure view:
    create secure view my_secure_view as (
        select business_id, entity_id, entity_display_name
        from database.schema.my_table
        -- Filtering out archived entities: keep a row only if it is
        -- not flagged as archived and has no archived timestamp
        where not (archived or archived_timestamp is not null)
    )
If you queried all rows of my_secure_view:

    select business_id, entity_id, entity_display_name
    from my_secure_view
The following data would be returned:
| Business ID | Entity ID | Entity Display Name |
|---|---|---|
| 123456 | 12345 | Pizza Planet (Chelsea) |
Notice that the Pizza Planet (West Village) entity is not returned because it was filtered out during the view definition.
Additionally, we apply a one-year lookback period on most of the secure views. The exceptions to this rule are:
- API Requests and Consumer API Requests – We apply a 30-day lookback to be consistent with what we show in the platform
- Any dimension views – Any views that are primarily used as dimensions or for joining purposes, such as folders, do not require a defined lookback period. Instead, we show all active (unarchived) records (as detailed above).
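To illustrate the lookback rule, here is a minimal sketch of a secure view limited to the last year of data. The view name `my_events_view`, the table `database.schema.my_events`, and the column `event_timestamp` are hypothetical; the document does not show how Yext's actual views express the lookback.

```sql
-- Hypothetical table and column names, for illustration only.
create secure view my_events_view as (
    select business_id, entity_id, event_timestamp
    from database.schema.my_events
    -- Keep only rows from the last year
    where event_timestamp >= dateadd(year, -1, current_timestamp())
);
```

Under the same assumptions, a 30-day lookback would use `dateadd(day, -30, current_timestamp())` instead.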
Finally, for a view to be added to the data share, it must go through our code review process.
This ensures that views are not uploaded without the approval of at least one engineer who is familiar with the underlying tables and what data should and should not be shared externally.
Row Level Security
Yext Data Sharing uses one global sharing database, to which all of the secure views are uploaded.
As a result, Yext Data Sharing must ensure that you only see the data that belongs to your business.
To accomplish this, Yext uses row-level security, specifically Snowflake’s row access policies feature, to limit the data your Snowflake account can access to only the rows that belong to your business. How is this done?
First, we’ve created an internal mapping table which maps your Snowflake account locator to a set of Yext business IDs. This table tells us which Yext business IDs your Snowflake account should have access to.
Next, before we add each secure view to the global sharing database, we use this mapping table to define a row access policy for that view. The policy tells Snowflake that, for a given secure view, a Snowflake account may access only the rows whose Yext business ID is associated with that account in the mapping table.
Once we’ve applied this row access policy to each Data Sharing consumer, we can add the secure view.
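The steps above can be sketched in Snowflake SQL roughly as follows. This is an illustration under stated assumptions, not Yext's actual implementation: the mapping table `account_business_map`, the policy name `business_rap`, and the argument name `bid` are all hypothetical, and the view reuses `my_secure_view` from the earlier example.

```sql
-- Hypothetical mapping table: one row per (account locator, business ID) pair.
create table account_business_map (
    account_locator varchar,
    business_id     number
);

-- The policy returns TRUE only for rows whose business ID is mapped
-- to the account running the query.
create row access policy business_rap
as (bid number) returns boolean ->
    exists (
        select 1
        from account_business_map m
        where m.account_locator = current_account()
          and m.business_id = bid
    );

-- Attach the policy to the view, keyed on its business_id column.
alter view my_secure_view
    add row access policy business_rap on (business_id);
```

Because the policy is evaluated for every query against the view, an account with no matching rows in the mapping table would see an empty result rather than an error.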
The graphic below summarizes how row-level security works in Yext Data Sharing: