Streams Deep Dive + Source Documentation

What is Streams?

Streams is the engine that delivers data from Yext Source Systems (primarily Knowledge Graph) to Yext’s consumer-facing applications (such as Answers and Pages). Streams enables powerful use cases, like traversing relationships across entities!

Each system can independently determine which components of the Stream configuration it exposes to the user. For example, Streams Endpoints exposes a property called stream, which contains all of the configuration for a Stream. Answers, however, handles Stream configuration implicitly, under the hood; a user only needs to interact with their Answers configuration, and the Answers system automatically generates a Stream with the relevant information.

One crucial point is that the Stream configuration is always handled in the context of the downstream application; users will never configure a Stream independently of a downstream system.

What Makes Up a Stream?

As alluded to above, in some Yext systems, users will need to define a Stream so that the application they are configuring can access relevant data. Currently, users define Streams explicitly when configuring:

When configuring a Stream for these systems, it’s important to understand what the various properties mean!

A Stream is primarily composed of the following properties:

| Property | Description | Accepted Values |
| --- | --- | --- |
| Source | The source system from which the Stream will fetch data. The source defines the type of records which Streams will produce. For example, if the source is Knowledge Graph, Streams will produce a record per entity. If the source is Reviews, Streams will produce a record per review. In most cases, the source of a Stream will be Knowledge Graph. | knowledgeGraph, reviews, reviewsAgg |
| Filter | Any filter which should be applied to the data streamed from the source. Valid filters vary based on the selected source. See the source-specific sections below. | Knowledge Graph: savedFilterIds, entityTypes. Reviews: publisher. ReviewsAgg: publisher. |
| Localization | Only relevant for Knowledge Graph. The list of locale codes to stream records for. By default, the primary locale of the entities will be included in Streams. For example, if I wished to receive the Spanish localized version of my entities, I would specify “locales” = [“es”] in my localization property. | Any valid BCP-47 locale identifier. |
| Transform | A list of valid transforms. | Any of: a valid JSONPatch transform, or one of the built-in transforms expandOptionFields and replaceOptionValuesWithDisplayNames. |
| Fields | The set of fields which should be included in the Stream. | See the fields section below! |

Configuring a Stream

Based on the description above, the general guidelines when configuring a Stream are:

  1. Choose a source (typically KG)
  2. Choose a filter (based on your source)
  3. Select the fields you wish to include in your Stream
  4. Optionally, specify any localization behavior
  5. Optionally, specify any transforms

As you can see above, the first step is always to determine the correct Streams source. In almost all cases, your source should be Knowledge Graph, since KG is where all of your entity data is stored.

Once you select your source, you can choose your filter and fields to Stream. Since both filters and fields are source-specific, they will be covered separately in the relevant sections below.
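
The five steps above can be sketched as a small helper that assembles a Stream definition and checks the filter against the chosen source. This is only an illustration, not Yext's configuration schema: the dictionary keys (source, filter, fields, localization, transform) simply mirror the property names in the table above, and the build_stream helper and its output shape are assumptions.

```python
# Hypothetical helper: key names mirror the property table above;
# the exact shape a given Yext system expects may differ.
VALID_FILTERS = {
    "knowledgeGraph": {"savedFilterIds", "entityTypes"},
    "reviews": {"publisher"},
    "reviewsAgg": {"publisher"},
}


def build_stream(source, filter_, fields, locales=None, transforms=None):
    """Assemble a Stream definition following the five steps above."""
    # Step 1: choose a source.
    if source not in VALID_FILTERS:
        raise ValueError(f"unknown source: {source!r}")
    # Step 2: choose a filter that is valid for that source.
    unsupported = set(filter_) - VALID_FILTERS[source]
    if unsupported:
        raise ValueError(f"filters {sorted(unsupported)} are not valid for {source}")
    # Step 3: select the fields to include.
    stream = {"source": source, "filter": dict(filter_), "fields": list(fields)}
    # Step 4 (optional): localization only applies to Knowledge Graph.
    if locales:
        if source != "knowledgeGraph":
            raise ValueError("localization is only relevant for knowledgeGraph")
        stream["localization"] = {"locales": list(locales)}
    # Step 5 (optional): transforms.
    if transforms:
        stream["transform"] = list(transforms)
    return stream


stream = build_stream(
    source="knowledgeGraph",
    filter_={"entityTypes": ["location"]},
    fields=["id", "name", "meta"],
    locales=["es"],
    transforms=["expandOptionFields"],
)
```

Validating the filter against the source up front catches the most likely mistake early, for example applying an entityTypes filter to the reviews source.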

Knowledge Graph

As mentioned above, the most common Streams Source is the Knowledge Graph! Let’s understand the Filter and Fields properties for KG.

With the Knowledge Graph source, the records passed into Streams are Entities: every entity which matches the filter becomes one input record.

Filter

With the Knowledge Graph source, the accepted filter types are:

  • savedFilterIds
  • entityTypes

Values within the same filter type will be OR’d together, while the entityTypes and savedFilterIds filters will be AND’d with each other. For example, for a Stream with

  • entityTypes = [“location”, “healthcareProfessional”] and savedFilterIds = [“123”, “456”]

the logic would be:

  • (entityType in [“location”, “healthcareProfessional”]) AND (savedFilter in [“123”, “456”])
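
The same logic can be written as a small predicate. This is a sketch for reasoning about the filter, not how Streams evaluates it internally: the entity dictionaries and the savedFilterIds set on each entity (the Saved Filters that entity currently matches) are hypothetical, and both filter types are assumed to be supplied.

```python
def matches_filter(entity, entity_types, saved_filter_ids):
    """Return True if the entity passes both filter types.

    `entity` is a hypothetical dict with an "entityType" string and a
    "savedFilterIds" set listing the Saved Filters it currently matches.
    """
    # Values inside one filter type are OR'd: any single match is enough.
    type_ok = entity["entityType"] in entity_types
    saved_ok = any(f in entity["savedFilterIds"] for f in saved_filter_ids)
    # The two filter types are AND'd: both must hold.
    return type_ok and saved_ok


entities = [
    {"id": "a", "entityType": "location", "savedFilterIds": {"123"}},
    {"id": "b", "entityType": "healthcareProfessional", "savedFilterIds": {"789"}},
    {"id": "c", "entityType": "product", "savedFilterIds": {"456"}},
]
matched = [
    e["id"]
    for e in entities
    if matches_filter(e, ["location", "healthcareProfessional"], ["123", "456"])
]
print(matched)  # ['a']
```

Entity b has an accepted type but matches neither Saved Filter, and entity c matches a Saved Filter but has the wrong type, so only a is streamed.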

In most cases, it is best practice to filter down to a single entity type, either with the explicit entityTypes filter or within your Saved Filter. To see why, we first need to understand how field selection works in Streams.

Source vs. Referenced Entities

The filter only applies to the base records processed by Streams; this means that entities which do not match the filter can still be accessed when traversing relationships. However, this relationship data will always be produced in the context of the base record(s).

For example, if my Stream was filtered only to Healthcare Professional entity types, I could still access data across linked condition entity types, and linked facilities where the doctor worked. However, Streams would only produce outputs (unique records) for Healthcare Professionals, and the data from the other types would be in the context of each Healthcare Professional.

Fields

The fields property defines the list of fields from the source records (for the Knowledge Graph source, Entities) which are included in the Stream. These fields are accessed using their Knowledge Graph Field IDs, which are the External IDs for fields across Yext APIs and Configuration as Code.

Specifying fields for the KG source is more complicated than other sources, since the set of fields can vary drastically across entity types and accounts.

So, why do we recommend filtering down to a single Entity Type in most cases when leveraging the KG source? When defining fields, you are selecting the fields from the source entity which should be included in the Stream. An Entity Type contains a set of fields, and all entities of that type will have the fields defined in the Entity Type Schema.

For example, let’s say I have a custom field for available colors with the id c_availableColors, and it is only enabled on my product entity type. I would add c_availableColors to the fields array. However, if that Stream contained both product entities and location entities, there would be no data for that c_availableColors field on my location entities included in the Stream. In most cases, it will be much simpler to ensure that each Stream only contains a single entity type.
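
A minimal sketch of that situation, using plain dictionaries as stand-ins for entities. The project helper and the sample data are hypothetical, and whether Streams omits a missing field or emits an empty value for it is an assumption here (this sketch omits it).

```python
def project(entity, fields):
    """Keep only the requested fields that the entity actually has."""
    return {f: entity[f] for f in fields if f in entity}


entities = [
    # product: c_availableColors is enabled on this entity type
    {"id": "p1", "name": "Trail Shoe", "c_availableColors": ["red", "blue"]},
    # location: c_availableColors is not part of this type's schema
    {"id": "l1", "name": "Downtown Store"},
]
fields = ["id", "name", "c_availableColors"]
records = [project(e, fields) for e in entities]
```

Every consumer of records now has to handle two shapes, which is the complexity a single-entity-type Stream avoids.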

Accessing Data Across Entity Relationships

Familiarizing oneself with entity type schema is even more important when you wish to access data across relationships.

In KG, relationships are stored in fields. For example, I might have a field on my healthcareProfessional entity type called c_worksAt, which is a relationship (entity reference) field. In that field, I would store pointers to all of the healthcareFacility entities at which a given doctor works.

We access data across relationships using dot notation. A user specifies the field that the relationship is stored in, and the field on the entity across the relationship, with a period as the delimiter. For example, if I want to access the names of the Facilities which a doctor works at in the example above, I would use the syntax c_worksAt.name. Again, in this example, the c_worksAt field is part of the schema of the healthcareProfessional entityType, and the fields accessed across the relationship are part of the schema of the healthcareFacility entityType.
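
To make the dot notation concrete, here is a small resolver over in-memory data. It is an illustration only: storing references as lists of entity IDs, the resolve helper, and the shape of the resulting record (keyed by the requested path) are all assumptions, and the records Streams actually produces may be structured differently.

```python
def resolve(entity, path, entities_by_id):
    """Resolve a field path such as "c_worksAt.name" for one base entity."""
    head, _, rest = path.partition(".")
    value = entity.get(head)
    if not rest:
        return value  # plain field on the base entity
    # Relationship field: follow each referenced ID and read the rest of the path there.
    return [resolve(entities_by_id[ref], rest, entities_by_id) for ref in value or []]


entities_by_id = {
    "doc-1": {"name": "Dr. Lee", "c_worksAt": ["fac-1", "fac-2"]},
    "fac-1": {"name": "North Clinic"},
    "fac-2": {"name": "South Hospital"},
}
doctor = entities_by_id["doc-1"]
record = {f: resolve(doctor, f, entities_by_id) for f in ["name", "c_worksAt.name"]}
print(record)
# {'name': 'Dr. Lee', 'c_worksAt.name': ['North Clinic', 'South Hospital']}
```

Only the doctor yields a record; the facility names appear inside it, in the context of the base record, even though facilities are not part of the filter.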

Streams-Specific KG Fields

There are a number of fields which can be accessed from the Knowledge Graph source which are not a part of the specific entity schema. These fields can be included for any Stream from the KG source.

| Field | Type | Description |
| --- | --- | --- |
| uid | integer | The Entity UID. This UID is generated by Yext and is globally unique. It is not editable by users. This UID is the primary key for the Knowledge Graph source, meaning you can use this ID for a Get by ID API request on a Streams Endpoint with the Knowledge Graph Source. |
| id | string | The external Entity ID. This ID is editable by users in Knowledge Graph. It is unique within a single account. |
| meta | object | An object containing specific metadata about the entity. |
| ref_listings | Array of ref_listings objects | An object containing data about the entity’s listings on certain publishers. You must specify individual sub-fields of the ref_listings object; supplying just the ref_listings field will not return readable data. |
| ref_reviewsAgg | Array of ref_reviewsAgg objects | An object containing data about the entity’s reviews aggregate data (average rating and review count) on certain publishers. You must specify individual sub-fields of the ref_reviewsAgg object; supplying just the ref_reviewsAgg field will not return readable data. |

All other Knowledge Graph fields are available from the Knowledge Graph Streams source. These fields should be referenced using the Field ID (formerly known as API Name).

meta object

| Field | Type | Description |
| --- | --- | --- |
| locale | string | The Knowledge Graph locale code of the specific profile. |
| entityType | string | The Knowledge Graph entity type of the entity. |
| updateTimestamp | string | The timestamp of the most recent change to this entity record. |

ref_listings object

| Field | Type | Description |
| --- | --- | --- |
| uid | string | The UID of the specific listing. Constructed as a combination of the publisher-entityUid pair. |
| publisher | string | The publisher of the listing. |
| listingUrl | string | The URL of the listing. |

ref_reviewsAgg object

| Field | Type | Description |
| --- | --- | --- |
| reviewsAggUid | string | The UID of the specific reviewsAgg object. Constructed as a combination of the publisher-entityUid pair. |
| publisher | string | The publisher which the review is associated with. |
| averageRating | number | The average rating of the entity on the publisher. |
| reviewCount | number | The number of reviews for the entity on the publisher. |
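
Because ref_listings and ref_reviewsAgg must be requested by sub-field, a quick check of a fields list can catch the bare form before it is deployed. The sketch below assumes sub-fields are referenced with the same dot notation used for relationships (for example ref_listings.publisher); that syntax and the check_fields helper are assumptions, while the sub-field names come from the tables above.

```python
# Sub-field names come from the tables above; the check itself is hypothetical.
REF_OBJECT_SUBFIELDS = {
    "ref_listings": {"uid", "publisher", "listingUrl"},
    "ref_reviewsAgg": {"reviewsAggUid", "publisher", "averageRating", "reviewCount"},
}


def check_fields(fields):
    """Return a list of problems found in a Knowledge Graph fields list."""
    problems = []
    for field in fields:
        head, _, sub = field.partition(".")
        if head not in REF_OBJECT_SUBFIELDS:
            continue  # ordinary entity field; not checked here
        if not sub:
            problems.append(f"{field}: specify sub-fields, e.g. {head}.publisher")
        elif sub not in REF_OBJECT_SUBFIELDS[head]:
            problems.append(f"{field}: unknown sub-field")
    return problems


print(check_fields(["name", "ref_listings", "ref_reviewsAgg.averageRating"]))
# ['ref_listings: specify sub-fields, e.g. ref_listings.publisher']
```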

Reviews

On the Reviews Source, each record is an individual review. This source is useful if you want to, for example, fetch reviews from the Streams API to publish in a consumer-facing experience like a website or mobile app.

Filter

The only supported filter for Reviews is Publisher. There are 4 accepted values for the publisher filter:

  • googlemybusiness
  • facebook
  • firstparty
  • externalfirstparty

Streams includes any reviews which match the publisher criteria, and also the following criteria:

  • Review Status = LIVE (excluding QUARANTINED & REMOVED reviews)
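
Put together, the inclusion rule is "publisher matches AND status is LIVE". The following sketch applies it to sample data; the status key on each review is hypothetical (status is not one of the fields exposed by the Reviews source), and included_reviews is an illustrative helper, not a Yext API.

```python
ACCEPTED_PUBLISHERS = {"googlemybusiness", "facebook", "firstparty", "externalfirstparty"}


def included_reviews(reviews, publishers):
    """Keep reviews that match the publisher filter and are LIVE."""
    unknown = set(publishers) - ACCEPTED_PUBLISHERS
    if unknown:
        raise ValueError(f"unsupported publishers: {sorted(unknown)}")
    return [
        r for r in reviews
        if r["publisher"] in publishers and r["status"] == "LIVE"
    ]


reviews = [
    {"reviewId": 1, "publisher": "facebook", "status": "LIVE"},
    {"reviewId": 2, "publisher": "facebook", "status": "QUARANTINED"},
    {"reviewId": 3, "publisher": "firstparty", "status": "LIVE"},
]
kept = included_reviews(reviews, ["facebook"])
print([r["reviewId"] for r in kept])  # [1]
```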

Fields

The Review object has a predefined data model. The following fields are available from the Reviews source.

| Field | Type | Description |
| --- | --- | --- |
| reviewId | integer | The ID of the review. This ID is generated by Yext and is globally unique. It is not editable by users. This ID is the primary key for the Reviews source, meaning you can use this ID for a Get by ID API request on a Streams Endpoint with the Reviews Source. |
| entity | Object (reference) | This field is used to reference data from the entity which the review is for. Using dot notation, a user can specify fields from the entity to include on the review document, for example, entity.name. |
| publisher | string | The publisher which the review is associated with. |
| authorName | string | The name of the person who wrote the review. |
| content | string | The content of the review. |
| reviewUrl | string | The public URL where the review can be found. |
| rating | number | Normalized rating out of 5. |
| reviewDate | date-time | The date the review was posted, according to the publisher. Note: certain publishers update the reviewDate when a review is updated by the reviewer. |
| lastYextUpdateDate | date-time | The most recent of the reviewDate and the date of the last response. |
| reviewLabels | Array of reviewLabels objects | An object containing information about the labels on the review. |
| comments | Array of comments objects | An object containing information about the responses for the review. |
| apiIdentifier | string | A unique identifier for this review. This value is one of the following: a UUID generated at the time the Review Creation request is accepted, or the invitationUid, if the review is associated with an invitation. |

reviewLabels object

| Field | Type | Description |
| --- | --- | --- |
| uid | integer | The UID of the specific reviewLabel object. |
| name | string | The name of the label. |

comments object

| Field | Type | Description |
| --- | --- | --- |
| commentId | integer | The unique ID of the comment. |
| commentDate | date-time | The date the comment was posted. |
| authorName | string | The name of the author who wrote the comment. |
| content | string | The content of the comment. |
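
The lastYextUpdateDate field on a review is described as the most recent of the reviewDate and the date of the last response. As a rough sketch of that rule, assuming that "the last response" corresponds to the latest commentDate in the comments array (an assumption, not something the documentation states):

```python
from datetime import datetime


def last_yext_update_date(review):
    """Most recent of the review date and the latest comment date, if any."""
    dates = [review["reviewDate"]]
    dates += [c["commentDate"] for c in review.get("comments", [])]
    return max(dates)


review = {
    "reviewDate": datetime(2023, 1, 10),
    "comments": [
        {"commentId": 1, "commentDate": datetime(2023, 1, 12)},
        {"commentId": 2, "commentDate": datetime(2023, 2, 1)},
    ],
}
print(last_yext_update_date(review))  # 2023-02-01 00:00:00
```

A review with no comments simply returns its own reviewDate.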

ReviewsAgg

On the ReviewsAgg Source, there is a record for each Reviews Aggregate Data Object (average star rating and review count); this data exists at the level of the entity-publisher pair. This source is useful if you want to, for example, display average rating and/or review count in a consumer-facing experience like a website or mobile app.
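
To illustrate what "one record per entity-publisher pair" means, the sketch below derives such aggregates from a list of individual reviews. It is a conceptual illustration only: how Yext actually computes these values (for example, which reviews are counted) is not covered here, and the entityId key and the aggregate helper are hypothetical.

```python
from collections import defaultdict


def aggregate(reviews):
    """Group reviews by (entity, publisher) and compute count and mean rating."""
    ratings = defaultdict(list)
    for r in reviews:
        ratings[(r["entityId"], r["publisher"])].append(r["rating"])
    return {
        key: {"reviewCount": len(vals), "averageRating": sum(vals) / len(vals)}
        for key, vals in ratings.items()
    }


sample_reviews = [
    {"entityId": "e1", "publisher": "facebook", "rating": 4},
    {"entityId": "e1", "publisher": "facebook", "rating": 5},
    {"entityId": "e1", "publisher": "googlemybusiness", "rating": 3},
]
agg = aggregate(sample_reviews)
print(agg[("e1", "facebook")])  # {'reviewCount': 2, 'averageRating': 4.5}
```

One entity reviewed on two publishers yields two aggregate records, one per publisher.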

Filter

The only supported filter for ReviewsAgg is Publisher. There are 4 accepted values for the publisher filter:

  • googlemybusiness
  • facebook
  • firstparty
  • externalfirstparty

Fields

The ReviewsAgg object has a predefined data model. The following fields are available from the ReviewsAgg source.

| Field | Type | Description |
| --- | --- | --- |
| reviewsAggUid | string | The UID of the specific reviewsAgg object. Constructed as a combination of the publisher-entityUid pair. |
| entity | Object (reference) | This field is used to reference data from the entity which the aggregate data is for. Using dot notation, a user can specify fields from the entity to include on the ReviewsAgg document, for example, entity.name. |
| publisher | string | The publisher which the aggregate data is associated with. |
| averageRating | number | The average rating of the entity on the publisher. |
| reviewCount | number | The number of reviews for the entity on the publisher. |