What is Streams?
Streams is the engine that delivers data from Yext source systems (primarily Knowledge Graph) to Yext’s consumer-facing applications (such as Answers and Pages). Streams enables powerful use cases, like traversing relationships across entities!
Each system independently determines which components of Stream configuration it exposes to the user. For example, Streams Endpoints exposes a property called stream, which contains all of the configuration for a Stream. Answers, however, handles Stream configuration implicitly, under the hood; a user only needs to interact with their Answers configuration, and the Answers system automatically generates a Stream with the relevant information.
One crucial point is that the Stream configuration is always handled in the context of the downstream application; users will never configure a Stream independently of a downstream system.
What Makes Up a Stream?
As alluded to above, in some Yext systems, users will need to define a Stream so that the application they are configuring can access relevant data. Currently, users define Streams explicitly when configuring:
- Streams Endpoints (Schema Reference)
- Sites (Stream per Template)
When configuring a Stream for these systems, it’s important to understand what the various properties mean!
A Stream is primarily composed of the following properties:
Property | Description | Accepted Values |
---|---|---|
Source | The source system from which the Stream will fetch data. The source defines the type of records which Streams will produce. For example, if the source is Knowledge Graph, Streams will produce a record per entity. If the source is Reviews, Streams will produce a record per review. In most cases, the source of a Stream will be Knowledge Graph. | |
Filter | Any filter which should be applied to the data streamed from the source. Valid filters vary based on the selected source. See the source-specific sections below. | Knowledge Graph |
Localization | Only relevant for Knowledge Graph. The list of localization codes to stream records for. By default, the primary locale of the entities will be included in Streams. For example, if I wished to receive the Spanish localized version of my entities, I would specify “locales” = [“es”] in my localization property. | Any valid BCP-47 locale identifier. |
Transform | A list of valid transforms. | Any of: The following built-in transforms: |
Fields | The set of fields which should be included in the Stream. | See the Fields section below! |
Configuring a Stream
Based on the description above, the general guidelines when configuring a Stream are:
- Choose a source (typically KG)
- Choose a filter (based on your source)
- Select the fields you wish to include in your Stream
- Optionally, specify any localization behavior
- Optionally, specify any transforms
As you can see above, the first step is always to determine the correct Streams source. In almost all cases, your source should be Knowledge Graph, since KG is where all of your entity data is stored.
Once you select your source, you can choose your filter and fields to Stream. Since both filters and fields are source-specific, they will be covered separately in the relevant sections below.
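As a rough illustration of the steps above, a Stream definition can be pictured as a small structure built in that order. This is only a sketch: the key names, the source value, and the `missing_steps` helper are assumptions made for illustration, not the schema of any Yext system.

```python
# Hypothetical Stream definition, assembled in the order of the steps above.
# Key names and the source value are illustrative assumptions, not a schema reference.
stream = {
    "source": "knowledgeGraph",               # 1. choose a source (typically KG)
    "filter": {"entityTypes": ["location"]},  # 2. choose a filter valid for that source
    "fields": ["id", "name", "meta"],         # 3. select the fields to include
    "localization": {"locales": ["es"]},      # 4. optional: locales to stream
    "transform": [],                          # 5. optional: transforms
}


def missing_steps(config):
    """Return the non-optional properties (source, filter, fields) that are absent."""
    return [key for key in ("source", "filter", "fields") if key not in config]
```

Calling `missing_steps(stream)` on the definition above returns an empty list, while a definition with only a source would report the filter and fields as still to be chosen.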
Knowledge Graph
As mentioned above, the most common Streams Source is the Knowledge Graph! Let’s understand the Filter and Fields properties for KG.
With the Knowledge Graph source, the records passed into Streams are entities: every entity that matches the filter becomes one input record.
Filter
With the Knowledge Graph source, the accepted filter types are:
- savedFilterIds
- entityTypes
Values within the same filter type are OR’d together, while the entityTypes and savedFilterIds filters are AND’d with each other. For example, for a Stream with
- entityTypes = [“location”, “healthcareProfessional”] and savedFilterIds = [“123”, “456”]
the logic would be:
- (entityType in [“location”, “healthcareProfessional”]) AND (savedFilter in [“123”, “456”])
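The same logic can be written as a small predicate. This is a minimal sketch for the case where both filter types are set; the entity dictionary shape and the `matchedSavedFilters` key are hypothetical stand-ins for however saved-filter membership is actually evaluated.

```python
def matches_kg_filter(entity, entity_types, saved_filter_ids):
    """OR within each filter type, AND across the two filter types."""
    # Any one of the listed entity types is enough (OR).
    type_ok = entity["entityType"] in entity_types
    # Matching any one of the listed saved filters is enough (OR).
    saved_ok = any(fid in entity["matchedSavedFilters"] for fid in saved_filter_ids)
    # Both filter types must be satisfied (AND).
    return type_ok and saved_ok


types = ["location", "healthcareProfessional"]
saved = ["123", "456"]
location = {"entityType": "location", "matchedSavedFilters": {"456"}}
restaurant = {"entityType": "restaurant", "matchedSavedFilters": {"123"}}
```

Here `location` passes because it satisfies both filter types, while `restaurant` is excluded even though it matches saved filter 123.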
In most cases, it is best practice to filter down to a single entity type, either using the explicit entityType filter or by doing so in your Saved Filter. Why is that the case? In order to understand this, we need to better understand how field selection works in Streams.
Source vs. Referenced Entities
The filter only applies to the base records processed by Streams; this means that entities which do not match the filter can still be accessed when traversing relationships. However, this relationship data will always be produced in the context of the base record(s).
For example, if my Stream was filtered only to Healthcare Professional entity types, I could still access data across linked condition entity types, and linked facilities where the doctor worked. However, Streams would only produce outputs (unique records) for Healthcare Professionals, and the data from the other types would be in the context of each Healthcare Professional.
Fields
The fields property is used to define the list of fields from the source records (see: Entities) which are included in the Stream. These fields are accessed using their Knowledge Graph Field IDs, which are the External IDs for fields across Yext APIs and Configuration as Code.
Specifying fields for the KG source is more complicated than other sources, since the set of fields can vary drastically across entity types and accounts.
So, why do we recommend filtering down to a single Entity Type in most cases when leveraging the KG source? When defining fields, you are selecting the fields from the source entity which should be included in the Stream. An Entity Type contains a set of fields, and all entities of that type will have the fields defined in the Entity Type Schema.
For example, let’s say I have a custom field for available colors with the id c_availableColors, and it is only enabled on my product entity type. I would add c_availableColors to the fields array. However, if that Stream contained both product entities and location entities, there would be no data for that c_availableColors field on my location entities included in the Stream. In most cases, it will be much simpler to ensure that each Stream only contains a single entity type.
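To make the mixed-type problem concrete, the sketch below projects the same fields list onto a product and a location. The entity dictionaries and the `project` helper are hypothetical, and omitting the absent field is an assumption; how a missing field is represented in real Streams output is not specified here.

```python
def project(entity, fields):
    """Keep only the requested fields that exist on the entity."""
    return {field: entity[field] for field in fields if field in entity}


product = {"id": "p1", "name": "Mug", "c_availableColors": ["red", "blue"]}
location = {"id": "l1", "name": "Downtown Store"}  # c_availableColors is not enabled here

fields = ["name", "c_availableColors"]
records = [project(entity, fields) for entity in (product, location)]
```

The product record carries both fields, while the location record carries only `name`, so any consumer of a mixed Stream has to handle two record shapes.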
Accessing Data Across Entity Relationships
Familiarizing oneself with entity type schema is even more important when you wish to access data across relationships.
In KG, relationships are stored in fields. For example, I might have a field on my healthcareProfessional entity type called c_worksAt, which is a relationship (entity reference) field. In that field, I would store a pointer to each of the healthcareFacility entities which a given doctor works at.
We access data across relationships using dot notation. A user specifies the field that the relationship is stored in, and the field on the entity across the relationship, with a period as the delimiter. For example, if I want to access the names of the Facilities which a doctor works at in the example above, I would use the syntax c_worksAt.name. Again, in this example, the c_worksAt field is part of the schema of the healthcareProfessional entityType, and the fields accessed across the relationship are part of the schema of the healthcareFacility entityType.
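The dot-notation lookup can be sketched against a tiny in-memory set of entities. Everything here is illustrative: the entity IDs, the dictionary layout, the `resolve` and `build_records` helpers, and the output record shape are assumptions, not the Streams implementation.

```python
entities = {
    "doc-1": {"entityType": "healthcareProfessional", "name": "Dr. Rivera",
              "c_worksAt": ["fac-1", "fac-2"]},
    "fac-1": {"entityType": "healthcareFacility", "name": "North Clinic"},
    "fac-2": {"entityType": "healthcareFacility", "name": "South Hospital"},
}


def resolve(entity, path, all_entities):
    """Resolve 'relationshipField.fieldOnLinkedEntity' for one base entity."""
    relationship_field, linked_field = path.split(".", 1)
    return [all_entities[ref][linked_field] for ref in entity.get(relationship_field, [])]


def build_records(all_entities, base_type, path):
    """Emit one record per base entity; linked data is nested under the path."""
    return [
        {"name": entity["name"], path: resolve(entity, path, all_entities)}
        for entity in all_entities.values()
        if entity["entityType"] == base_type
    ]


records = build_records(entities, "healthcareProfessional", "c_worksAt.name")
```

Only the healthcareProfessional produces a record; the facility names appear inside that record rather than as records of their own.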
Streams-Specific KG Fields
There are a number of fields which can be accessed from the Knowledge Graph source which are not a part of the specific entity schema. These fields can be included for any Stream from the KG source.
Field | Type | Description |
---|---|---|
uid | integer | The Entity UID. This UID is generated by Yext and is globally unique. It is not editable by users. This UID is the primary key for the Knowledge Graph source, meaning you can use this ID for a Get by ID API request on a Streams Endpoint with the Knowledge Graph Source. |
id | string | The external Entity ID. This ID is editable by users in Knowledge Graph. It is unique within a single account. |
meta | object | An object containing specific metadata about the entity. |
ref_listings | Array of ref_listings objects | An object containing data about the entity’s listings on certain publishers. You must specify individual sub-fields of the ref_listings object - supplying just the ref_listings field will not return readable data. |
ref_reviewsAgg | Array of ref_reviewsAgg objects | An object containing data about the entity’s reviews aggregate data (average rating and review count) on certain publishers. You must specify individual sub-fields of the ref_reviewsAgg object - supplying just the ref_reviewsAgg field will not return readable data. |
All other Knowledge Graph fields are available from the Knowledge Graph Streams source. These fields should be referenced using the Field ID (formerly known as the API Name).
meta object
Field | Type | Description |
---|---|---|
locale | string | The Knowledge Graph locale code of the specific profile. |
entityType | string | The Knowledge Graph entity type of the entity. |
updateTimestamp | string | The timestamp of the most recent change to this entity record. |
ref_listings object
Field | Type | Description |
---|---|---|
uid | string | The UID of the specific listing. Constructed as a combination of the publisher-entityUid pair. |
publisher | string | The publisher of the listing. |
listingUrl | string | The URL of the listing. |
ref_reviewsAgg object
Field | Type | Description |
---|---|---|
reviewsAggUid | string | The UID of the specific reviewsAgg object. Constructed as a combination of the publisher-entityUid pair. |
publisher | string | The publisher which the reviews aggregate is associated with. |
averageRating | number | The average rating of the entity on the publisher. |
reviewCount | number | The number of reviews for the entity on the publisher. |
Reviews
On the Reviews Source, each record is an individual review. This source is useful if you want to, for example, fetch reviews from the Streams API to publish in a consumer-facing experience like a website or mobile app.
Filter
The only supported filter for Reviews is Publisher. Accepted values for the publisher filter include:
- googlemybusiness
- firstparty
- externalfirstparty
Streams includes any reviews which match the publisher criteria, and also the following criteria:
- Review Status = LIVE (excluding QUARANTINED & REMOVED reviews)
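Put together, the inclusion rule is a publisher match combined with a LIVE status. The sketch below assumes a hypothetical review dictionary with `publisher` and `status` keys; it only illustrates the rule and is not how Streams evaluates it internally.

```python
def is_streamed(review, publishers):
    """A review is included when its publisher matches and its status is LIVE."""
    return review["publisher"] in publishers and review["status"] == "LIVE"


publishers = ["googlemybusiness", "firstparty"]
live_review = {"publisher": "firstparty", "status": "LIVE"}
quarantined_review = {"publisher": "firstparty", "status": "QUARANTINED"}
other_publisher = {"publisher": "externalfirstparty", "status": "LIVE"}
```

Of these three reviews, only `live_review` would be included in the Stream.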
Fields
The Review object has a predefined data model. The following fields are available from the Reviews source.
Field | Type | Description |
---|---|---|
reviewId | integer | The ID of the review. This ID is generated by Yext and is globally unique. It is not editable by users. This ID is the primary key for the reviews source, meaning you can use this ID for a Get by ID API request on a Streams Endpoint with the Reviews Source. |
entity | Object (reference) | This field is used to reference data from the entity which the review is for. Using dot notation, a user can specify fields from the entity to include on the review document, for example, entity.name. |
publisher | string | The publisher which the review is associated with. |
authorName | string | The name of the person who wrote the review. |
content | string | The content of the review. |
reviewUrl | string | The public URL where the review can be found. |
rating | number | Normalized rating out of 5. |
reviewDate | date-time | The date the review was posted, according to the publisher. Note: certain publishers update the reviewDate when a review is updated by the reviewer. |
lastYextUpdateDate | date-time | The most recent of the reviewDate and the date of the last response. |
reviewLabels | Array of reviewLabels objects | An object containing information about the labels on the review. |
comments | Array of comments objects | An object containing information about the responses for the review. |
apiIdentifier | string | A unique identifier for this review. This value is determined in the following manner: |
reviewLabels object
Field | Type | Description |
---|---|---|
uid | integer | The UID of the specific reviewLabel object. |
name | string | The name of the label. |
comments object
Field | Type | Description |
---|---|---|
commentId | integer | The unique ID of the comment. |
commentDate | date-time | The date the comment was posted. |
authorName | string | The name of the author who wrote the comment. |
content | string | The content of the comment. |
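As an illustration of how lastYextUpdateDate relates to the other fields, the sketch below takes the latest of the reviewDate and the commentDate values. It assumes that the responses to a review are the entries in its comments array and that dates have already been parsed into `datetime` objects; the helper name is hypothetical.

```python
from datetime import datetime


def last_yext_update_date(review):
    """Return the most recent of the reviewDate and any comment dates."""
    dates = [review["reviewDate"]]
    dates += [comment["commentDate"] for comment in review.get("comments", [])]
    return max(dates)


review = {
    "reviewDate": datetime(2023, 3, 1, 9, 0),
    "comments": [
        {"commentId": 1, "commentDate": datetime(2023, 3, 2, 14, 30)},
        {"commentId": 2, "commentDate": datetime(2023, 3, 5, 8, 15)},
    ],
}
```

For a review with no comments, the function simply returns the reviewDate.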
ReviewsAgg
On the ReviewsAgg Source, there is a record for each Reviews Aggregate Data Object (average star rating and review count); this data exists at the level of the entity-publisher pair. This source is useful if you want to, for example, display average rating and/or review count in a consumer-facing experience like a website or mobile app.
Filter
The only supported filter for ReviewsAgg is Publisher. Accepted values for the publisher filter include:
- googlemybusiness
- firstparty
- externalfirstparty
Fields
The ReviewsAgg object has a predefined data model. The following fields are available from the ReviewsAgg source.
Field | Type | Description |
---|---|---|
reviewsAggUid | string | The UID of the specific reviewsAgg object. Constructed as a combination of the publisher-entityUid pair. |
entity | Object (reference) | This field is used to reference data from the entity which the reviews aggregate is for. Using dot notation, a user can specify fields from the entity to include on the ReviewsAgg record, for example, entity.name. |
publisher | string | The publisher which the reviews aggregate is associated with. |
averageRating | number | The average rating of the entity on the publisher. |
reviewCount | number | The number of reviews for the entity on the publisher. |
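To show what "one record per entity-publisher pair" means, the sketch below groups a few individual reviews by that pair and derives an average rating and a count. It illustrates the grain of the data only; the input shape and the `entityId` key are hypothetical, and the values in a real ReviewsAgg record are not necessarily computed this way.

```python
from collections import defaultdict


def aggregate(reviews):
    """Group ratings by (entity, publisher) and summarize each group."""
    grouped = defaultdict(list)
    for review in reviews:
        grouped[(review["entityId"], review["publisher"])].append(review["rating"])
    return {
        pair: {"averageRating": sum(ratings) / len(ratings), "reviewCount": len(ratings)}
        for pair, ratings in grouped.items()
    }


reviews = [
    {"entityId": "e1", "publisher": "googlemybusiness", "rating": 5},
    {"entityId": "e1", "publisher": "googlemybusiness", "rating": 3},
    {"entityId": "e1", "publisher": "firstparty", "rating": 4},
]
summary = aggregate(reviews)
```

The single entity here yields two aggregate records, one for each publisher it has reviews on.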