Streams Overview | Hitchhikers Platform

Overview

Streams is the engine that delivers data from Yext source systems (primarily Knowledge Graph) to Yext’s consumer-facing applications (such as Answers and Pages). Streams enables powerful use cases, like traversing relationships across entities!

In order to leverage the capabilities of Streams in external applications, a user must define a Content Endpoint. The user can then fetch data from this endpoint via the Content API, or create a Content Webhook based on this endpoint.

Each downstream system may expose Stream configuration slightly differently to the end user. For example, Content Endpoints and Pages both expose a property which contains all of the configuration for a Stream. Search, however, handles Stream configuration implicitly, under the hood: a user only needs to interact with their Search configuration, and the Search system automatically generates a Stream with the relevant information.

One crucial point is that the Stream configuration is always handled in the context of the downstream application; users will never configure a Stream independently of a downstream system.

Relevant Terminology

Name | Definition
Content Endpoint | The configuration unit which provides access to the Content API and Content Webhooks. Consists of a Stream definition and a subset of fields which are indexed for querying via the Content API.
Content API | The consumer-grade API which is used to make requests to Content Endpoints.
Content Webhook | Webhooks which are configured to send updates when documents included in a Content Endpoint definition are updated.
Stream | The core unit of configuration which includes a set of data (source, filter, fields, consumer). When initialized, the entire set of data which matches the Stream definition is sent to the relevant consumer. Subsequent records are then sent for updates to records or new records which match the Stream definition.


What Makes Up a Stream?

As alluded to above, in some Yext systems, users will need to define a Stream so that the application they are configuring can access relevant data. Currently, users define Streams explicitly when configuring:

  • Content Endpoints (Schema Reference)
  • Sites (Stream per Template)

When configuring a Stream for these systems, it’s important to understand what the various properties mean!

A stream is composed of the following properties:

Property: Source
Description: The source system from which the Stream will fetch data. The source defines the type of records which the Content Endpoint will produce.

For example, if the source is Knowledge Graph, the Content Endpoint will produce a record per entity. If the source is Reviews, the Content Endpoint will produce a record per review.

In most cases, the source of a Content Endpoint will be Knowledge Graph.
Accepted Values:
Knowledge Graph
  • Id: knowledgeGraph
  • entityTypes
Reviews
  • Id: reviews
ReviewsAgg
  • Id: reviewsAgg

Property: Filter
Description: Any filter which should be applied to the data streamed from the source.

Valid filters vary based on the selected source. See the source-specific sections below.
Accepted Values:
Knowledge Graph
  • Saved Filters (CaC property: savedFilterIds)
  • Entity Types (CaC property: entityTypes)
Reviews
  • Publishers (CaC property: publisher)
ReviewsAgg
  • Publishers (CaC property: publisher)

Property: Localization
Description: Only relevant for Knowledge Graph. The list of locale codes to stream records for. By default, the primary locale of the entities will be included in the Content Endpoint.

For example, to receive the Spanish localized version of my entities, I would specify "locales" = ["es"] in my localization property.
Accepted Values: Any valid BCP-47 locale identifier.

Property: Transform
Description: A list of valid transforms.
Accepted Values: Any of:
  • A valid JSONPatch transform
  • The following built-in transforms:
      • expandOptionFields
      • replaceOptionValuesWithDisplayNames

Property: Fields
Description: The set of fields which should be included in the Stream.
Accepted Values: See the fields section below!
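
To make the relationship between these properties concrete, here is a minimal Python sketch that assembles a Stream definition and rejects filters that are not valid for the chosen source. The source IDs and filter property names come from the table above; the function name build_stream, the dictionary layout, the field names, and the "location" entity type are illustrative assumptions, not the official configuration schema.

```python
# Hypothetical sketch: the dict layout below is an assumption for illustration,
# not the official Stream configuration format.

# Filter properties accepted per source, as listed in the properties table.
VALID_FILTERS = {
    "knowledgeGraph": {"savedFilterIds", "entityTypes"},
    "reviews": {"publisher"},
    "reviewsAgg": {"publisher"},
}


def build_stream(source, fields, filters=None, locales=None, transform=None):
    """Assemble a Stream definition, rejecting invalid source/filter pairs."""
    if source not in VALID_FILTERS:
        raise ValueError(f"unknown source: {source!r}")
    filters = filters or {}
    unsupported = set(filters) - VALID_FILTERS[source]
    if unsupported:
        raise ValueError(f"filters not valid for {source}: {sorted(unsupported)}")
    if locales and source != "knowledgeGraph":
        raise ValueError("localization is only relevant for Knowledge Graph")

    stream = {"source": source, "filter": filters, "fields": list(fields)}
    if locales:
        stream["localization"] = {"locales": list(locales)}
    if transform:
        stream["transform"] = list(transform)
    return stream


# Example values ("location", the field names) are placeholders.
stream = build_stream(
    source="knowledgeGraph",
    fields=["name", "c_listOfColors"],
    filters={"entityTypes": ["location"]},
    locales=["es"],
    transform=["expandOptionFields"],
)
```

Validating the source and filter combination up front is only a convenience in this sketch; the downstream application remains the authority on what a valid Stream looks like.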

Content API Details

The base URLs for Content API in our various environments are:


As with most other Yext endpoints intended for reading data, requests to Content Endpoints come in a Get by ID version and a List version.

As with all Yext APIs, every request must include a v parameter and an api_key as query parameters.

Get by ID

Get by ID requests require the primary key of the record, which varies depending on the source. The primary keys for each source are documented below:

Source | Primary Key
Knowledge Graph | Entity UID (not Entity ID)
Reviews | Review ID
ReviewsAgg | ReviewsAggUid (constructed as {entityUid}-{publisherId})

Get by ID requests are more performant than standard List requests. The requester can also specify multiple IDs in a single Get by ID request, separating the IDs with semicolons. For example:

GET https://streams.yext.com/v2/accounts/me/api/exampleEndpoint/id1;id2;id3?api_key={api_key}&v=20200408
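
As a rough illustration, the following Python sketch builds a Get by ID URL of the same shape as the request above. The base URL is copied from that example; the function names, the default v value, and the YOUR_API_KEY placeholder are assumptions made for this sketch.

```python
from urllib.parse import quote, urlencode

# Base URL copied from the example request above; other environments may differ.
BASE_URL = "https://streams.yext.com/v2/accounts/me/api"


def get_by_id_url(endpoint, ids, api_key, v="20200408"):
    """Build a Get by ID URL, joining multiple IDs with semicolons."""
    joined = ";".join(quote(str(record_id), safe="") for record_id in ids)
    query = urlencode({"api_key": api_key, "v": v})
    return f"{BASE_URL}/{quote(endpoint, safe='')}/{joined}?{query}"


def reviews_agg_uid(entity_uid, publisher_id):
    """Construct a ReviewsAggUid as {entityUid}-{publisherId}."""
    return f"{entity_uid}-{publisher_id}"


url = get_by_id_url("exampleEndpoint", ["id1", "id2", "id3"], api_key="YOUR_API_KEY")
```

Each ID is percent-encoded individually, so the semicolons that separate the IDs are left intact.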

List

The Content API provides the ability for a user to filter and sort records by the contents of an indexed field. The filters and sorting parameters can be provided as query parameters, in addition to the query parameters documented here. If no filters are provided, the API returns all the records (with pagination).

For example:

GET https://streams.yext.com/v2/accounts/me/api/exampleEndpoint/?api_key={api_key}&v=20200408&name=EntityA

In this request, only records where the name field has the value EntityA will be returned.

Additionally, the Content API can support OR logic on a single field, allowing a requester to pass multiple values for a single field. For example, if you only wanted entities where name=EntityA OR name=EntityB, the request would be:

GET https://streams.yext.com/v2/accounts/me/api/exampleEndpoint/?api_key={api_key}&v=20200408&name__in=EntityA&name__in=EntityB
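
The two List requests above can be generated with a small helper. This is a sketch only: the list_url function and its treatment of list values as repeated __in parameters are assumptions based on the examples shown here, and the v value is a placeholder.

```python
from urllib.parse import urlencode

# Base URL copied from the example requests above.
BASE_URL = "https://streams.yext.com/v2/accounts/me/api"


def list_url(endpoint, api_key, filters=None, v="20200408"):
    """Build a List URL; a list of values becomes repeated `<field>__in` params."""
    params = [("api_key", api_key), ("v", v)]
    for field, value in (filters or {}).items():
        if isinstance(value, (list, tuple)):
            # OR logic on a single field: one `field__in` parameter per value.
            params.extend((f"{field}__in", item) for item in value)
        else:
            params.append((field, value))
    return f"{BASE_URL}/{endpoint}/?{urlencode(params)}"


single = list_url("exampleEndpoint", "YOUR_API_KEY", {"name": "EntityA"})
either = list_url("exampleEndpoint", "YOUR_API_KEY", {"name": ["EntityA", "EntityB"]})
```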

Filters

Filters are supported on Text, Numeric, Date and DateTime fields, although not all filters are supported by all fields.

It is also possible to filter on an array of strings, or a string field nested in an array of objects. In this case, the record will be present in the response if one of the field values in the array matches the filter.

For example, consider the following records produced for a Content Endpoint:

[
   {
      "uid":123,
      "c_listOfColors":[
         "red",
         "blue",
         "green"
      ]
   },
   {
      "uid":456,
      "c_listOfColors":[
         "red",
         "yellow"
      ]
   }
]

If you made a request to this endpoint and filtered for c_listOfColors=blue, only the record with uid=123 would be returned, since the c_listOfColors array on that record contains “blue”.
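
The matching rule for array fields can be modeled locally. The sketch below is a simulation of the behavior described above, not the Content API's actual implementation; the matches function is a hypothetical helper.

```python
def matches(record, field, value):
    """Return True if the field equals the value, or contains it when it is an array."""
    actual = record.get(field)
    if isinstance(actual, list):
        return value in actual
    return actual == value


records = [
    {"uid": 123, "c_listOfColors": ["red", "blue", "green"]},
    {"uid": 456, "c_listOfColors": ["red", "yellow"]},
]

blue = [r["uid"] for r in records if matches(r, "c_listOfColors", "blue")]
red = [r["uid"] for r in records if matches(r, "c_listOfColors", "red")]
```

Here blue contains only 123, while red contains both 123 and 456, because a single matching array element is enough for the record to be included.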

Fields from related entities that are accessed via a projection (using dot notation) are made available in an array of objects.

For example, consider the following records produced for a Content Endpoint, where c_linkedVariants is an entity list field:

[
   {
      "uid":234,
      "name":"Product A",
      "c_linkedVariants":[
         {
            "size":"32oz",
            "color":"Blue"
         },
         {
            "size":"40oz",
            "color":"Yellow"
         }
      ]
   },
   {
      "uid":235,
      "name":"Product B",
      "c_linkedVariants":[
         {
            "size":"20oz",
            "color":"Red"
         },
         {
            "size":"48oz",
            "color":"Orange"
         }
      ]
   }
]

If you made a request to this endpoint and filtered for c_linkedVariants.size=32oz, only the record with uid=234 would be returned, since the c_linkedVariants array on that record contains the value “32oz” in the “size” field.

However, the Content API cannot ensure that multiple filter criteria are matched within a single object; it only evaluates whether each value exists anywhere in the array. For example, if you filtered for c_linkedVariants.size=32oz&c_linkedVariants.color=Yellow, the record with uid=234 would still be returned. Both criteria are met within the record, even though no single object meets both. In other words, although no single variant is both 32oz and Yellow, separate variants in the array satisfy each criterion, so both conditions evaluate as true for this record.
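
This cross-object behavior is easy to reproduce in a local simulation. The sketch below models the semantics described above and is not the Content API's implementation; values_at and matches_all are hypothetical helpers, and the final same_object check shows one possible client-side way to require that both criteria hold within a single variant.

```python
def values_at(record, path):
    """Collect every value reachable at a dotted path, flattening arrays."""
    current = [record]
    for key in path.split("."):
        next_level = []
        for item in current:
            if isinstance(item, dict) and key in item:
                value = item[key]
                next_level.extend(value if isinstance(value, list) else [value])
        current = next_level
    return current


def matches_all(record, filters):
    """Each filter is checked independently against all values at its path."""
    return all(expected in values_at(record, path) for path, expected in filters.items())


record = {
    "uid": 234,
    "name": "Product A",
    "c_linkedVariants": [
        {"size": "32oz", "color": "Blue"},
        {"size": "40oz", "color": "Yellow"},
    ],
}
filters = {"c_linkedVariants.size": "32oz", "c_linkedVariants.color": "Yellow"}

# True: each criterion is satisfied somewhere in the array.
matched = matches_all(record, filters)

# False: no single variant is both 32oz and Yellow.
same_object = any(
    variant["size"] == "32oz" and variant["color"] == "Yellow"
    for variant in record["c_linkedVariants"]
)
```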
