Extract — Source, Source Settings & Selectors
What You’ll Learn
In this section, you will learn:
- An overview of the available Connector sources
- How to configure source settings to pull data from the correct endpoint
- How to specify selectors to pull in the relevant data components
The first step in creating a Connector is selecting your desired Connector Source. Then you will select the entities you want to create or update with this Connector.
The Data Connectors framework supports a variety of sources, including:
- Third-party apps like Zendesk and ServiceNow
- The Yext Crawler
- API sources (Pull from API and Push to API)
- Custom Functions
All of these data origins are referred to as sources.
Any App partner that can pull data into the Knowledge Graph is considered a Source in the Data Connectors framework. This includes apps like Google My Business, Shopify, Magento, Zendesk, and ServiceNow.
When you select one of these Data Sources from the Add Data screen, you will be prompted to OAuth into your account and grant Yext access to the relevant data.
The main step is to designate which account, or which dataset within the account, you want to pull from.
This will vary by app, for example:
- For the Google My Business App, you will enter the Location Group where you store the locations you want to sync into Yext.
- For the Zendesk App, you will enter your Zendesk subdomain.
- For Shopify, you will map the ID field and select the Product Status.
- For Magento, you will enter your Magento API Access Token, Host Name, and Product SKU Mapping.
Then, the locations, articles, or products will automatically begin loading in as entities in the Knowledge Graph. If you do not see entities populate into the Knowledge Graph, or you feel like the process is taking longer than it should, we recommend emailing Yext Support for assistance.
Because these are pre-built integrations, you will not need to manipulate the data to transform it into a Yext-ready format. In most cases, a pre-saved configuration will apply these transforms for you.
To leverage these methods, your data must already be stored in these third-party apps, which makes this a very easy way to pull that data into Yext.
The Crawler is a tool that helps scrape web pages for their HTML content. Once a crawl has successfully run on a set of web pages, the Add Data flow can help convert that raw HTML content into entities in the Knowledge Graph.
We will go into more details on how to set up the Crawler in the next unit.
This method is used to pull any relevant data from your website into the Knowledge Graph to either manage the content as entities in the Knowledge Graph, or to surface the data in Answers experiences.
We support two API Sources: Push to API and Pull from API.
Pull from API
This source allows you to pull data from any API and use the Connector to convert that data into entities without having to build and host an integration outside of Yext.
To do this, when you click Add Data you will select Pull from API and specify the details of the API such as the request URL, authentication method, and query parameters.
This method can be used to pull relevant data from any system that exposes an API, and is particularly suited to pulling data from a publicly accessible endpoint at a regular cadence.
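As a sketch of the details a Pull from API source asks for, the snippet below models the request URL, authentication method, and query parameters as a settings object and shows how they combine into the final GET request URL. The field names and endpoint are illustrative assumptions, not the platform's exact schema.

```typescript
// Hypothetical shape of Pull from API source settings (names are
// illustrative, not the exact fields shown in the Add Data flow).
interface PullFromApiSettings {
  requestUrl: string;                      // GET endpoint to pull from
  authMethod: "none" | "basic" | "bearer"; // how the request authenticates
  queryParams: Record<string, string>;     // appended to the request URL
}

// Build the full request URL from the settings, as the Connector would
// before issuing the GET request.
function buildRequestUrl(settings: PullFromApiSettings): string {
  const url = new URL(settings.requestUrl);
  for (const [key, value] of Object.entries(settings.queryParams)) {
    url.searchParams.set(key, value);
  }
  return url.toString();
}

const settings: PullFromApiSettings = {
  requestUrl: "https://api.example.com/v1/products", // assumed endpoint
  authMethod: "bearer",
  queryParams: { status: "active", limit: "100" },
};
```

Here `buildRequestUrl(settings)` yields `https://api.example.com/v1/products?status=active&limit=100`, which is the request the Connector would run on each pull.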
Push to API
This option allows you to push data to an endpoint either via a regular API call or a Webhook message.
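To make the push direction concrete, the sketch below assembles a webhook-style POST request carrying entity data. The endpoint URL and payload shape are assumptions for illustration, not the exact format the platform expects.

```typescript
// Illustrative payload for a Push to API source; the real payload
// shape is defined by the platform, this is just an example.
interface PushPayload {
  entities: Array<{ id: string; name: string }>;
}

// Assemble the POST request without sending it, so the shape is clear.
function buildPushRequest(endpoint: string, payload: PushPayload) {
  return {
    url: endpoint,
    method: "POST" as const,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  };
}

const request = buildPushRequest("https://example.com/push-endpoint", {
  entities: [{ id: "loc-1", name: "Downtown Store" }],
});
// A real integration would then send it, e.g. fetch(request.url, request).
```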
We will go into more details on how to leverage the API Connector Sources in the next unit.
We are also going to be adding more Data Connectors over time, and if there is a Data Connector that you’d like to see added — let us know in the Community!
This option allows you to write a fully custom TypeScript function that serves as the data source for your Connector.
To learn more about how to add a Function to your account and use it as a Connector data source, visit the Get Started with Functions guide.
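As a rough sketch of the idea, the function below returns the records the Connector would load. The exact export signature Yext expects is covered in the Get Started with Functions guide; here we simply assume a function returning an array of records, and the record fields are made up for the example.

```typescript
// Assumed record shape for this example; your fields will depend on
// the entity type you are creating.
interface SourceRecord {
  id: string;
  name: string;
  description: string;
}

// A custom data-source function. In practice this might call an
// external API or assemble data from several systems.
export function pullCustomData(): SourceRecord[] {
  return [
    { id: "faq-1", name: "Returns Policy", description: "30-day returns." },
    { id: "faq-2", name: "Shipping", description: "Ships in 2-3 days." },
  ];
}
```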
Once you select your source you will need to configure your desired source settings. This will look different depending on the source you choose.
For the Site Crawler, you will need to select the Crawler you want to extract data from, as well as the desired site extraction settings.
If you choose Pull from API as your source, you will need to enter the GET Request URL and desired Query Parameters from the API.
Once you have specified the details needed to pull the data in, you will then determine which specific pieces of data you want to pull in.
For the Site Crawler, you can use built-in selectors to pull in data like the page URL, or you can use CSS or XPath selectors to extract different elements from the site.
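For example, you might pair CSS and XPath selectors like the ones below; the page structure (class names, element nesting) is hypothetical and would need to match your own site's HTML.

```typescript
// Example selector strings for a crawled page. The class names and
// structure are assumptions; inspect your own pages to find the
// right selectors.
const cssSelectors = {
  title: "h1.article-title",   // the page's main heading
  answer: "div.faq-answer p",  // paragraphs inside an answer block
};

// Equivalent XPath selectors for the same elements.
const xpathSelectors = {
  title: '//h1[@class="article-title"]',
  answer: '//div[@class="faq-answer"]/p',
};
```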
For Pull from API Connectors, you will specify a JMESPath expression to extract a specific element from your API response.
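To illustrate, suppose your API returns the (made-up) response below; the JMESPath expression `products[].name` would select the name of each product. JMESPath is evaluated by the Connector itself; here we mimic the same projection with plain array methods so you can see what it extracts.

```typescript
// A sample API response; the shape is invented for this example.
const apiResponse = {
  products: [
    { sku: "A1", name: "Desk Lamp", price: 29 },
    { sku: "B2", name: "Bookshelf", price: 120 },
  ],
};

// JMESPath expression you would enter in the Connector: products[].name
// Equivalent projection in plain TypeScript:
const selectedNames = apiResponse.products.map((p) => p.name);
// selectedNames is ["Desk Lamp", "Bookshelf"]
```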
That covers the ‘extract’ part of the process — the next unit will go into how to transform the data that you have extracted from the source before you load it into Yext.