Extract — Source, Source Settings & Selectors
What You’ll Learn
In this section, you will learn:
- An overview of the available Connector sources
- How to configure source settings to pull data from the correct endpoint
- How to specify selectors to pull in the relevant data components
The first step in creating a Connector is selecting your desired Connector Source. Then you will select the entities you want to create or update with this Connector.
The Data Connectors framework supports a variety of sources, including:
- Third-party apps like Zendesk and ServiceNow
- The Yext Crawler
- API sources (Pull from API and Push to API)
- Custom Functions
All of these data origins are referred to as sources.
Any App partner that can pull data into the Knowledge Graph is considered a Source in the Data Connectors framework. This includes apps like Google My Business, Shopify, Magento, Zendesk, and ServiceNow.
When you select one of these Data Sources from the Add Data screen, you will be prompted to OAuth into your account and grant Yext access to the relevant data.
The main step is to designate which account, or which dataset within the account, you want to pull from.
This will vary by app, for example:
- For the Google My Business App, you will enter the Location Group where you store the locations you want to sync into Yext.
- For the Zendesk App, you will enter your Zendesk subdomain.
- For Shopify, you will map the ID field and select the Product Status.
- For Magento, you will enter your Magento API Access Token, Host Name, and Product SKU Mapping.
Then, the locations, articles, or products will automatically begin loading in as entities in the Knowledge Graph. If you do not see entities populate into the Knowledge Graph, or you feel like the process is taking longer than it should, we recommend emailing Yext Support for assistance.
Because these are pre-built integrations, you will not need to manipulate the data to transform it into a Yext-ready format. In most cases, a pre-saved configuration will apply these transforms for you.
To leverage these methods, your data must already be stored in these third-party apps, which makes this a very easy way to pull that data into Yext.
The Crawler is a tool that helps scrape web pages for their HTML content. Once a crawl has successfully run on a set of web pages, the Add Data flow can help convert that raw HTML content into entities in the Knowledge Graph.
We will go into more details on how to set up the Crawler in the next unit.
This method is used to pull any relevant data from your website into the Knowledge Graph to either manage the content as entities in the Knowledge Graph, or to surface the data in Answers experiences.
We support two API Sources: Push to API and Pull from API.
Pull from API
This source allows you to pull data from any API and use the Connector to convert that data into entities without having to build and host an integration outside of Yext.
To do this, when you click Add Data you will select Pull from API and specify the details of the API such as the request URL, authentication method, and query parameters.
This method can be used to pull relevant data from any system that exposes an API, and is particularly suited to pulling data from a publicly accessible endpoint at a regular cadence.
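As a sketch of the details a Pull from API source asks for, the snippet below models the request URL, authentication method, and query parameters as a settings object and shows how they combine into the final GET request URL. The field names and endpoint are illustrative assumptions, not the platform's exact schema.

```typescript
// Hypothetical shape of Pull from API source settings (names are
// illustrative, not the exact fields shown in the Add Data flow).
interface PullFromApiSettings {
  requestUrl: string;                      // GET endpoint to pull from
  authMethod: "none" | "basic" | "bearer"; // how the request authenticates
  queryParams: Record<string, string>;     // appended to the request URL
}

// Build the full request URL from the settings, as the Connector would
// before issuing the GET request.
function buildRequestUrl(settings: PullFromApiSettings): string {
  const url = new URL(settings.requestUrl);
  for (const [key, value] of Object.entries(settings.queryParams)) {
    url.searchParams.set(key, value);
  }
  return url.toString();
}

const settings: PullFromApiSettings = {
  requestUrl: "https://api.example.com/v1/products", // assumed endpoint
  authMethod: "bearer",
  queryParams: { status: "active", limit: "100" },
};
```

Here `buildRequestUrl(settings)` yields `https://api.example.com/v1/products?status=active&limit=100`, which is the request the Connector would run on each pull.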
Push to API
This option allows you to push data to an endpoint either via a regular API call or a Webhook message.
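To make the push direction concrete, the sketch below assembles a webhook-style POST request carrying entity data. The endpoint URL and payload shape are assumptions for illustration, not the exact format the platform expects.

```typescript
// Illustrative payload for a Push to API source; the real payload
// shape is defined by the platform, this is just an example.
interface PushPayload {
  entities: Array<{ id: string; name: string }>;
}

// Assemble the POST request without sending it, so the shape is clear.
function buildPushRequest(endpoint: string, payload: PushPayload) {
  return {
    url: endpoint,
    method: "POST" as const,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  };
}

const request = buildPushRequest("https://example.com/push-endpoint", {
  entities: [{ id: "loc-1", name: "Downtown Store" }],
});
// A real integration would then send it, e.g. fetch(request.url, request).
```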
We will go into more details on how to leverage the API Connector Sources in the next unit.
We are also going to be adding more Data Connectors over time, and if there is a Data Connector that you’d like to see added — let us know in the Community!
This option allows you to write a fully custom TypeScript function that serves as the data source for your Connector.
To learn more about how to add a Function to your account and use it as a Connector data source, visit the Get Started with Functions guide.
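As a rough sketch of the idea, the function below returns the records the Connector would load. The exact export signature Yext expects is covered in the Get Started with Functions guide; here we simply assume a function returning an array of records, and the record fields are made up for the example.

```typescript
// Assumed record shape for this example; your fields will depend on
// the entity type you are creating.
interface SourceRecord {
  id: string;
  name: string;
  description: string;
}

// A custom data-source function. In practice this might call an
// external API or assemble data from several systems.
export function pullCustomData(): SourceRecord[] {
  return [
    { id: "faq-1", name: "Returns Policy", description: "30-day returns." },
    { id: "faq-2", name: "Shipping", description: "Ships in 2-3 days." },
  ];
}
```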
Once you select your source you will need to configure your desired source settings. This will look different depending on the source you choose.
For the Site Crawler, you will need to select the Crawler you want to extract data from, as well as the desired site extraction settings.
If you choose Pull from API as your source, you will need to enter the GET Request URL and desired Query Parameters from the API.
Once you have specified the details needed to pull the data in, you will then determine which specific pieces of data you want to pull in.
For the Site Crawler, you can use built-in selectors to pull in data like the page URL, or you can use CSS or XPath selectors to extract different elements from the site.
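For example, you might pair CSS and XPath selectors like the ones below; the page structure (class names, element nesting) is hypothetical and would need to match your own site's HTML.

```typescript
// Example selector strings for a crawled page. The class names and
// structure are assumptions; inspect your own pages to find the
// right selectors.
const cssSelectors = {
  title: "h1.article-title",   // the page's main heading
  answer: "div.faq-answer p",  // paragraphs inside an answer block
};

// Equivalent XPath selectors for the same elements.
const xpathSelectors = {
  title: '//h1[@class="article-title"]',
  answer: '//div[@class="faq-answer"]/p',
};
```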
For Pull from API Connectors, you will specify a JMESPath expression to extract a specific element from your API response.
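To illustrate, suppose your API returns the (made-up) response below; the JMESPath expression `products[].name` would select the name of each product. JMESPath is evaluated by the Connector itself; here we mimic the same projection with plain array methods so you can see what it extracts.

```typescript
// A sample API response; the shape is invented for this example.
const apiResponse = {
  products: [
    { sku: "A1", name: "Desk Lamp", price: 29 },
    { sku: "B2", name: "Bookshelf", price: 120 },
  ],
};

// JMESPath expression you would enter in the Connector: products[].name
// Equivalent projection in plain TypeScript:
const selectedNames = apiResponse.products.map((p) => p.name);
// selectedNames is ["Desk Lamp", "Bookshelf"]
```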
That covers the ‘extract’ part of the process — the next unit will go into how to transform the data that you have extracted from the source before you load it into Yext.