Step 2: Collect Data and Set Up Knowledge Graph

Now that you’ve determined your data strategy and scope, including what will actually be searched and displayed in your experience, you’ll need to collect that data and add it to the Yext platform.

As mentioned in the previous step, the quality of your data will impact the quality of your Search experience. When collecting data, you’ll want to make sure you consider your data sources:

Where will the data come from?
How can you pull that data and ingest it into Yext?
How can you keep that content up-to-date? Be careful not to mislead your users with stale data, and avoid introducing manual processes to maintain it.

Review the track for a more conceptual understanding of storing content in the Yext Knowledge Graph. In the context of Search, you will need to:

Set up the structure for your Knowledge Graph based on what data you want searched (and how), plus what data you want displayed in the frontend:
- If the relevant entity types already exist, enable them. If not, create a new entity type for each type of content.
- Create the relevant fields. Include the following custom fields:
  - “Active in Search” (Type = Yes/No)
    - Use this field in saved filter configurations in the Search backend to specify which entities can be returned in a vertical. For example, add a saved filter criterion for “Active in Search = Yes”. This way, you can store drafts of entities or remove entities from Search without removing them from the platform.
  - “Primary CTA” (Type = Call to Action)
    - Use the CTA fields to specify calls to action in the frontend on each entity result card.
  - “Secondary CTA” (Type = Call to Action)
  - Any other relevant fields
Ingest data into the Yext platform. The data should be provided by the organization the Search experience is for. The ways you can add data are:
- Build a data connector with an optional crawler.
  - If you are using a crawler, you’ll want to be sure you’re crawling a structured site to reduce manual adjustment, errors, and poor data, both in the initial crawl and when new content is introduced.
    - Can you identify the URLs to crawl for each vertical? If yes, are the URL structures comprehensive?
    - Can the pages be mapped using a unique data mapping? If not, can you map another page component (URL, H1, meta, breadcrumbs, etc.)?
  - Once you set up the crawler, be sure to set up the corresponding connector.
  - Make sure that the Yext Crawler can access your web pages by whitelisting the Crawler’s user agent and IP addresses.
  - Ensure the Connector maps to and collects all fields and metadata elements (e.g., Date Posted, Author, etc. for a document) necessary to power your Search experience.
- Use an existing app or build an integration with a third-party system where you already store your content.
- Upload a data file (CSV or XLSX).
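If you go the file upload route, it can help to generate the file from your source records so that every row has the same columns in the same order. The following is a minimal Python sketch of that idea, not a Yext-provided tool. The column names in `COLUMNS` and the sample record are assumptions for illustration; match them to the fields you actually defined in your Knowledge Graph.

```python
import csv
import io

# Hypothetical column names; replace them with the fields in your own Knowledge Graph.
COLUMNS = ["Entity ID", "Entity Type", "Name", "Website URL", "Active in Search"]


def entities_to_csv(entities, columns=COLUMNS):
    """Serialize entity dicts to CSV text with a fixed column order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for entity in entities:
        # Missing fields become empty cells instead of raising an error.
        writer.writerow({col: entity.get(col, "") for col in columns})
    return buffer.getvalue()


# Illustrative record only; real data should come from the organization's sources.
sample = [
    {
        "Entity ID": "faq-1",
        "Entity Type": "FAQ",
        "Name": "How do I reset my password?",
        "Active in Search": "Yes",
    },
]
csv_text = entities_to_csv(sample)
print(csv_text)
```

Leaving a missing value as an empty cell keeps the columns aligned, which makes gaps in the data easy to spot before you upload.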
Create saved filters to restrict entities in Search to only those that have been fully vetted and approved. We recommend creating a saved filter for each vertical by setting the following criteria:
- Active in Search = Yes
- Entity Type = the entity types you’re using for that vertical
- Fields with Data include Website URL or Primary Image (if you would like to ensure only entities with populated data for these fields appear in Search)
- Any other relevant criteria
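To make the logic of these criteria concrete, here is a small Python sketch that mimics what such a saved filter does. It is only a model of the behavior, not how the platform implements saved filters. The dictionary keys (`active_in_search`, `entity_type`, `website_url`) and the sample entities are hypothetical, and the sketch assumes every listed field must be populated.

```python
def matches_saved_filter(entity, allowed_types, required_fields=()):
    """Return True if the entity meets all of the saved filter criteria."""
    # Criterion 1: Active in Search = Yes
    if entity.get("active_in_search") is not True:
        return False
    # Criterion 2: Entity Type is one of the types used by this vertical
    if entity.get("entity_type") not in allowed_types:
        return False
    # Criterion 3 (assumed): every listed field must contain data
    return all(entity.get(field) not in (None, "", []) for field in required_fields)


# Hypothetical entities for an "FAQs" vertical.
entities = [
    {"id": "faq-1", "entity_type": "faq", "active_in_search": True,
     "website_url": "https://example.com/faq/1"},
    {"id": "faq-2", "entity_type": "faq", "active_in_search": False,
     "website_url": "https://example.com/faq/2"},
    {"id": "faq-3", "entity_type": "faq", "active_in_search": True},
    {"id": "loc-1", "entity_type": "location", "active_in_search": True,
     "website_url": "https://example.com/locations/1"},
]

visible = [
    e["id"] for e in entities
    if matches_saved_filter(e, allowed_types={"faq"}, required_fields=["website_url"])
]
print(visible)  # ['faq-1']
```

Here `faq-2` is held back as a draft, `faq-3` is excluded for missing a URL, and `loc-1` belongs to a different vertical, so only `faq-1` would be returned.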
Audit your data to ensure you have collected and added the relevant data into the Yext platform:
- Have you populated entities for each vertical?
- Is there structured information for any needed sorts, filters, facets, etc.?
- Is there structured content for frontend cards (e.g., images)?
- Have you added any information that will be needed for query rules/arbitrary business logic?
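Part of this audit can be scripted if you have an export of your entities. The sketch below assumes the export has already been loaded into a list of dictionaries. The field names, entity types, and the `required` mapping are all hypothetical placeholders for whatever your verticals actually need. It counts entities per type and lists the IDs that are missing a required field.

```python
def audit_entities(entities, required_by_type):
    """Count entities per type and list the IDs missing each required field."""
    report = {
        etype: {"count": 0, "missing": {field: [] for field in fields}}
        for etype, fields in required_by_type.items()
    }
    for entity in entities:
        etype = entity.get("entity_type")
        if etype not in report:
            continue  # Entity type is not used by any vertical in this audit.
        report[etype]["count"] += 1
        for field in required_by_type[etype]:
            if not entity.get(field):
                report[etype]["missing"][field].append(entity.get("id"))
    return report


# Hypothetical export and requirements, for illustration only.
entities = [
    {"id": "faq-1", "entity_type": "faq", "name": "Returns policy",
     "primary_image": "returns.png"},
    {"id": "faq-2", "entity_type": "faq", "name": "Shipping times"},
]
required = {"faq": ["name", "primary_image"], "event": ["name", "start_date"]}

report = audit_entities(entities, required)
for etype, result in report.items():
    print(etype, result["count"], result["missing"])
```

A count of zero for an entity type (here, `event`) flags a vertical with no populated entities, and a non-empty list under `missing` flags cards, sorts, or facets that would render with gaps.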
