Module Assessment | Yext Hitchhikers Platform


Background

The Turtlehead Tacos team has a help site to answer questions about online ordering and using their mobile app. Eventually, they want to surface answers to these support questions in a Search experience that they’ll build with Yext.

You know that in order for them to surface questions in a Yext Search experience, you’ll first need to store that data in the Knowledge Graph. The best way to do this is to create Help Article entities from their help site content.

In this challenge, you will:

  • Enable the Help Article entity type in the Knowledge Graph
  • Create a crawler to scrape the HTML data from their help site
  • Build a connector using the crawler as a data source to convert the crawled page data into Help Article entities

You will need to use these resources to help you complete the challenge:

  • Create a Crawler Connector guide — step-by-step instructions on how to create a crawler and build a connector with the crawler as a data source. Disregard Step 1 in the guide (whitelisting URLs); this doesn’t apply to challenge accounts. Don’t worry, the instructions in this challenge will also help you every step of the way!
  • The Turtlehead Tacos help site — these are the help articles you’ll be working with.

Your Challenge

  1. First, enable the Help Article entity type in the Knowledge Graph. Navigate to Knowledge Graph > Configuration and click the Entity Types tile. Find and enable the Help Article entity type.

  2. Create a crawler to extract the HTML data from the Turtlehead Tacos help site. Navigate to Knowledge Graph > Configuration and click the Crawlers tile (under the Data Ingestion and Processing section).

  3. Click New Crawler and fill out the crawler settings as shown below:

    • Crawler Name: Help Articles
    • Schedule: Once
    • Source Type: Domains
    • Crawl Strategy: Sub-Pages
    • File Types: This is up to you. The site only contains HTML files, so you can either leave this setting on “All File Types” or select to crawl only HTML files.
    • Pages or Domains to Crawl: https://help.turtleheadtacos.com (make sure not to add a trailing slash / at the end of the URL!)
  4. Leave all other settings on their default values. Click Save Crawler at the bottom of the screen.
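To build intuition for the Sub-Pages crawl strategy, here is an informal model in Python: starting from the URL you entered, the crawler only follows pages on the same domain whose path sits under the start path. This is a sketch for intuition only, not Yext’s actual crawler logic.

```python
# Informal model of the "Sub-Pages" crawl strategy (an assumption for
# illustration, not Yext's real implementation).
from urllib.parse import urlparse

def is_sub_page(start_url: str, candidate: str) -> bool:
    start, cand = urlparse(start_url), urlparse(candidate)
    # Same domain, and the candidate's path extends the start path.
    return cand.netloc == start.netloc and cand.path.startswith(start.path)

print(is_sub_page("https://help.turtleheadtacos.com",
                  "https://help.turtleheadtacos.com/how-to-order"))  # True
print(is_sub_page("https://help.turtleheadtacos.com",
                  "https://www.turtleheadtacos.com/menu"))           # False
```

This is also why the trailing slash matters: the start URL is compared textually, so keeping it clean avoids surprises in what counts as a sub-page.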

  5. You should see a crawl in progress. When completed, you should see that six pages have been crawled. This may take a moment, and you might need to refresh the page.

  6. Now, build a connector to pull in the crawled HTML data and use it to create Help Article entities in the Knowledge Graph. When your crawl is complete, navigate to Knowledge Graph > Connectors and click Add Connector in the upper right.

  7. Select Site Crawler as your data source.

  8. Set the following Crawler Settings:

    • Crawler: Help Articles (the Crawler you set up in Steps 2-4)
    • File Type: HTML
    • URLs: Choose Specific URLs or URL Patterns and enter this URL pattern: https://help.turtleheadtacos.com/* (this pulls in all of the individual help articles, but not the help site homepage).
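If you’re wondering why this pattern excludes the homepage, here is a quick check using Python’s `fnmatch`, assuming the pattern behaves like a simple glob (Yext’s exact matching rules may differ):

```python
# Sketch of glob-style URL pattern matching (an assumption; Yext's
# matcher may differ in details).
from fnmatch import fnmatch

pattern = "https://help.turtleheadtacos.com/*"

# The homepage has no path after the domain, so "/*" does not match it.
print(fnmatch("https://help.turtleheadtacos.com", pattern))                # False
# Each individual article URL has a path segment, so it matches.
print(fnmatch("https://help.turtleheadtacos.com/how-to-order", pattern))   # True
```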
  9. Click Continue.

  10. Set the Page Type to Detail Page (because each crawled page should be created as one entity). Click Continue.

  11. On the Selectors step, click Add Default Selectors. You’ll see the Page ID, Page URL, and Page Title data pulled in.

  12. Click Add Selector at the top of the screen and add two new selectors: one to pull in the body content of each help article, and one to pull in the content tags at the bottom of each article (use this article as an example).

    • First selector:
      • Header: “Body”
      • Specified Path: select Cleaned Body Content
    • Second selector:
      • Header: “Tags”
      • Specified Path: select CSS. In the text box, enter the proper CSS selector to pull in the content tags. See below for more on how to find it.
      • Extract Settings: Text
  13. To find the CSS selector, right-click the page, choose Inspect, and locate the element that contains the list of content tags.

    • You can also use jsoup to test candidate selectors: copy and paste the page’s HTML from the Inspect window in your browser.
    • If you can’t figure it out, the CSS selector to use is: .lp-param-uu5aBi_cjR-textList > div > li
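To see what a selector like `.lp-param-uu5aBi_cjR-textList > div > li` actually matches (`<li>` elements inside a `<div>` that is a direct child of an element with that class), here is a minimal stdlib-only Python sketch. The sample HTML is hypothetical, not the real page markup:

```python
# Hand-rolled illustration of what the CSS selector
# ".lp-param-uu5aBi_cjR-textList > div > li" selects.
from html.parser import HTMLParser

class TagListExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.stack = []          # (tag, classes) for each open element
        self.capturing = False   # currently inside a matching <li>?
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # Match li > div > .lp-param-uu5aBi_cjR-textList, inside out.
        if (tag == "li" and len(self.stack) >= 2
                and self.stack[-1][0] == "div"
                and "lp-param-uu5aBi_cjR-textList" in self.stack[-2][1]):
            self.capturing = True
            self.tags.append("")
        self.stack.append((tag, dict(attrs).get("class", "").split()))

    def handle_endtag(self, tag):
        if self.stack:
            self.stack.pop()
        if tag == "li":
            self.capturing = False

    def handle_data(self, data):
        if self.capturing:
            self.tags[-1] += data.strip()

# Hypothetical markup shaped like the help site's tag list.
html = """
<div class="lp-param-uu5aBi_cjR-textList">
  <div><li>Online Ordering</li><li>Mobile App</li></div>
</div>
"""
parser = TagListExtractor()
parser.feed(html)
print(parser.tags)  # ['Online Ordering', 'Mobile App']
```

With Extract Settings set to Text, the connector similarly keeps only the text content of each matched `<li>`.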
  14. Select Knowledge Graph as the destination. Click Continue.

  15. Select Help Article as the Entity Type. Click Continue.

  16. In the Map Fields step, map the data columns to fields on the Help Article entity type. In the Column Header column, you’ll see the selectors you added. In the Sample Data column, you’ll see a preview of the data the connector pulled in for that column. In the Map to Field column, select the appropriate entity fields to map each column as shown below:

| Column Header | Map to Field |
| --- | --- |
| Page ID | Entity ID |
| Page URL | Landing Page URL |
| Page Title | Name |
| Body | Body (with Markdown subfield) |
| Tags | Keywords (mapped to an entire list with a comma `,` as the delimiter) |
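The “entire list, comma delimiter” option for Tags splits the crawled string into a list of keywords. Sketched in plain Python, with a hypothetical sample value:

```python
# Splitting a comma-delimited Tags string into a keywords list,
# as the connector's list mapping does (sample value is illustrative).
tags = "Online Ordering, Mobile App"
keywords = [t.strip() for t in tags.split(",")]
print(keywords)  # ['Online Ordering', 'Mobile App']
```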
  17. Click Save at the bottom of the page. Enter the following Name and ID for the connector:

    • Name: Help Articles
    • ID: helpArticles
  18. Click Save & Run Now to pull the entities into your account. Run in Default Mode.

  19. Monitor your run to make sure it is successful. You should see five new Help Article entities in the Knowledge Graph.

Report Card

  • Enable the Help Article entity type
  • Create a crawler for help.turtleheadtacos.com
  • Create a crawler connector for help articles built on top of your web crawler
  • Run your connector to pull in the help articles