How to Extract Images from a webpage with the Answer Crawler

Justin_Liles · February 4, 2022, 9:01pm

Is there a way to bring in images from a URL crawl? I’m able to bring in the page ID, URL, title, and cleaned body content, but not the images.
Example page: Example page here ← I’m looking to classify the images and bring them in through the crawler selector.

DJ_Corbett · February 18, 2022, 5:03pm

Hello Justin,

Yes, you can bring in images using a crawler-based connector! Below is an example of how you could extract all of the images from the example page you provided and put them in a Photo Gallery field.

On your example page, I noticed that all of the images within the article follow a very similar structure when it comes to the underlying HTML and class names. Based on that example page, I set up a selector in my crawler that targets all of the images, which looks like this:

That CSS path targets the images by starting at each list item’s <li> tag (all of the numbered steps are list items), finding the steps that contain a <div> tag with the class names “intercom-container” and “intercome-align-center”, and then extracting the image URLs from the <img> tags within those containers.
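
If you want to sanity-check that targeting logic outside of the crawler, here is a minimal Python sketch of the same idea using only the standard library. It is not the crawler's own implementation, just an illustration of a descendant match roughly equivalent to `li div.intercom-container.intercome-align-center img`, which is my reconstruction from the description above rather than the exact selector. The sample HTML, the example.com URLs, and the names `StepImageParser` and `extract_step_images` are all made up for illustration, and the class names are copied as written in this post, so confirm the exact spelling against the page's HTML.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they are not pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class StepImageParser(HTMLParser):
    """Collects <img> src values found inside <li> ... <div class=...> containers."""

    def __init__(self, container_classes):
        super().__init__()
        self.container_classes = set(container_classes)
        self.stack = []        # (tag, classes) for every currently open element
        self.image_urls = []

    def _inside_target(self):
        # True when some open <div> carrying all container classes sits inside an open <li>.
        seen_li = False
        for tag, classes in self.stack:
            if tag == "li":
                seen_li = True
            elif seen_li and tag == "div" and self.container_classes <= classes:
                return True
        return False

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "img":
            src = attr_map.get("src")
            if src and self._inside_target():
                self.image_urls.append(src)
        if tag not in VOID_TAGS:
            classes = set((attr_map.get("class") or "").split())
            self.stack.append((tag, classes))

    def handle_endtag(self, tag):
        # Pop back to the most recent matching open tag; ignore stray closing tags.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break


def extract_step_images(html,
                        container_classes=("intercom-container",
                                           "intercome-align-center")):
    # Class names are assumptions taken from the post; adjust to match the real page.
    parser = StepImageParser(container_classes)
    parser.feed(html)
    parser.close()
    return parser.image_urls


# Hypothetical markup standing in for the example article.
SAMPLE_HTML = """
<ol>
  <li><p>Open the settings page.</p>
      <div class="intercom-container intercome-align-center">
        <img src="https://example.com/step-1.png"></div></li>
  <li><div class="intercom-container intercome-align-center">
        <img src="https://example.com/step-2.png"></div></li>
</ol>
<div class="intercom-container intercome-align-center">
  <img src="https://example.com/not-a-step.png"></div>
<img src="https://example.com/logo.png">
"""

if __name__ == "__main__":
    for url in extract_step_images(SAMPLE_HTML):
        print(url)
```

Only the two images that sit inside a list item's container are returned; the image in a container outside the list and the standalone logo are skipped, which is the same narrowing the selector is doing in the crawler.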

Then, I can map that list of image URLs to my Article Gallery field like so:

After running the connector, my Article Gallery field now has all of the images from the article!

Please let me know if you have any questions about this!
DJ

Justin_Liles · February 19, 2022, 12:15am

Thank you for the detailed workflow on how to accomplish this. Exactly what I needed.
