How to Extract Images from a Webpage with the Answer Crawler

Is there a way to bring in images from a URL crawl? I’m able to bring in the page ID, URL, title, and cleaned body content, but not the images.
Example page: Example page here ← Looking to classify the images and bring them in through the crawler selector.

Hello Justin,

Yes, you can bring in images using a crawler-based connector! Below is an example of how you could extract all of the images from the example page you provided and put them in a Photo Gallery field.

On your example page, I noticed that all of the images within the article follow a very similar structure when it comes to the underlying HTML and class names. Based on that example page, I set up a selector in my crawler that targets all of the images, which looks like this:

That CSS path targets the images by looking at each list item’s <li> tag (all of the numbered steps are list items), finding the steps that contain a <div> tag with the class names “intercom-container” and “intercom-align-center”, and then extracting the image URLs from the <img> tags within those containers.
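
If you want to check that matching logic against a page before configuring the crawler, here is a rough sketch in Python. It is not the crawler's own configuration. It assumes the path described above is roughly equivalent to the CSS selector li div.intercom-container.intercom-align-center img. The sample HTML, the example.com URLs, and the names ImageUrlExtractor and extract_image_urls are made up for illustration.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they are not tracked as open ancestors.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ImageUrlExtractor(HTMLParser):
    """Collects the src of every <img> nested inside
    li div.intercom-container.intercom-align-center (assumed selector)."""

    REQUIRED_CLASSES = {"intercom-container", "intercom-align-center"}

    def __init__(self):
        super().__init__()
        self.stack = []       # currently open elements as (tag, set of class names)
        self.image_urls = []  # matched image URLs, in document order

    def _inside_target(self):
        # True when an <li> ancestor contains a <div> ancestor carrying both classes.
        seen_li = False
        for tag, classes in self.stack:
            if tag == "li":
                seen_li = True
            elif seen_li and tag == "div" and self.REQUIRED_CLASSES <= classes:
                return True
        return False

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "img":
            src = attr_map.get("src")
            if src and self._inside_target():
                self.image_urls.append(src)
        if tag not in VOID_TAGS:
            classes = set((attr_map.get("class") or "").split())
            self.stack.append((tag, classes))

    def handle_endtag(self, tag):
        # Pop back to the most recent matching open tag, tolerating unclosed elements.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break


def extract_image_urls(html):
    parser = ImageUrlExtractor()
    parser.feed(html)
    parser.close()
    return parser.image_urls


# Hypothetical markup shaped like the numbered steps described above.
SAMPLE_HTML = """
<ol>
  <li>
    <p>Open the settings page.</p>
    <div class="intercom-container intercom-align-center">
      <img src="https://example.com/images/step-1.png">
    </div>
  </li>
  <li>
    <p>Click Save.</p>
    <div class="intercom-container intercom-align-center">
      <img src="https://example.com/images/step-2.png">
    </div>
  </li>
</ol>
<div class="footer"><img src="https://example.com/images/logo.png"></div>
"""

if __name__ == "__main__":
    print(extract_image_urls(SAMPLE_HTML))
```

The footer logo is skipped because it is not inside a list item's image container, which is the same reason the selector only picks up the step images.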

Then, I can map that list of image URLs to my Article Gallery field like so:

After running the connector, my Article Gallery field now has all of the images from the article!

Please let me know if you have any questions about this!
DJ

Thank you for the detailed workflow on how to accomplish this. Exactly what I needed.