Site Crawler - Crawl from Site Map

Hi Team,

I am helping a client configure a crawler. They would like their sitemap (e.g., www.brand.com/sitemap.xml) to be the source of truth for which URLs get crawled, rather than their home page (e.g., www.brand.com), because a crawl that starts from the home page pulls in pages that they do not want returned in their search experience.

Is there a way to configure a crawler to only crawl URLs from a Site Map?

Thanks!

Juan

Hi Juan,

Unfortunately, we do not yet have Sitemap support within our Crawler. We hope to add this feature in the near future.

Best,
Rachel

Any update on this, Rachel? We have a customer who would like to take advantage of this.

Hi @Christopher_Sulham

You can configure the Crawler to ignore or exclude the URLs you do not want crawled. The platform provides these options when you add or edit a Crawler.

For details, see Create a Crawler | Hitchhikers

Hope this helps.
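
If it helps to work out which URLs to exclude, here is a minimal Python sketch that compares a list of crawled URLs against the sitemap and lists the ones that are not in it. This is not a Crawler feature: the function names, the sample URLs, and the idea of feeding the result into your exclusion settings are all assumptions for illustration, and it only handles a plain `urlset` sitemap (not a sitemap index).

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol for <urlset> documents.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return the set of <loc> URLs listed in a plain urlset sitemap."""
    root = ET.fromstring(xml_text)
    return {
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    }


def urls_to_exclude(crawled_urls, xml_text):
    """Return crawled URLs that are absent from the sitemap, sorted."""
    allowed = sitemap_urls(xml_text)
    return sorted(url for url in crawled_urls if url not in allowed)


# Hypothetical sitemap content, standing in for www.brand.com/sitemap.xml.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.brand.com/</loc></url>
  <url><loc>https://www.brand.com/products</loc></url>
</urlset>"""

# Hypothetical URLs that a home-page crawl might have picked up.
CRAWLED = [
    "https://www.brand.com/products",
    "https://www.brand.com/internal/login",
]

if __name__ == "__main__":
    for url in urls_to_exclude(CRAWLED, SAMPLE_SITEMAP):
        print(url)
```

The comparison is an exact string match, so differences such as a trailing slash or `http` vs `https` would count as different URLs; you may want to normalize both lists first.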

Hi @Christopher_Sulham,

This is on our roadmap, but has not been released yet. Follow updates on this Ideas board request:

You can upvote the idea to show interest, which will allow you to get notifications when there are updates on it.