I am helping a client configure a crawler. They would like the source of truth for which URLs get crawled to be their sitemap (e.g., www.brand.com/sitemap.xml) rather than their home page (e.g., www.brand.com), because a crawl that starts from the home page pulls in pages that they do not want returned in their search experience.
Is there a way to configure a crawler to crawl only the URLs listed in a sitemap?
You can configure the crawler to ignore/exclude the URLs you do not want crawled. The platform gives you control over this through the options available when you add or edit a crawler.
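If it helps to work out which URLs to put in the exclusion list, here is a minimal sketch that compares a list of discovered URLs against the sitemap. It is plain Python and does not use the platform's API. The function names (`sitemap_urls`, `split_by_sitemap`) and the sample URLs are made up for illustration, and it assumes a standard `<urlset>` sitemap rather than a sitemap index file.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, List, Set, Tuple

# Namespace defined by the sitemaps.org protocol for <urlset> documents.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(sitemap_xml: str) -> Set[str]:
    """Return the set of <loc> URLs listed in a <urlset> sitemap."""
    root = ET.fromstring(sitemap_xml)
    return {
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    }


def split_by_sitemap(
    discovered: Iterable[str], allowed: Set[str]
) -> Tuple[List[str], List[str]]:
    """Split discovered URLs into (keep, exclude) based on sitemap membership."""
    keep: List[str] = []
    exclude: List[str] = []
    for url in discovered:
        (keep if url in allowed else exclude).append(url)
    return keep, exclude


# Hypothetical sample data; in practice you would download the client's sitemap.xml.
EXAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.brand.com/</loc></url>
  <url><loc>https://www.brand.com/products</loc></url>
</urlset>"""

EXAMPLE_DISCOVERED = [
    "https://www.brand.com/",
    "https://www.brand.com/products",
    "https://www.brand.com/internal/login",
]

if __name__ == "__main__":
    allowed = sitemap_urls(EXAMPLE_SITEMAP)
    keep, exclude = split_by_sitemap(EXAMPLE_DISCOVERED, allowed)
    print("keep:", keep)
    print("exclude:", exclude)
```

The `exclude` list is what you would then enter as exclusions in the crawler settings. The comparison is an exact string match, so differences such as trailing slashes or http vs. https would need to be normalized first.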