I’m trying to crawl a website that has 2 language versions.
The default version, in English, has no language prefix in its paths, while the Spanish version lives entirely under the /es/ path:
https://mydomain.com/products/whatever (English)
https://mydomain.com/es/productos/whatever (Spanish)
Since I want to create multi-language profiles, I’m trying to set up a separate Crawler for each language version of the site. I see that there’s a Blacklisted URLs option when creating the Crawler. However, the options I’ve tried:
- Whole domain with slug and wildcard: https://mydomain.com/es/*
- Regexp of relative path
Didn’t work for me.
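To make the two attempts concrete, here is a small Python sketch of what I would expect each kind of pattern to match. The patterns and function names are my own illustration (hypothetical, not taken from the Crawler), and I don’t know whether the field matches against the full URL or only the path, which is really my question.

```python
import re
from urllib.parse import urlparse

# Hypothetical patterns for illustration only; the syntax the Crawler's
# Blacklisted URLs field actually expects is unknown to me.
FULL_URL_PATTERN = re.compile(r"^https://mydomain\.com/es/.*$")
RELATIVE_PATH_PATTERN = re.compile(r"^/es(/.*)?$")

URLS = [
    "https://mydomain.com/products/whatever",      # English
    "https://mydomain.com/es/productos/whatever",  # Spanish
]


def blocked_by_full_url(url: str) -> bool:
    """True if the whole URL matches the full-URL pattern."""
    return FULL_URL_PATTERN.match(url) is not None


def blocked_by_relative_path(url: str) -> bool:
    """True if only the path component matches the relative-path pattern."""
    return RELATIVE_PATH_PATTERN.match(urlparse(url).path) is not None


if __name__ == "__main__":
    for url in URLS:
        print(url, blocked_by_full_url(url), blocked_by_relative_path(url))
```

In this sketch both patterns block only the Spanish URL, and the relative-path one is written so that a path such as /especial would not be caught by accident.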
It’s a bit confusing because the tooltip (which says a regex should be used) doesn’t match the placeholder (which shows a full URL).
Could you please let me know exactly what I should enter in the Blacklisted URLs field to filter the pages that include the path /es?