Crawler Connector: How to retrieve attribute value in Entity Container?

Hi Team!

I am crawling a page with the following markup:

<a class="item" href="sample-1.html">
    <img src="assets/a.jpg" alt>
</a>
<a class="item" href="sample-2.html">
    <img src="assets/b.jpg" alt>
</a>
<a class="item" href="sample-3.html">
    <img src="assets/c.jpg" alt>
</a>

In this case, in the crawler connector settings, I selected “List Page” and set “a.item” as the Entity Container.

I can retrieve the image URL under each anchor tag.
How can I get an attribute value of the anchor tag itself, such as sample-1.html from the href?

Thank you!
-YY

Hey YY,

Once you get to the “Specify Selectors” page, you can get that href attribute value by using a.item as the CSS Path, and choosing “URL” in the Extract Settings.
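
To make the expected result concrete, here is a rough Python sketch of what that extraction should yield for the markup above. It is not how the connector works internally; it only uses Python's standard html.parser, and the names ItemParser and extract_items are made up for this illustration. The point is that the href sits on the container element (a.item) itself, while the image src sits on a child of it.

```python
from html.parser import HTMLParser

# Markup copied from the question above.
PAGE = """
<a class="item" href="sample-1.html">
    <img src="assets/a.jpg" alt>
</a>
<a class="item" href="sample-2.html">
    <img src="assets/b.jpg" alt>
</a>
<a class="item" href="sample-3.html">
    <img src="assets/c.jpg" alt>
</a>
"""


class ItemParser(HTMLParser):
    """Hypothetical helper: collects one record per <a class="item">."""

    def __init__(self):
        super().__init__()
        self.items = []       # finished records, in document order
        self._current = None  # record for the container being parsed

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "a" and "item" in classes:
            # The href lives on the container element itself.
            self._current = {"href": attrs.get("href"), "image": None}
        elif tag == "img" and self._current is not None:
            # The image URL lives on a child of the container.
            self._current["image"] = attrs.get("src")

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self.items.append(self._current)
            self._current = None


def extract_items(html):
    """Return a list of {"href": ..., "image": ...} dicts, one per container."""
    parser = ItemParser()
    parser.feed(html)
    parser.close()
    return parser.items


items = extract_items(PAGE)
for item in items:
    print(item["href"], item["image"])
```

Each entity should end up with both values, for example sample-1.html together with assets/a.jpg.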

Best,
DJ

Thank you, @DJ_Corbett !
I tried it, but it doesn’t work. Inner items under a.item can be retrieved, but none of the values of a.item itself can be pulled. I think I can prepare a sample page; I will update you later.

-YY