XPath Selector - Selecting Only Text in an Element and Not Text in Child Elements

Hello!
I’m setting up a data connector from a crawler and I’m struggling to use the XPath selector to isolate a bit of text. After much trial and error, I can’t seem to get only the text “wantThis”, because the XPath selector always includes “dontWantThis1” and “dontWantThis2” as well. Is there any way to select only “wantThis”?
The XPath selector I’m currently using is: //ul[@class="ingredients-list"]/li/label/span[@class="ingredient-product-wrap"]

<label>
<input type="checkbox" id="ingredient-60da21613d856" class="fa ingredient-checkbox">
<span class="ingredient-product-wrap" itemprop="recipeIngredient">
<span class="imperial">dontWantThis1 </span>
<span class="metric hidden">dontWantThis2 </span>
wantThis</span>
</label>

Thanks!

I actually figured this out. I used the XPath below.

//ul/li/label/span/text()[last()]

This selects only the last text node directly inside the parent span, and not the text in the child spans.
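
For anyone wondering why the [last()] is needed, here is a rough Python sketch using only the standard library. ElementTree's XPath subset does not support text(), so this emulates text()[last()] by hand rather than running the XPath itself. The markup is the snippet from the question with the input tag self-closed so that it parses as XML, and direct_text_nodes is just a helper name made up for this illustration.

```python
import xml.etree.ElementTree as ET

# The markup from the question, with <input> self-closed so it parses as XML.
SNIPPET = """<label>
<input type="checkbox" id="ingredient-60da21613d856" class="fa ingredient-checkbox"/>
<span class="ingredient-product-wrap" itemprop="recipeIngredient">
<span class="imperial">dontWantThis1 </span>
<span class="metric hidden">dontWantThis2 </span>
wantThis</span>
</label>"""


def direct_text_nodes(element):
    """Hypothetical helper: the text nodes that are direct children of element."""
    nodes = []
    if element.text:            # text before the first child element
        nodes.append(element.text)
    for child in element:
        if child.tail:          # text that follows each child element
            nodes.append(child.tail)
    return nodes


label = ET.fromstring(SNIPPET)
wrap = label.find("span")       # the outer ingredient-product-wrap span
nodes = direct_text_nodes(wrap)

# Rough equivalent of text()[last()]: keep only the final direct text node.
last_text = nodes[-1].strip()

print(nodes)
print(last_text)
```

The outer span has more than one direct text node, because the line breaks around the two child spans count as text nodes too. That is why a plain text() would return whitespace-only nodes as well, and [last()] narrows it down to the one holding "wantThis".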

Hi Enoch,

Awesome! Yes, that would definitely work! Impressive use of XPath.

You could also consider using the “Direct Text” Extract Setting instead of the default “Text” setting.

The difference is that the “Text” setting will extract all text contained within the element you specify, including the text of any of that element’s children (e.g. dontWantThis1, dontWantThis2 in addition to wantThis), whereas the “Direct Text” setting will only extract the text directly within the element you specified and not its children (e.g. just wantThis).
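
As a rough illustration of that difference (this is not how the crawler implements either setting, just a sketch with Python's standard library), the two behaviours can be approximated as below. The whitespace clean-up and the variable names are assumptions made for the example, and the markup is the span from the question.

```python
import xml.etree.ElementTree as ET

# The outer span from the question; it is already well-formed XML on its own.
wrap = ET.fromstring(
    '<span class="ingredient-product-wrap" itemprop="recipeIngredient">\n'
    '<span class="imperial">dontWantThis1 </span>\n'
    '<span class="metric hidden">dontWantThis2 </span>\n'
    'wantThis</span>'
)

# "Text"-like behaviour: every text node under the element, children included.
# Whitespace is collapsed here (an assumption made for readability).
all_text = " ".join("".join(wrap.itertext()).split())

# "Direct Text"-like behaviour: only the element's own text nodes, i.e. the
# text before the first child plus the text that follows each child.
direct_parts = [wrap.text or ""] + [child.tail or "" for child in wrap]
direct_text = " ".join("".join(direct_parts).split())

print(all_text)
print(direct_text)
```

With the markup from the question, all_text contains all three words, while direct_text is just "wantThis".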

Hope that helps!
Jamie

Hi Jamie,

That is a great solution! Thanks so much for making me aware of this!

Enoch