I’m trying to pull provider information for a new entity into the KG from a website via the crawler + data connector. I’ve set up most of the data connectors I need, but I’m struggling to get the address data set up correctly. When I use a CSS or XPath selector, the city, state, and ZIP code get pulled in as a single text string (“City State Zip”), and the data connector can’t parse it into the respective fields on the entity. How do I pull in just a single piece of text (one line within a span) with CSS or XPath?
URL with the address info I’m trying to pull: https://www.cityofhope.org/farah-abdulla
Address line I’m working on: Duarte, CA 91010
CSS selector I’ve tried:
#tab0-0 > div > div.bio_location.col > div > div > div.loc-item-content > div.loc-item-address > span:nth-child(5)
XPath selector I’ve tried (isolated to just “Duarte”):
//*[@id="tab0-0"]/div/div[3]/div/div/div[2]/div[2]/span[4]/text()[1]
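In case it helps clarify what I’m after, here is a minimal Python sketch of the split I’d like to end up with for the address line above. This is not connector code: the regex, the function name, and the field names are my own assumptions for illustration, and it assumes the line always follows a “City, ST ZIP” pattern.

```python
import re
from typing import Dict, Optional

# Assumed format: "City, ST 12345" or "City, ST 12345-6789".
# The pattern and the group names are illustrative, not part of any connector API.
ADDRESS_LINE = re.compile(
    r"^\s*(?P<city>[^,]+?)\s*,\s*(?P<state>[A-Z]{2})\s+(?P<zip>\d{5}(?:-\d{4})?)\s*$"
)


def split_city_state_zip(line: str) -> Optional[Dict[str, str]]:
    """Split a 'City, ST ZIP' string into its parts, or return None if it doesn't match."""
    match = ADDRESS_LINE.match(line)
    return match.groupdict() if match else None


print(split_city_state_zip("Duarte, CA 91010"))
# {'city': 'Duarte', 'state': 'CA', 'zip': '91010'}
```

So the goal is either a selector that returns each of those three pieces separately, or some way to apply a split like this after the selector runs.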
Another detail worth noting: the address field is required in order to add this entity type (HC professional) to this KG. So unless this connector is set up correctly, I can’t use the crawler + data connector flow for the use case I’m trying to solve.
Thank you!