Importing from 2 sources, using one as the primary and the other as a secondary

Hello all!

I am looking to do the following.
I have 2 content sources, source alpha and source beta; the data input fields are the same:

{
  id: '40684733',
  phone_number: 'xxx-xxx-xxxx',
  name: 'Bob Newhart'
}

All of the records are in source alpha, but not all are in source beta. However, beta has more up-to-date info, so what I want is that after the daily import, the Yext record's phone number field should come from source beta if it exists there, but fall back to alpha if it does not.

My idea was to do it with 3 connectors:

  1. Import from source alpha: imports the phone_number into a Yext field ce_alpha_phonenumber

  2. Import from source beta: imports the phone_number into a Yext field ce_beta_phonenumber

  3. Phone number cleanup: a Yext-to-Yext connector that goes from the Yext entity back to the Yext entity, importing ce_beta_phonenumber into the ce_phone_number field if it is not blank, and ce_alpha_phonenumber if it is
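The fallback in step 3 is essentially a coalesce. Here is a minimal sketch of that logic (the interface and function names are illustrative, not Yext APIs, and it assumes simple string fields):

```typescript
// Illustrative sketch of the step-3 fallback: prefer beta's phone
// number when it is non-blank, otherwise fall back to alpha's.
interface StagedEntity {
  ce_alpha_phonenumber?: string;
  ce_beta_phonenumber?: string;
}

function resolvePhoneNumber(entity: StagedEntity): string | undefined {
  const beta = entity.ce_beta_phonenumber?.trim();
  if (beta) return beta; // beta is non-blank, so it wins
  const alpha = entity.ce_alpha_phonenumber?.trim();
  return alpha || undefined; // fall back to alpha, or nothing at all
}
```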

Have others solved this problem in a better or different manner?

Hello @Erik_Summerfield

Have you considered using 2 connectors, where the 2nd connector runs in Default mode?

  • New entities present in the run will be created.
  • Existing entities will be updated (or remain unchanged).
  • No entities will be deleted.

This works for Connectors that expect to ingest new data and update existing data, without the assumption that missing data should be deleted.

So, with the first connector, the data with the phone number will be added. And when the 2nd connector runs, it will either update the data or add new data.
With this, you can reduce the number of connectors by one.
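As a sanity check, the two-connector run order can be modeled as plain upserts (an assumption for illustration; real connector runs do more than this):

```typescript
// Toy model of Default-mode runs: each run creates or updates
// entities but never deletes, so alpha seeds everything and beta,
// running second, overwrites only the entities it actually has.
type PhoneStore = Map<string, string>;

function runUpsert(store: PhoneStore, rows: Array<{ id: string; phone: string }>): void {
  for (const { id, phone } of rows) {
    store.set(id, phone); // create or update; rows missing from this run are left alone
  }
}

const store: PhoneStore = new Map();
runUpsert(store, [
  { id: "1", phone: "alpha-1" },
  { id: "2", phone: "alpha-2" },
]); // alpha runs first, seeding every entity
runUpsert(store, [{ id: "1", phone: "beta-1" }]); // beta only knows entity 1
```

After both runs, entity 1 carries beta's value and entity 2 keeps alpha's, with nothing deleted.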

What do you think?

Hi Erik,

As Tosh_B said, this can be accomplished without the additional connector! However, adding onto the recommendation above, you would also need some logic in the Source A connector to ensure that there is not a constant update loop of “Source A Data” → “Source B Data” → “Source A Data” → “Source B Data” etc… on each run.

The way you can do this is using logic in Source A.
Let’s assume that this is only an issue for existing entities, since for new entities, initially adding Source A data should be fine.
We can first add a “Check Entity Existence” Transform so that we have a T/F value indicating whether the entity exists.

Next, we can add a Find and Replace transform to replace the string with blank, clearing the column, with a CONDITION that the Entity Exists value = true.
We will then make sure that when mapping the column to a field, ClearIfBlank is false (which it is by default). That way, A will not keep overriding the value.
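Put together, the conditional clear amounts to something like this sketch (the function name is illustrative; in the connector itself this is just the Find and Replace transform with a condition):

```typescript
// If the entity already exists in Yext, blank out Source A's value.
// With ClearIfBlank left false, the blank is then ignored at mapping
// time and the existing Yext value is preserved.
function sourceAPhoneForMapping(phone: string, entityExists: boolean): string {
  return entityExists ? "" : phone; // blank means "leave the Yext value alone"
}
```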

The downfall of this approach, however, is the scenario where there is no source data in B, BUT the Source A data for that field updates. So if there are any changes that might be made in A that need to be respected, we would probably need to rethink how this works, since we could find ourselves in another loop of updates depending on how we implement a solution.

Any additional context on what fields they are and when they might be updated can help us guide you to a solution! Let us know if you have further questions, as well.

Best,
Rachel

@Rachel_Adler,
I do think we will have the issue where there are items in source A that are not in source B that will get updated, and we will want the fields in the Yext record to update.

Do you see a way to make sure that occurs?

Yes! OK, so maybe for each field in Source B, you could have a separate field that says “From Source B”, and the value can be set to True (using an Add Column transform) with the condition that the value in source B is not blank.

Then, when you run Source A, add a transform to clear the column value if “From Source B” = true.
The only nuance here is that you would need to use a function transform to get the entity data from the Entities API for that field (since it’s not in the source). That should be a pretty easy function to write, though, and we have an example here!
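A rough sketch of both pieces, under loud assumptions: the custom flag field name (c_fromSourceB), the Entities API URL shape, and the response layout are all hypothetical and should be checked against the Yext docs before use:

```typescript
// Piece 1 (Source B connector): set the "From Source B" flag when
// beta actually supplies a non-blank value for the field.
function fromSourceBFlag(betaValue: string | undefined): boolean {
  return Boolean(betaValue && betaValue.trim() !== "");
}

// Piece 2 (Source A connector, function transform): clear A's value
// whenever the entity's flag says B owns the field, so that with
// ClearIfBlank = false the mapping leaves the Yext value untouched.
function sourceAValueForMapping(alphaValue: string, flagFromEntity: boolean | undefined): string {
  return flagFromEntity === true ? "" : alphaValue;
}

// Fetching the flag from the Entities API might look roughly like this.
// The URL shape, the "v" version parameter value, and the response path
// are assumptions, not confirmed details; c_fromSourceB is a hypothetical field.
async function fetchFromSourceBFlag(entityId: string, apiKey: string): Promise<boolean | undefined> {
  const url = `https://api.yext.com/v2/accounts/me/entities/${entityId}?api_key=${apiKey}&v=20240101`;
  const resp = await fetch(url);
  if (!resp.ok) return undefined; // treat lookup failures as "flag unknown"
  const body = await resp.json();
  return body?.response?.c_fromSourceB;
}
```

The per-row lookup keeps the two connectors independent, at the cost of one API call per row during the Source A run.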

@Rachel_Adler

Is there any concern about a performance hit here, since this is going to happen on each row? Or is that query fast enough that I should not worry about it?

This should be fine! I would not worry about performance.

@Rachel_Adler
Please note that your example does not use the correct API URL.

@Rachel_Adler Also
We now have it set up to run, but each dry run is stalling at about 69%. Any idea why this might be? I cannot get any errors so far, and even when I cancel it, it still does not show any errors.

After a few hours it did move on, and I am starting to see results now. But this does seem like a very long run; before I introduced the function, it ran very quickly.