Pages Data Dictionary | Yext Hitchhikers Platform

Use these views to track the performance of every build across all of your Yext-owned sites. You can see which builds were successful and which ones weren’t, a log of generation function invocations, metadata for all of your sites, and more.

pages data sharing table diagram

Builds

pages.builds

This view contains a log of your YextCI builds.

Column Data Type Description
business_id NUMBER The numeric identifier for the account associated with the build
build_id NUMBER The numeric identifier for a build
site_id NUMBER The numeric identifier for the Yext site associated with the build
status VARCHAR(128) The status of the build, i.e. SUCCESS, SYSTEM_FAILURE, BUILD_FAILURE, etc.
artifact_size NUMBER The size, in bytes, of artifacts extracted and uploaded by the build
queued_at_timestamp TIMESTAMP_NTZ NOT NULL The timestamp at which the build was enqueued in the system
started_at_timestamp TIMESTAMP_NTZ NOT NULL The timestamp at which the build started execution
completion_timestamp TIMESTAMP_NTZ NOT NULL The timestamp at which the build finished execution
build_duration_seconds NUMBER How long the build took from start to finish


Sample Queries

Calculate the average build time for a given site, excluding failed builds.

select round(avg(build_duration_seconds), 2) -- Optionally round the build time to nearest 100th of a second
from prod_pages.public.builds
where status = 'SUCCESS'
and site_id = 12345

and build_duration_seconds is not null;

Calculate the average build size for all sites, by build status.

select
    status,
    avg(artifact_size) 
from prod_pages.public.builds
where status is not null
and artifact_size is not null;

Get a log of all builds in the past year.

select 
    build_id,
    site_id,
    status,
    artifact_size,
    queued_at_timestamp, 
    started_at_timestamp,
    completion_timestamp,
    build_duration_seconds
from prod_sharing.pages.builds


Generations

pages.generations

This view contains a log of generations through one of the Pages producers.

Column Data Type Description
business_id NUMBER The numeric identifier for the account associated with the generation
site_id NUMBER The numeric identifier for the Yext site associated with the generation
producer_type VARCHAR(60) The Pages system used in the generation. Can be YEXT_SITES, CLASSIC_PAGES, or LANDING_PAGES
deploy_id VARCHAR(16) The numeric identifier for a Yext Sites deploy. Returns NULL if producer type is not Yext Sites
generation_type VARCHAR(60) Whether the generation was an incremental update or a full publish
is_production BOOLEAN Whether the generation was for the deployment that is set as the production version of the site.
pages_generated NUMBER The number of pages published by the generation
status VARCHAR(60) The status of the update
duration_seconds NUMBER How long it took for the generation to complete
completed_timestamp TIMESTAMP_NTZ IS NOT NULL The completion timestamp of the generation
request_timestamp TIMESTAMP_NTZ IS NOT NULL The time of the source event that yielded the generation, i.e. an update from streams or a publish kicked off
build_id NUMBER The numeric identifier for the YextCI build with which the generation is associated. Only applicable for Classic Pages and Yext Sites
lp_generation_id VARCHAR(26) The generation result ID for Landing Pages
cp_generation_id VARCHAR(26) The generation result ID for Classic Pages
activity_id VARCHAR(26) The activity ID for Yext Sites
start_timestamp TIMESTAMP_NTZ The timestamp when the generation itself started processing, not to be confused with request_timestamp

Sample Queries

Get the count of successful generations for a given site’s production version label in the past 90 days, updates only, by site ID.

select 
    site_id, 
    count(*) as generations
from prod_sharing.pages.generations
where completed_timestamp > dateadd(day, -90, current_date)
and generation_type = 'UPDATE'
and is_production
and status = 'SUCCESS'
group by 1 order by 2 desc;

Get a log of failed generations for all sites, all versions in the past year

select 
    businesses.business_id,
    name,
    sites.site_id,
    site_name,
    deploy_id,
    generation_type,
    is_production,
    pages_generated,
    duration_seconds, 
    start_timestamp,
    build_id
from prod_sharing.pages.generations
join prod_sharing.public.businesses using (business_id)
join prod_sharing.pages.sites using (site_id)
where completed_timestamp > dateadd(day, -365, current_date)
and status = 'FAILURE';


Generation Function Invocations

pages.generation_function_invocations

This view contains a log of function invocations for a given activity in the Pages system.

Column Data Type Description
business_id `NUMBER The numeric identifier for the account associated with the generation function invocation
correlation_id VARCHAR(26) The alphanumeric identifier for the activity that triggered the generation function invocation.
event_type_name VARCHAR(64) The name of the event that resulted in the generation function invocation.
invocation_id VARCHAR(64) The identifier for the generation function invocation
timestamp TIMESTAMP_NTZ The timestamp of the generation function invocation

Sample Queries

Get the count of generation function invocations by event type

select 
    distinct event_type_name, 
    count(*) as function_invocations
from prod_sharing.pages.generation_function_invocations
group by 1 order by 2 desc;

Get a log of every onUrlChange invocation type from the past 90 days

select * 
from prod_sharing.pages.generation_function_invocations
where event_type_name = 'ON_URL_CHANGE'
and timestamp > dateadd(day, -90, current_date);

Get a log of every failed onPageGenerate invocation type from the past 90 days. This involves a join with the function_invocations table using the invocation_id column.

select 
    invocation_id,
    business_id,
    plugin_external_id,
    function_name,
    argument_json,
    error
from prod_sharing.pages.generation_function_invocations
join prod_sharing.public.function_invocations using (invocation_id)
where event_type_name = 'ON_PAGE_GENERATE'
and error is not null
and timestamp > dateadd(day, -90, current_date)


Sites

pages.sites

This view contains information about a specific page generated and hosted by the Pages system.

Column Data Type Description
business_id NUMBER The numeric identifier for the account associated with the site.
site_id NUMBER The numeric identifier for the Yext site.
hostname VARCHAR(255) The public identifier for the site. This is NULL if a domain has not been set up.
site_name VARCHAR(255) The name of the site, for internal storage.
classic_site_id NUMBER The old identifier for the Yext site in the Classic sites system. Sites that are Landing Pages will not have a classic site ID, but Classic Pages and Pages will have a classic site ID.
display_url VARCHAR(255) The reverse proxy hostname, if configured. Otherwise this is the hostname.
published_deploy_id VARCHAR(64) The unique numeric identifier for the current published deploy.
live_edition_id NUMBER The unique numeric identifier for the live (production) edition of the site.
type VARCHAR(255) The type of the site, e.g., YEXT_SITES or CLASSIC_PAGES.
backup_deploy_id VARCHAR(64) The unique numeric identifier of the backup deploy.
repo_uuid VARCHAR(36) The unique alphanumeric identifier for the site’s linked repository.
creation_timestamp TIMESTAMP_NTZ The timestamp of when the site was first created.
timestamp TIMESTAMP_NTZ The timestamp of the most recent insert or update associated with the site.

Sample Queries

Get metadata for each site created in the past six months

select 
    sites.creation_timestamp, 
    site_name,
    concat('https://yext.com/s/', businesses.business_id, '/yextsites/', site_id) as site_link,
    businesses.business_id, 
    name as account_name, 
    site_id, 
    sites.repo_uuid, 
    hostname, 
    published_deploy_id
from prod_pages.public.sites
join prod_platform.public.businesses using (business_id)
where creation_timestamp > dateadd(month, -6, current_date)

order by 1 desc;


Pages Errors

pages.pages_errors

This view contains a log of errors that occurred during a publish in the Pages system.

Column Data Type Description
business_id NUMBER The numeric identifier for the account to which the error belongs
pages_error_id VARCHAR(26) The unique identifier for the error
site_id NUMBER The numeric identifier for the Yext site associated with the error
deployment_id VARCHAR(16) The numeric identifier for the deploy associated with the error
activity_id VARCHAR(26) The numeric identifier for the activity the error is for
entity_id NUMBER The numeric identifier for the entity associated with the error
error_type VARCHAR(128) The type of error encountered
error OBJECT A JSON representation of the error details
timestamp TIMESTAMP_NTZ The timestamp of when the error occurred

Sample Queries

Get the count of all errors from today.

select count(*)
from prod_sharing.pages.pages_errors 
join prod_pages.public.sites using (site_id)
where date(pages_errors.timestamp) = current_date();

Get a log of all errors from the past year.

select 
    timestamp,
    name,
    concat('https://yext.com/s/', businesses.business_id, '/yextsites/', pages_errors.site_id) as site_link,
    businesses.business_id,
    site_name,
    pages_errors.site_id,
    error_type,
    repo_uuid
from prod_sharing.pages.pages_errors 
join prod_platform.public.businesses using (business_id)
join prod_sharing.pages.sites using (site_id)
group by
    1,2,3,4,5,6,7,8
order by 1 desc;
Feedback