Connector

A resource that represents the configuration of a Connector.

$id string Required

$schema const Required

displayName string Required

Display name of this Connector.

entityType string Required

The entity type for which data can be ingested.

source object Required

Represents a combination of the configuration to fetch data and the selectors to extract content from the raw data.

Required

Configuration for various sources of data.

Type: object

The following properties are required:

  • crawlerConfig
Type: object

The following properties are required:

  • apiPullConfig
Type: object

The following properties are required:

  • apiPushConfig
Type: object

The following properties are required:

  • functionConfig

object

Configuration used to extract data from the crawler.

string Required

The reference to the crawler resource.

array of string

List of URL patterns to extract data for.

Each item of this array must be:

Type: string

The URL to extract data for.

object

Configuration used to connect and pull data from an API.

enum (of string) Required

The request method type.

Must be one of:

  • “GET”

string Required

The request URL to use.

The authentication mechanism to use to connect to the API.

Type: object

The following properties are required:

  • bearerToken
Type: object

The following properties are required:

  • basicAuthentication
Type: object

The following properties are required:

  • apiKey

string

A token that will be passed in the authorization header.

object

Username and password authentication.

string Required

The username.

string

The password.

object

An API key will be used to authenticate requests. The API key can be sent in a header or as a query parameter, depending on the source API.

string Required

The key used to send the token.

string Required

The API token content.

boolean Required

Indicates if the token should be set as a header instead of a query param.
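As a rough illustration of how the three authentication options differ, the sketch below applies one of them to a request's headers and query parameters. The option names bearerToken, basicAuthentication and apiKey come from this page; the inner keys (username, password, key, token, sendAsHeader) and the "Bearer" header scheme are assumptions made for the example, not names confirmed by the schema.

```python
import base64


def apply_auth(auth: dict, headers: dict, params: dict) -> None:
    """Mutate headers/params in place according to exactly one auth option."""
    if "bearerToken" in auth:
        # Assumption: the token is sent using the common "Bearer" scheme.
        headers["Authorization"] = "Bearer " + auth["bearerToken"]
    elif "basicAuthentication" in auth:
        basic = auth["basicAuthentication"]
        # The password is optional on this page, so default to an empty string.
        raw = basic["username"] + ":" + basic.get("password", "")
        headers["Authorization"] = "Basic " + base64.b64encode(raw.encode()).decode()
    elif "apiKey" in auth:
        api_key = auth["apiKey"]
        # Hypothetical boolean: send the key as a header instead of a query param.
        target = headers if api_key["sendAsHeader"] else params
        target[api_key["key"]] = api_key["token"]
    else:
        raise ValueError("unknown authentication option")
```

For example, an apiKey option with sendAsHeader set to false leaves the headers untouched and adds the key to the query parameters.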

The method used for controlling pagination.

Type: object

The following properties are required:

  • pageBasedPagination
Type: object

The following properties are required:

  • cursorPagination
Type: object

The following properties are required:

  • linkHeaderPagination
Type: object

The following properties are required:

  • offsetPagination

number

Max pages to fetch. If unset, all pages will be fetched.

object

Page-based pagination will increment the page key query parameter value until all pages are returned.

string Required

Pagination key to be passed as a query parameter in the request.

number

Initial value for the page key. If specified, this value will be used as the query parameter value for the Page Key in the first request. If not specified, 0 will be used by default.

object

Points to the total number of pages either in the response body or the response headers.

string Required

The key for the total number of pages. A JMES path expression.

boolean

Default: false

Indicates if the total number of pages value is located in the response headers or the response body.

string

Key to specify the max number of entries returned per page, passed as a query parameter in the request.

number

Value for the limit key which specifies the max number of entries returned per page.
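The page-based behavior described above can be sketched as a generator of query parameters. This is only an illustration under assumptions: the function and argument names are hypothetical, and the sketch assumes the total number of pages counts pages starting from the initial page value.

```python
from typing import Dict, Iterator, Optional


def page_queries(
    page_key: str,
    initial_page: int = 0,           # 0 is the documented default
    limit_key: Optional[str] = None,
    limit: Optional[int] = None,
    total_pages: Optional[int] = None,
    max_pages: Optional[int] = None,
) -> Iterator[Dict[str, int]]:
    """Yield the query parameters for each successive page request."""
    page = initial_page
    sent = 0
    while True:
        if total_pages is not None and sent >= total_pages:
            return
        if max_pages is not None and sent >= max_pages:
            return
        query = {page_key: page}
        if limit_key is not None and limit is not None:
            query[limit_key] = limit
        yield query
        page += 1
        sent += 1
```

With neither total_pages nor max_pages set, the generator never stops on its own; a caller would stop when a response comes back empty.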

object

Cursor-based pagination will look for a cursor in each response and pass it to the subsequent request to fetch the next page.

string

Pagination key to be passed as a query parameter in the request.

string Required

The key that contains the cursor in the response. If detectCursorInHeader is set to true, the key will be searched for in the header. If not set or set to false, the key will be searched for in the response body.

enum (of string) Required

Indicates if the cursor is a Token, Relative URL or Full URL.

Must be one of:

  • “TOKEN”
  • “FULL_URL”
  • “RELATIVE_URL”

boolean

Indicates if the cursor key will be contained in the response header or the response body.
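How the three cursor types might translate into the next request URL can be sketched as follows. The function name and arguments are hypothetical; only the enum values TOKEN, FULL_URL and RELATIVE_URL are taken from this page, and the handling of each is an assumption about typical behavior.

```python
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urljoin, urlsplit, urlunsplit


def next_request_url(
    request_url: str,
    cursor: str,
    cursor_type: str,
    page_key: Optional[str] = None,
) -> str:
    """Build the URL of the next page from the cursor found in a response."""
    if cursor_type == "FULL_URL":
        # The cursor already is the complete URL of the next page.
        return cursor
    if cursor_type == "RELATIVE_URL":
        # Resolve the cursor against the URL of the current request.
        return urljoin(request_url, cursor)
    if cursor_type == "TOKEN":
        if page_key is None:
            raise ValueError("a pagination key is needed to send a token cursor")
        parts = urlsplit(request_url)
        # Drop any previous cursor value, then append the new one.
        query = [(k, v) for k, v in parse_qsl(parts.query) if k != page_key]
        query.append((page_key, cursor))
        return urlunsplit(parts._replace(query=urlencode(query)))
    raise ValueError("unknown cursor type: " + cursor_type)
```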

object

Link header pagination will use a link with a specified relation provided in the response header to fetch the next page.

string Required

The label of the link that should be used. In most cases, this will be “next”.

enum (of string)

Default: “FULL_URL”

Indicates if the link is a full URL or a relative URL.

Must be one of:

  • “FULL_URL”
  • “RELATIVE_URL”
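A minimal sketch of link header pagination, assuming the response carries a standard HTTP Link header such as `<https://api.example.com/items?page=2>; rel="next"`. The function name and arguments are hypothetical, and the parsing is deliberately simplified rather than a full implementation of the header grammar.

```python
import re
from typing import Optional
from urllib.parse import urljoin


def find_link(
    link_header: str,
    rel: str,
    request_url: str,
    link_type: str = "FULL_URL",  # documented default
) -> Optional[str]:
    """Return the target of the first link whose relation matches `rel`."""
    # Each entry looks like: <target>; param=value; ...
    for target, params in re.findall(r"<([^>]*)>([^<]*)", link_header):
        match = re.search(r'\brel\s*=\s*"?([^";,]+)"?', params)
        if match and rel in match.group(1).split():
            if link_type == "RELATIVE_URL":
                return urljoin(request_url, target)
            return target
    return None  # no such relation: there is no further page
```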

object

Offset pagination will use the offset and limit query parameters to paginate through all the items in a collection.

Must not be:

Type: object

The following properties are required:

  • totalPages
  • totalItems

string Required

Offset key to be passed as a query parameter in the request.

string Required

Key to specify the max number of entries returned per page, passed as a query parameter in the request.

number

Default: 0

Value for the offset to be used in the initial request.

number

Value for the limit which specifies the max number of entries returned per page.

object

Points to the total number of pages either in the response body or the response headers.

string Required

The key for the total number of pages. A JMES path expression.

boolean

Default: false

Indicates if the total number of pages value is located in the response headers or the response body.

object

Points to the total number of items either in the response body or the response headers.

string Required

The key for the total number of items. A JMES path expression.

boolean

Default: false

Indicates if the total number of items value is located in the response headers or the response body.
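Offset pagination can be sketched the same way. The names totalPages and totalItems appear on this page and may not be set together; everything else here (function name, argument names, stopping rules) is an assumption made for illustration.

```python
from typing import Dict, Iterator, Optional


def offset_queries(
    offset_key: str,
    limit_key: str,
    limit: int,
    initial_offset: int = 0,         # documented default
    total_items: Optional[int] = None,
    total_pages: Optional[int] = None,
    max_pages: Optional[int] = None,
) -> Iterator[Dict[str, int]]:
    """Yield offset/limit query parameters for each successive request."""
    if total_items is not None and total_pages is not None:
        raise ValueError("set totalPages or totalItems, not both")
    offset = initial_offset
    sent = 0
    while True:
        if total_items is not None and offset >= total_items:
            return
        if total_pages is not None and sent >= total_pages:
            return
        if max_pages is not None and sent >= max_pages:
            return
        yield {offset_key: offset, limit_key: limit}
        offset += limit
        sent += 1
```

Because this is a generator, the check for conflicting totals runs when the first value is requested, not when the function is called.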

object

The maximum number of requests that can be made in a specified unit of time.

enum (of string) Required

The unit of time per which the specified quantity of requests can be sent.

Must be one of:

  • “SECOND”
  • “MINUTE”
  • “HOUR”

number Required

The maximum number of requests that may be sent in the specified unit of time.
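One simple way to honor such a limit is to space requests evenly. The sketch below assumes that strategy; the page does not say how the limit is actually enforced, and the function names are hypothetical.

```python
SECONDS_PER_UNIT = {"SECOND": 1, "MINUTE": 60, "HOUR": 3600}


def min_interval_seconds(unit: str, quantity: float) -> float:
    """Smallest gap between requests that keeps within the limit."""
    if quantity <= 0:
        raise ValueError("quantity must be positive")
    return SECONDS_PER_UNIT[unit] / quantity


def seconds_to_wait(last_sent: float, now: float, unit: str, quantity: float) -> float:
    """How long to sleep before the next request may be sent."""
    return max(0.0, last_sent + min_interval_seconds(unit, quantity) - now)
```

A limit of 30 requests per MINUTE, for instance, works out to one request every 2 seconds under this approach.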

object

Map of header keys and values to use in the API request.

string Pattern Property

All properties whose names match the following regular expression must respect the following conditions:

Property name regular expression: ^.*$

object

Map of query parameter keys and values to use in the API request.

string Pattern Property

All properties whose names match the following regular expression must respect the following conditions:

Property name regular expression: ^.*$

object Required

The data format of the response.

Must be one of:

  • “JSON”

object

Configuration used to connect to an app and receive data pushed to the Connectors API endpoint.

string

The reference to the app resource.

object Required

boolean

If enabled, a new run will be initiated every time a request is sent to this Connector and the data provided in the request will be processed.

object

Configuration used to invoke a function.

string Required

The reference to a Plugin resource that contains a desired function.

string Required

The function to invoke.

object Required

object

A selector for extracting content.

enum (of string) Required

The type of data selector.

Must be one of:

  • “CSS”
  • “XPATH”
  • “JSON”
  • “PAGE_URL”
  • “PAGE_TITLE”
  • “CLEANED_BODY”
  • “PAGE_ID”
  • “ITEM_ID”

string Required

The header used to identify the extracted content.

string

The selector content path.

enum (of string)

The selector mode of a CSS or XPath selector.

Must be one of:

  • “ALL_TEXT”
  • “DIRECT_TEXT”
  • “INNER_HTML”
  • “URL”
  • “IMAGE_URL”
  • “ATTRIBUTE”

string

The attribute key of a CSS or XPath selector.

array

An ordered list of selectors to apply and extract data.

Each item of this array must be:

A selector for extracting content.

Same definition as source_baseSelector

transforms array of object

Transforms to sequentially apply to data produced by selectors.

Each item of this array must be:

A transform to apply to the data output by the previous step.

Type: object

The following properties are required:

  • fixCapitalization
Type: object

The following properties are required:

  • removeUnwantedChars
Type: object

The following properties are required:

  • extractText
Type: object

The following properties are required:

  • function
Type: object

The following properties are required:

  • findAndReplace
Type: object

The following properties are required:

  • addColumn
Type: object

The following properties are required:

  • filterRows
Type: object

The following properties are required:

  • formatDates
Type: object

The following properties are required:

  • mergeColumns
Type: object

The following properties are required:

  • splitColumns
Type: object

The following properties are required:

  • fillInEmptyCells
Type: object

The following properties are required:

  • splitIntoRows

object

Transforms the text of the input columns by applying the selected capitalization option.

object Required

Specifies which columns should have their values transformed.

array of string

The names of the columns containing values to transform.

Each item of this array must be:

Type: string

The name of the column containing values to transform.

const

True if all columns should be transformed, including any added in the future.

Specific value: true

enum (of string) Required

The clean option to be applied.

Must be one of:

  • “ALL_CAPS”
  • “ALL_LOWER”
  • “PROPER_CASE”
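The column selection and capitalization options above might behave roughly as in this sketch. The keys headers and allColumns are hypothetical stand-ins for the two column-selection properties, rows are modeled as plain dictionaries, and mapping PROPER_CASE to Python's str.title is an assumption.

```python
from typing import Dict, List

CAPITALIZERS = {
    "ALL_CAPS": str.upper,
    "ALL_LOWER": str.lower,
    "PROPER_CASE": str.title,  # assumption: capitalize the first letter of each word
}


def fix_capitalization(rows: List[Dict[str, str]], input_headers: dict, option: str) -> List[Dict[str, str]]:
    """Return new rows with the selected columns re-capitalized."""
    convert = CAPITALIZERS[option]
    result = []
    for row in rows:
        if input_headers.get("allColumns") is True:
            targets = set(row)  # every column, including ones added later
        else:
            targets = set(input_headers.get("headers", []))
        result.append({k: convert(v) if k in targets else v for k, v in row.items()})
    return result
```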

object

Transforms the text of the input columns by applying the selected character removal options.

object Required

Specifies which columns should have their values transformed.

Same definition as transforms_items_fixCapitalization_inputHeaders

array of enum (of string) Required

How the data should be cleaned.

Each item of this array must be:

Type: enum (of string)

The unwanted character removal options to be applied.

Must be one of:

  • “TRIM_WHITESPACE”
  • “REMOVE_WHITESPACE”
  • “REMOVE_NUMBERS”
  • “REMOVE_NON_NUMERICS”
  • “REMOVE_PUNCTUATION”
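A possible reading of the character removal options, applied in the order listed, is sketched below. The exact semantics (for example, which characters count as punctuation) are assumptions; here ASCII punctuation from Python's string module is used.

```python
import re
import string
from typing import List

CLEANERS = {
    "TRIM_WHITESPACE": str.strip,
    "REMOVE_WHITESPACE": lambda s: re.sub(r"\s+", "", s),
    "REMOVE_NUMBERS": lambda s: re.sub(r"\d", "", s),
    "REMOVE_NON_NUMERICS": lambda s: re.sub(r"\D", "", s),
    # Assumption: "punctuation" means ASCII punctuation characters.
    "REMOVE_PUNCTUATION": lambda s: s.translate(str.maketrans("", "", string.punctuation)),
}


def remove_unwanted_chars(value: str, options: List[str]) -> str:
    """Apply each selected removal option to the value, in order."""
    for option in options:
        value = CLEANERS[option](value)
    return value
```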

object

Transforms the text of the input column by extracting text and creating a new column.

Type: object

If the conditions in the “If” tab are respected, then the conditions in the “Then” tab should be respected. Otherwise, the conditions in the “Else” tab should be respected.

Type: object

object

Must match regular expression: [_INSTANCE_MATCHING_TEXT]
Type: object

The following properties are required:

  • valueToFind
Type: object

If the conditions in the “If” tab are respected, then the conditions in the “Then” tab should be respected. Otherwise, the conditions in the “Else” tab should be respected.

Type: object

object

Must match regular expression: [OFFSETFROM]
Type: object

The following properties are required:

  • offsetLength
Type: object

If the conditions in the “If” tab are respected, then the conditions in the “Then” tab should be respected. Otherwise, the conditions in the “Else” tab should be respected.

Type: object

object

Must match regular expression: [SOMETEXT*]
Type: object

The following properties are required:

  • maxLengthTextToKeep

string Required

The header of the column from which text should be extracted.

string Required

The header for the new column that will be populated with extracted data.

enum (of string) Required

The strategy to be used when determining how much of the text to extract.

Must be one of:

  • “ALL_TEXT_AFTER”
  • “ALL_TEXT_BEFORE”
  • “SOME_TEXT_AFTER”
  • “SOME_TEXT_BEFORE”

enum (of string) Required

The point from which the extraction operation should start.

Must be one of:

  • “FIRST_INSTANCE_MATCHING_TEXT”
  • “LAST_INSTANCE_MATCHING_TEXT”
  • “OFFSET_FROM_BEGINNING”
  • “OFFSET_FROM_END”

object

The pattern or text to be replaced.

object

A literal text value to be matched. If the value is the empty string, only empty cells in the input columns will be matched.

string Required

Text to be matched.

boolean

Default: false

True if matching should be case-insensitive, false otherwise.

string

Regular expression to be matched (in accordance with the java.util.regex engine).

number

The number of characters from the starting point at which the extraction should begin.

number

The max number of characters that should be extracted. No limit if 0 or unspecified.

object

Invokes a function to transform values.

object Required

Specifies which columns should have their values transformed.

Same definition as transforms_items_fixCapitalization_inputHeaders

string Required

The reference to a Plugin resource that contains a desired function.

string Required

The function to invoke.

object

Finds a specified pattern or text and replaces it with a specified value.

object Required

Specifies which columns should have their values transformed.

Same definition as transforms_items_fixCapitalization_inputHeaders

object Required

string Required

The text that will replace all found matches. If the replacement value is the empty string, the transform will clear all values that match the valueToFind.

object

Adds a new column and populates it with a static value.

string Required

The new column’s header.

string Required

The column value to be added.

object

Filters rows based on specified conditionals.

enum (of string) Required

The action to perform on rows that satisfy the evaluation of rules.

Must be one of:

  • “KEEP”
  • “REMOVE”

enum (of string) Required

The combinator connecting all rules.

Must be one of:

  • “OR”
  • “AND”

array Required

A list of rules to be applied to filter rows.

Must contain at least 1 item

Each item of this array must be:

A single rule to be applied to filter rows.

Type: object

If the conditions in the “If” tab are respected, then the conditions in the “Then” tab should be respected. Otherwise, the conditions in the “Else” tab should be respected.

Type: object

const

Specific value: “IS_BLANK”
Type: object

The following properties are required:

  • conditionalInput
Type: object

If the conditions in the “If” tab are respected, then the conditions in the “Then” tab should be respected. Otherwise, the conditions in the “Else” tab should be respected.

Type: object

const

Specific value: “IS_NOT_BLANK”
Type: object

The following properties are required:

  • conditionalInput

string Required

The column header to which to apply the rule.

enum (of string) Required

The conditional to apply. For conditionals that compare values, the value being processed is on the left, and the conditional input is on the right. When both values are numbers or dates in ISO format (YYYY-MM-DD), they will be compared as numbers or dates respectively. Otherwise, they will be compared as strings.

Must be one of:

  • “IS_BLANK”
  • “IS_NOT_BLANK”
  • “EQUALS”
  • “DOES_NOT_EQUAL”
  • “CONTAINS”
  • “DOES_NOT_CONTAIN”
  • “GREATER_THAN”
  • “GREATER_THAN_OR_EQUAL_TO”
  • “LESS_THAN”
  • “LESS_THAN_OR_EQUAL_TO”

string

The conditional input value to apply the conditional against.
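The comparison rules and the KEEP/REMOVE and OR/AND options can be illustrated with the sketch below. Only conditionalInput and the enum values are names from this page; the rule keys header and conditional, the function names, and the treatment of whitespace-only cells as blank are assumptions.

```python
import operator
from datetime import date
from typing import Dict, List

COMPARATORS = {
    "EQUALS": operator.eq,
    "DOES_NOT_EQUAL": operator.ne,
    "GREATER_THAN": operator.gt,
    "GREATER_THAN_OR_EQUAL_TO": operator.ge,
    "LESS_THAN": operator.lt,
    "LESS_THAN_OR_EQUAL_TO": operator.le,
}


def _comparable(left: str, right: str):
    """Compare as numbers, then as ISO dates, else fall back to strings."""
    for parse in (float, date.fromisoformat):
        try:
            return parse(left), parse(right)
        except ValueError:
            continue
    return left, right


def rule_matches(row: Dict[str, str], rule: dict) -> bool:
    value = row.get(rule["header"]) or ""
    conditional = rule["conditional"]
    if conditional == "IS_BLANK":
        return value.strip() == ""
    if conditional == "IS_NOT_BLANK":
        return value.strip() != ""
    expected = rule["conditionalInput"]  # needed by every other conditional
    if conditional == "CONTAINS":
        return expected in value
    if conditional == "DOES_NOT_CONTAIN":
        return expected not in value
    left, right = _comparable(value, expected)  # row value on the left
    return COMPARATORS[conditional](left, right)


def filter_rows(rows: List[Dict[str, str]], action: str, combinator: str, rules: List[dict]) -> List[Dict[str, str]]:
    combine = any if combinator == "OR" else all
    keep_matches = action == "KEEP"
    return [
        row for row in rows
        if combine(rule_matches(row, rule) for rule in rules) == keep_matches
    ]
```

Note that "10" is greater than "9.5" when compared as numbers but not when compared as strings, which is why the numeric comparison is attempted first.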

object

Formats dates in the specified input columns as yyyy-MM-dd.

object Required

Specifies which columns should have their values transformed.

Same definition as transforms_items_fixCapitalization_inputHeaders

string Required

The date format pattern of the input values (in accordance with java.time.format.DateTimeFormatter).

string

The locale of the input values.

object

Merges multiple columns together into a new column separated by a specified delimiter.

array of string Required

The names of the columns to merge together.

Each item of this array must be:

Type: string

The name of the column containing values to merge.

string Required

The new column’s header.

string Required

The delimiter that separates the merged columns’ values.

object

Splits a column into one or more columns based on a specified delimiter.

string Required

The column containing the data that needs to be split.

array of string Required

The names of the columns to split into.

Each item of this array must be:

Type: string

The name of the column containing split values.

string Required

The delimiter used to split the column’s value into new ones.

object

Finds empty cells and fills them in with a specified default value.

object Required

Specifies which columns should have their values transformed.

Same definition as transforms_items_fixCapitalization_inputHeaders

string Required

The text that will replace all empty cells.

object

Splits a column into one or more rows based on a specified delimiter.

string Required

The header of the column containing the values to split.

string Required

The character on which the values should be split.
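Splitting into rows might work as in this sketch, which assumes that the other columns are copied unchanged into every resulting row. The function and argument names are hypothetical.

```python
from typing import Dict, List


def split_into_rows(rows: List[Dict[str, str]], header: str, delimiter: str) -> List[Dict[str, str]]:
    """Emit one row per delimited value found in the given column."""
    result = []
    for row in rows:
        for part in row[header].split(delimiter):
            # Assumption: all other columns are duplicated as-is.
            result.append({**row, header: part})
    return result
```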

mappings array

The mapping generated by the transformers.

Each item of this array must be:

Type: object

The mapping generated by the transformers.

The following properties are required:

  • header
  • field

scheduleConfig

The configuration of the run schedule.

Type: object

The following properties are required:

  • useSourceSchedule
Type: object

The following properties are required:

  • customSchedule

boolean

Indicates that the Connector should run automatically using the data source’s schedule.

object

The custom schedule that the Connector will follow. The Connector can be configured to run daily, weekly, monthly, or at a custom frequency. Currently, this is only supported for API Connectors.

string Required

The time zone, as an ISO-8601 zone ID, in which to start a run, e.g. America/New_York.

string Required

The local date in ISO format to start scheduling runs, e.g. 2021-01-30.

string Required

The local time in ISO format to start scheduling runs, e.g. 12:00:00.

enum (of string) Required

The schedule config frequency type.

Must be one of:

  • “HOURLY”
  • “DAILY”
  • “MONTHLY”
  • “WEEKLY”

number

The repeat interval based on the schedule config frequency type. If unset, a repeat interval of 1 will be used (e.g. every day).
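To show how the start date, start time, frequency type and repeat interval could combine, the sketch below lists the first few run times as local wall-clock values. It is an assumption-laden illustration: the function names are hypothetical, the time zone is left out (a caller would attach it separately), and clamping monthly runs to the last day of shorter months is a guess at the behavior.

```python
import calendar
from datetime import date, datetime, time, timedelta
from typing import List

STEPS = {
    "HOURLY": timedelta(hours=1),
    "DAILY": timedelta(days=1),
    "WEEKLY": timedelta(weeks=1),
}


def _add_months(moment: datetime, months: int) -> datetime:
    index = moment.month - 1 + months
    year = moment.year + index // 12
    month = index % 12 + 1
    # Assumption: clamp to the last day when the target month is shorter.
    day = min(moment.day, calendar.monthrange(year, month)[1])
    return moment.replace(year=year, month=month, day=day)


def local_run_times(start_date: str, start_time: str, frequency: str,
                    interval: int = 1, count: int = 3) -> List[datetime]:
    """First `count` run times, as naive datetimes in the schedule's zone."""
    start = datetime.combine(date.fromisoformat(start_date), time.fromisoformat(start_time))
    runs = []
    for n in range(count):
        if frequency == "MONTHLY":
            # Always offset from the start so clamping does not accumulate.
            runs.append(_add_months(start, n * interval))
        else:
            runs.append(start + n * interval * STEPS[frequency])
    return runs
```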

enum (of string)

Indicates the run mode.

Must be one of:

  • “DEFAULT”
  • “COMPREHENSIVE”