Load event to CDP

A component for loading data into the CDP data model

This component provides a set of primitives for interacting with the CDP database, specifically the tables customer_events and attribute_tags. To use the component, see the repository: https://github.com/meiroio-components/cdp-db-loader.git

The order of the columns in the input CSV matters: it is checked at runtime.

Configuration

PostgreSQL backend

{
  "action": "insert_customer_events",
  "debug": true,
  "dry_run": false,
  "backend": "postgres",
  "auth": {
    "host": "{{DB_HOST}}",
    "dbname": "{{DB_NAME}}",
    "user": "{{DB_USER}}",
    "password": "{{#DB_PASSWORD}}"
  },
  "schema": {
    "cdp": "{{CDP_SCHEMA}}",
    "customer_events": "{{CDP_SCHEMA_CUSTOMER_EVENTS}}",
    "profile_stitching": "{{CDP_SCHEMA_PS}}"
  }
}
  • dry_run (default false): if true, no changes are committed to the database. If false, changes are committed after all input files are processed.
  • auth.cdp_schema (optional): if, for some reason, the data model doesn't reside in the public schema, this sets the search_path to the specified value (which means the default $user,public value no longer applies). Only use this if you know what you are doing.

    Common config

    All CSVs in in/tables/*.csv are processed according to the action parameter.

    debug defaults to true

    Available actions are:

    • insert_customer_events
    • upsert_customer_events
    • upsert_attribute_tags
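A configuration can be sanity-checked before it is handed to the component. The following is a minimal sketch, not part of cdp-db-loader itself: the function name normalize_config is hypothetical, and it only encodes what is documented above (the three actions, debug defaulting to true, dry_run defaulting to false).

```python
# Hypothetical pre-flight check; this helper is an illustration only
# and does not exist in the cdp-db-loader repository.
ALLOWED_ACTIONS = {
    "insert_customer_events",
    "upsert_customer_events",
    "upsert_attribute_tags",
}


def normalize_config(config: dict) -> dict:
    """Return a copy of config with the documented defaults applied."""
    action = config.get("action")
    if action not in ALLOWED_ACTIONS:
        raise ValueError(
            f"unknown action {action!r}; expected one of {sorted(ALLOWED_ACTIONS)}"
        )
    normalized = dict(config)
    normalized.setdefault("debug", True)     # documented default
    normalized.setdefault("dry_run", False)  # documented default
    return normalized
```

Calling normalize_config({"action": "upsert_attribute_tags"}) returns the configuration with debug and dry_run filled in, while an empty or misspelled action raises ValueError before any database work starts.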

    Actions

    Loading new customer_events

    "action": "insert_customer_events"

    This inserts new events. If there is a conflict on the id column, the row is skipped altogether.

    The event_id must already be defined in the events table before upload.

    The structure of the input file:

    id,customer_entity_id,event_id,source_id,event_time,type,version,payload,created
    "c",,md5(source_id || type || version),"test_source",2018-06-23T12:34:56Z,"subscribed","1-0-0","{""foo"": 126}","2018-06-23T12:34:56Z"
    "d",,md5(source_id || type || version),"test_source",2018-06-23T12:34:56Z,"purchase","1-0-0","{""foo"": 126}","2018-06-23T12:34:56Z"
    
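The sample rows show event_id as md5(source_id || type || version). A rough sketch of producing such a file in Python follows. It assumes the expression means PostgreSQL-style concatenation without a separator, hashed over UTF-8 bytes; the helper names make_event_id and write_events are hypothetical.

```python
import csv
import hashlib
import io
import json

# Column order copied from the documented input file structure.
COLUMNS = ["id", "customer_entity_id", "event_id", "source_id", "event_time",
           "type", "version", "payload", "created"]


def make_event_id(source_id: str, event_type: str, version: str) -> str:
    # Assumption: mirrors md5(source_id || type || version), i.e. plain
    # concatenation with no separator, encoded as UTF-8.
    joined = source_id + event_type + version
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


def write_events(rows, out) -> None:
    # DictWriter emits the header and every row in COLUMNS order.
    writer = csv.DictWriter(out, fieldnames=COLUMNS)
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


row = {
    "id": "c",
    "customer_entity_id": "",
    "event_id": make_event_id("test_source", "subscribed", "1-0-0"),
    "source_id": "test_source",
    "event_time": "2018-06-23T12:34:56Z",
    "type": "subscribed",
    "version": "1-0-0",
    "payload": json.dumps({"foo": 126}),
    "created": "2018-06-23T12:34:56Z",
}
buf = io.StringIO()
write_events([row], buf)
```

The csv module doubles the quotes inside the JSON payload, which matches the quoting used in the sample rows.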

    Upserting customer events

    "action": "upsert_customer_events"

    This inserts new events. If there is a conflict on the id column, it updates the payload and leaves everything else untouched. The input file should have the same structure as for loading new customer events (customer_events.csv).

    This is useful when events are already loaded in the database and you later find out that their payloads are missing a field you want to backfill.

    This must be done in a backward compatible way (adding a new field to the JSON shouldn't break any follow-up customer attributes). Before using this, make sure you know what you are doing and understand the implications.
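The difference between the two conflict behaviours (skip the row for insert_customer_events, update only the payload for upsert_customer_events) can be illustrated with a small sketch. This is not the component's actual SQL, which is not shown here. It uses an in-memory SQLite database (3.24 or newer, for ON CONFLICT support) with a simplified three-column table as a stand-in for the PostgreSQL customer_events table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in table; the real customer_events table has more columns.
conn.execute(
    "CREATE TABLE customer_events (id TEXT PRIMARY KEY, type TEXT, payload TEXT)"
)
conn.execute(
    "INSERT INTO customer_events VALUES (?, ?, ?)",
    ("c", "subscribed", '{"foo": 126}'),
)

# insert_customer_events semantics: on an id conflict, skip the row.
INSERT_SQL = """
INSERT INTO customer_events (id, type, payload) VALUES (?, ?, ?)
ON CONFLICT (id) DO NOTHING
"""

# upsert_customer_events semantics: on an id conflict, update payload only.
UPSERT_SQL = """
INSERT INTO customer_events (id, type, payload) VALUES (?, ?, ?)
ON CONFLICT (id) DO UPDATE SET payload = excluded.payload
"""

SELECT_SQL = "SELECT type, payload FROM customer_events WHERE id = ?"

# Same id as the existing row, but a different type and a backfilled payload.
incoming = ("c", "purchase", '{"foo": 126, "bar": 1}')

conn.execute(INSERT_SQL, incoming)
after_insert = conn.execute(SELECT_SQL, ("c",)).fetchone()

conn.execute(UPSERT_SQL, incoming)
after_upsert = conn.execute(SELECT_SQL, ("c",)).fetchone()
```

After the insert, the existing row is unchanged. After the upsert, payload carries the new field while type keeps its original value.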

    Upserting attribute tags

    "action": "upsert_attribute_tags"

    This upserts tag_ids for the attributes in the attribute_tags table (PII data, contact information tags, etc.). It doesn't delete anything from the table.

    The structure of the input file:

    attribute_id,tag_id
    mailchimp_email,my_tag_1
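Because the column order is checked at runtime, it may be worth verifying a file's header before loading it. The sketch below is a hypothetical pre-flight check, not the component's own validation: the name check_header is an assumption, and the expected headers are copied from the input file structures documented above.

```python
import csv
import io

CUSTOMER_EVENTS_HEADER = [
    "id", "customer_entity_id", "event_id", "source_id", "event_time",
    "type", "version", "payload", "created",
]

# Expected header per action, in the documented order.
EXPECTED_HEADERS = {
    "insert_customer_events": CUSTOMER_EVENTS_HEADER,
    "upsert_customer_events": CUSTOMER_EVENTS_HEADER,
    "upsert_attribute_tags": ["attribute_id", "tag_id"],
}


def check_header(action: str, csv_file) -> None:
    """Raise ValueError if the first CSV row differs from the expected header."""
    expected = EXPECTED_HEADERS[action]
    header = next(csv.reader(csv_file), None)  # None for an empty file
    if header != expected:
        raise ValueError(
            f"{action}: expected columns {expected}, got {header}"
        )


# Passes silently: columns are present and in the documented order.
check_header(
    "upsert_attribute_tags",
    io.StringIO("attribute_id,tag_id\nmailchimp_email,my_tag_1\n"),
)
```

A file with the same columns in a different order, such as tag_id,attribute_id, raises ValueError.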