Load events to CDP
Learn from this article:
A component for loading data into the CDP data model
Configuration: PostgreSQL backend and common config
Actions: Loading new customer_events, Upserting customer events, Upserting attribute tags
A component for loading data into the CDP data model
This component provides a set of primitives for interacting with the CDP database tables customer_events and attribute_tags. The component is available in this repository: https://github.com/meiroio-components/cdp-db-loader.git
Warning: The order of the columns in the input CSV matters and is checked at runtime.
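Because the column order is checked at runtime, it can help to verify a file before handing it to the component. The following is a minimal pre-flight sketch, not the component's own check: the expected header is copied from the customer_events sample in this article, and the name `header_matches` is hypothetical.

```python
import csv
import io

# Header order taken from the customer_events sample file in this article.
EXPECTED_COLUMNS = [
    "id", "customer_entity_id", "event_id", "source_id",
    "event_time", "type", "version", "payload", "created",
]


def header_matches(csv_file, expected=EXPECTED_COLUMNS):
    """Return True if the first CSV row equals the expected columns, in order."""
    header = next(csv.reader(csv_file), None)
    return header == list(expected)


# Usage: a correctly ordered header passes, a reordered one does not.
good = io.StringIO(",".join(EXPECTED_COLUMNS) + "\n")
bad = io.StringIO("customer_entity_id,id,event_id\n")
print(header_matches(good), header_matches(bad))
```

Comparing whole lists rather than sets is the point here: a set comparison would accept the right columns in the wrong order.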
Configuration
PostgreSQL backend
{
  "action": "insert_customer_events",
  "auth": {
    "dbname": "{{CDP_DB_NAME}}",
    "host": "{{CDP_DB_HOST}}",
    "password": "{{#CDP_DB_PASSWORD}}",
    "port": "{{CDP_DB_PORT}}",
    "user": "{{CDP_DB_USER}}"
  },
  "backend": "postgres",
  "debug": true,
  "dry_run": false,
  "schema": {
    "cdp": "public",
    "customer_events": "cdp_ce",
    "profile_stitching": "cdp_ps"
  }
}
dry_run (default false) | If
auth.cdp_schema | Optional. If, for some reason, the data model doesn't reside in the public schema, this sets the
Common config
All CSVs in in/tables/*.csv are processed according to the action parameter.
debug defaults to true.
Available actions are:
- insert_customer_events
- upsert_customer_events
- upsert_attribute_tags
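A configuration can be sanity-checked against this list before a run. The sketch below assumes the configuration is available as a JSON string shaped like the example above; the function name `action_is_available` is hypothetical and not part of the component.

```python
import json

# The three actions listed in this article.
AVAILABLE_ACTIONS = {
    "insert_customer_events",
    "upsert_customer_events",
    "upsert_attribute_tags",
}


def action_is_available(config_text):
    """Return True if the config's "action" is one of the documented actions."""
    config = json.loads(config_text)
    return config.get("action") in AVAILABLE_ACTIONS


# Usage with a trimmed-down config.
print(action_is_available('{"action": "upsert_attribute_tags", "backend": "postgres"}'))
```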
Actions
Loading new customer_events
"action": "insert_customer_events"
Inserts new events. If there is a conflict on the id column, the row is skipped altogether.
Remember: The event_id must already be defined in the events table before upload.
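The skip-on-conflict behaviour can be illustrated with a small stand-in. This sketch uses an in-memory SQLite table with only two columns instead of the real PostgreSQL schema, and an `ON CONFLICT (id) DO NOTHING` clause; the SQL the component actually runs is not shown in this article, so treat the statement as an assumption. It needs SQLite 3.24 or newer.

```python
import sqlite3


def insert_events(conn, rows):
    """Insert (id, payload) rows, silently skipping ids that already exist."""
    conn.executemany(
        "INSERT INTO customer_events (id, payload) VALUES (?, ?) "
        "ON CONFLICT (id) DO NOTHING",
        rows,
    )


# Simplified stand-in table; the real customer_events table has more columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer_events (id TEXT PRIMARY KEY, payload TEXT)")

insert_events(conn, [("c", '{"foo": 126}')])
# "c" already exists, so only "d" is added and the stored "c" row is untouched.
insert_events(conn, [("c", '{"foo": 999}'), ("d", '{"foo": 126}')])

rows = conn.execute("SELECT id, payload FROM customer_events ORDER BY id").fetchall()
print(rows)
```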
The structure of the input file:
id,customer_entity_id,event_id,source_id,event_time,type,version,payload,created
"c",,md5(source_id || type || version),"test_source",2018-06-23T12:34:56Z,"subscribed","1-0-0","{""foo"": 126}","2018-06-23T12:34:56Z"
"d",,md5(source_id || type || version),"test_source",2018-06-23T12:34:56Z,"purchase","1-0-0","{""foo"": 126}","2018-06-23T12:34:56Z"
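The event_id column in the sample is written as the expression md5(source_id || type || version). If that expression is read as PostgreSQL's md5() over the three values concatenated without a separator (an assumption, as is the UTF-8 encoding), the value could be computed outside the database roughly like this. The function name `compute_event_id` is hypothetical.

```python
import hashlib


def compute_event_id(source_id, event_type, version):
    """Hex MD5 of source_id, type and version concatenated with no separator.

    Assumes the strings are hashed as UTF-8 bytes, mirroring what
    md5(source_id || type || version) would give in a UTF-8 PostgreSQL database.
    """
    joined = f"{source_id}{event_type}{version}"
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


# Usage with the values from the first sample row.
print(compute_event_id("test_source", "subscribed", "1-0-0"))
```

The result still has to match an event_id that already exists in the events table, so it is worth comparing a computed value against the database before a bulk load.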
Upserting customer events
"action": "upsert_customer_events"
Inserts new events. If there is a conflict on the id column, updates payload and leaves everything else untouched. The input file should have the same structure as for loading new customer events (customer_events.csv).
This is useful when a set of events is already loaded in the database and you later find out that the payloads are missing a field that you want to backfill.
Warning: This must be done in a backwards-compatible way (adding a new field to the JSON shouldn't break any follow-up customer attributes). Before using this action, make sure you understand what you are doing and what the implications are.
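The payload-only update can be sketched in the same spirit. This example uses an in-memory SQLite table with three columns as a stand-in for PostgreSQL, and an `ON CONFLICT (id) DO UPDATE SET payload = excluded.payload` clause; the component's actual SQL is not shown in this article, so the statement is an assumption. It needs SQLite 3.24 or newer.

```python
import sqlite3

# On an id conflict, overwrite only the payload; other columns keep stored values.
UPSERT_SQL = (
    "INSERT INTO customer_events (id, type, payload) VALUES (?, ?, ?) "
    "ON CONFLICT (id) DO UPDATE SET payload = excluded.payload"
)

# Simplified stand-in table; the real customer_events table has more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer_events (id TEXT PRIMARY KEY, type TEXT, payload TEXT)"
)

conn.execute(UPSERT_SQL, ("c", "subscribed", '{"foo": 126}'))
# Same id again: the payload gains a backfilled field, and the differing
# type value in the input is ignored.
conn.execute(UPSERT_SQL, ("c", "purchase", '{"foo": 126, "bar": 1}'))

row = conn.execute("SELECT id, type, payload FROM customer_events").fetchone()
print(row)
```

Note that the second input row carries a different type, yet the stored type stays as it was, which is what "leaves everything else untouched" implies.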
Upserting attribute tags
"action": "upsert_attribute_tags"
Upserts tag_ids for the attributes in the attribute_tags table (PII data, contact information tags, etc.). Doesn't delete anything from the table.
The structure of the input file:
attribute_id,tag_id
mailchimp_email,my_tag_1