Load event to CDP
A component for loading data into the CDP data model
This component provides a set of primitives for interacting with the CDP database, specifically the tables customer_events and attribute_tags.
The order of the columns in the input CSV DOES matter, and it is checked at runtime.
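As an illustration of what such a runtime check could look like, here is a minimal sketch. It is not the component's actual code: the function name check_header is hypothetical, and the expected column list is copied from the customer_events sample further down in this README.

```python
import csv
import io

# Column order copied from the customer_events sample in this README.
EXPECTED_CUSTOMER_EVENTS = [
    "id", "customer_entity_id", "event_id", "source_id", "event_time",
    "type", "version", "payload", "created",
]


def check_header(csv_file, expected):
    """Hypothetical helper: raise ValueError unless the first CSV row
    equals `expected`, column by column and in the same order."""
    header = next(csv.reader(csv_file), None)
    if header != expected:
        raise ValueError(f"unexpected columns: got {header}, expected {expected}")
    return header


# A header in the expected order passes silently.
good = io.StringIO(",".join(EXPECTED_CUSTOMER_EVENTS) + "\n")
check_header(good, EXPECTED_CUSTOMER_EVENTS)
```

Comparing whole lists (rather than sets) is what makes the check sensitive to order, so a file with the right columns in the wrong order is rejected.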
Configuration
PostgreSQL backend
{
  "parameters": {
    "action": "",
    "debug": true,
    "dry_run": false,
    "backend": "postgres",
    "auth": {
      "host": "",
      "dbname": "postgres",
      "user": "postgres",
      "password": "",
      "cdp_schema": null #optional
    }
  }
}
- dry_run (default false): if true, do not commit any changes to the database. If false, changes are committed after all input files are processed.
- auth.cdp_schema (optional): if, for some reason, the data model doesn't reside in the public schema, this sets the search_path to the specified value (which means the default $user,public value doesn't apply anymore). Only use this if you know what you are doing.
Common config
All CSVs in in/tables/*.csv are processed according to the action parameter.
debug defaults to true.
Available actions are:
- insert_customer_events
- upsert_customer_events
- upsert_attribute_tags
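A minimal sketch of how a configuration could be checked against the list above, and how the input files could be discovered. The function names validate_action and input_tables, and the data_dir argument, are hypothetical and not part of the component.

```python
import glob
import os

# The three actions listed in this README.
AVAILABLE_ACTIONS = {
    "insert_customer_events",
    "upsert_customer_events",
    "upsert_attribute_tags",
}


def validate_action(parameters):
    """Hypothetical helper: return the configured action or raise ValueError."""
    action = parameters.get("action", "")
    if action not in AVAILABLE_ACTIONS:
        raise ValueError(
            f"unknown action {action!r}, expected one of {sorted(AVAILABLE_ACTIONS)}"
        )
    return action


def input_tables(data_dir="."):
    """List every CSV under <data_dir>/in/tables in a stable order."""
    return sorted(glob.glob(os.path.join(data_dir, "in", "tables", "*.csv")))
```

With the empty "action": "" from the configuration template, validate_action raises, which is one way to surface a forgotten setting early.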
Loading new customer_events
"action": "insert_customer_events"
This inserts new events. If there is a conflict on the id column, the row is skipped altogether.
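The skip-on-conflict behaviour corresponds to an INSERT ... ON CONFLICT DO NOTHING statement. The sketch below is an assumption about the semantics, not the component's actual query: the table is reduced to two columns, and an in-memory SQLite database (version 3.24 or newer, which accepts the same clause) stands in for PostgreSQL.

```python
import sqlite3

# Simplified statement; the real table has more columns.
INSERT_SQL = """
INSERT INTO customer_events (id, payload) VALUES (?, ?)
ON CONFLICT (id) DO NOTHING
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer_events (id TEXT PRIMARY KEY, payload TEXT)")

conn.execute(INSERT_SQL, ("c", '{"foo": 126}'))
conn.execute(INSERT_SQL, ("c", '{"foo": 999}'))  # same id: row is skipped
conn.execute(INSERT_SQL, ("d", '{"foo": 126}'))

rows = conn.execute(
    "SELECT id, payload FROM customer_events ORDER BY id"
).fetchall()
```

After these three statements, the row with id "c" still holds its original payload.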
The event_id must already be defined in the events table before upload.
The structure of the input file:
id,customer_entity_id,event_id,source_id,event_time,type,version,payload,created
"c",,md5(source_id || type || version),"test_source",2018-06-23T12:34:56Z,"subscribed","1-0-0","{""foo"": 126}","2018-06-23T12:34:56Z"
"d",,md5(source_id || type || version),"test_source",2018-06-23T12:34:56Z,"purchase","1-0-0","{""foo"": 126}","2018-06-23T12:34:56Z"
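The sample rows show event_id as the placeholder md5(source_id || type || version). Assuming this means the hexadecimal MD5 digest of the three values concatenated without a separator (which is what PostgreSQL's md5() returns for a text argument), and assuming UTF-8 encoding, the value could be precomputed like this. The function name event_id is hypothetical.

```python
import hashlib


def event_id(source_id, type_, version):
    """Hex MD5 of source_id, type and version joined without a separator.
    Assumes the strings are UTF-8 encoded before hashing."""
    joined = source_id + type_ + version
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


# Values taken from the two sample rows above.
subscribed_id = event_id("test_source", "subscribed", "1-0-0")
purchase_id = event_id("test_source", "purchase", "1-0-0")
```

Whatever way the id is produced, it has to match a row that already exists in the events table.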
Upserting customer events
"action": "upsert_customer_events"
This inserts new events. If there is a conflict on the id column, it updates payload and leaves everything else untouched. The input file should have the same structure as for loading new customer events (customer_events.csv).
It can be useful when you have a set of events loaded in the database and later you find out that the payloads are missing some field and you want to backfill the data.
This must be done in a backward-compatible way (adding a new field to the JSON shouldn't break any follow-up customer attributes). Before using this, make sure you know what you are doing and understand the implications.
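A backfill of this kind can be sketched with INSERT ... ON CONFLICT DO UPDATE that touches only payload. This is an assumption about the semantics, not the component's actual query: the table is reduced to three columns, the added "bar" field is made up for the example, and an in-memory SQLite database (3.24 or newer) stands in for PostgreSQL.

```python
import sqlite3

# On an id conflict only payload is overwritten; other columns keep their values.
UPSERT_SQL = """
INSERT INTO customer_events (id, type, payload) VALUES (?, ?, ?)
ON CONFLICT (id) DO UPDATE SET payload = excluded.payload
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer_events (id TEXT PRIMARY KEY, type TEXT, payload TEXT)"
)

# Original load.
conn.execute(UPSERT_SQL, ("c", "subscribed", '{"foo": 126}'))
# Backfill: same id, payload gains a hypothetical "bar" field.
# The differing type value is ignored because only payload is updated.
conn.execute(UPSERT_SQL, ("c", "changed", '{"foo": 126, "bar": 1}'))

row = conn.execute("SELECT id, type, payload FROM customer_events").fetchone()
```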
Upserting attribute tags
"action": "upsert_attribute_tags"
This upserts tag_ids for the attributes in the attribute_tags table. It doesn't delete anything from the table.
attribute_id,tag_id
mailchimp_email,my_tag_1
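A minimal sketch of the "add, never delete" behaviour, under the assumption that each (attribute_id, tag_id) pair is unique in attribute_tags; the README does not state the actual key. The function name upsert_attribute_tags and the pre-existing existing_tag row are hypothetical, and an in-memory SQLite database (3.24 or newer) stands in for PostgreSQL.

```python
import csv
import io
import sqlite3

# The sample input file from above.
SAMPLE = "attribute_id,tag_id\nmailchimp_email,my_tag_1\n"

conn = sqlite3.connect(":memory:")
# Assumption: the pair (attribute_id, tag_id) is the unique key.
conn.execute(
    "CREATE TABLE attribute_tags ("
    "attribute_id TEXT, tag_id TEXT, PRIMARY KEY (attribute_id, tag_id))"
)
# A tag that is already in the table and absent from the input file.
conn.execute("INSERT INTO attribute_tags VALUES ('mailchimp_email', 'existing_tag')")


def upsert_attribute_tags(conn, csv_file):
    """Add every pair from the CSV; pairs already present are left alone."""
    pairs = [(r["attribute_id"], r["tag_id"]) for r in csv.DictReader(csv_file)]
    conn.executemany(
        "INSERT INTO attribute_tags (attribute_id, tag_id) VALUES (?, ?) "
        "ON CONFLICT (attribute_id, tag_id) DO NOTHING",
        pairs,
    )


upsert_attribute_tags(conn, io.StringIO(SAMPLE))
upsert_attribute_tags(conn, io.StringIO(SAMPLE))  # repeating the load changes nothing

tags = conn.execute(
    "SELECT attribute_id, tag_id FROM attribute_tags ORDER BY tag_id"
).fetchall()
```

The pre-existing tag survives both loads, and the new tag appears exactly once.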