Transform events (CR)

Transform events into suitable format for customer_events table

The purpose of this component is to prepare the events' format for loading the events into the customer_events table.

CSV structure

Before loading data into the customer_events table, we would need to transform the data into a specific structure so that it is ready for loading.

Here are the columns required in the csv, their data type and the values expected

Column Name	Type	Values
source_id	string	The source id of a source defined under Sources in Meiro Business Explorer It should be in snake casing and human readable.
type	string	The type of event defined under Events in Meiro Business Explorer It should be in snake casing and human readable.
version	string	The version of event defined under Events in Meiro Business Explorer It should follow this format X-X-X. Example: 1-0-0
user_identifiers	string	A concatenated string of identifiers of a customer. It can be a single identifier (like an email or phone number) or a combination of identifiers. Example: john.doe@meiro.io, john.doe@meiro.io420987654321
event_time	string - in datetime format	A datetime string in this format 'YYYY-MM-DD HH:MM:SS' This event time must be in UTC timezone
payload	json	This is where the bulk of the data goes. So long as the payload is a valid json format, anything goes.

Remember: Events should be supplied in csv format, with these 6 columns.
source_id, type, version, user_identifiers, event_time and payload

Sample code in Python

For example, given a file in data/in/files/events_mc_subscribed.ndjson with the following 2 rows:

{"email":"robin@meiro.io", "meta": {"date": "2018-08-18T14:15:16Z"}, "status": "subscribed", "list_id": "12345b", "list_name": "Loyal customers"}
{"email":"foo@bar.io", "meta": {"date": "2018-08-18T15:16:17Z"}, "status": "subscribed", "list_id": "12345b", "list_name": "Loyal customers"}

The goal is to produce data/out/tables/events_mc_subscribed.csv, which can be uploaded to customer_events table:

source_id	type	version	user_identifiers	event_time	payload
mailchimp	subscribed	1-0-0	robin@meiro.io	2018-08-18T14:15:16Z	{"email":"robin@"... - 1:1 copy of original}
mailchimp	subscribed	1-0-0	foo@bar.io	2018-08-18T15:16:17Z	{"email":"foo@bar.io"... - 1:1 copy of original}

Here is a sample code in python to do this transformation.

import json, csv
from os.path import exists

if exists('/data/in/files/events_mc_subscribed.ndjson'):
    with open('/data/in/files/events_mc_subscribed.ndjson','r') as inf, open('/data/out/tables/events_mc_subscribed.csv', 'w') as outf:
        outcsv = csv.DictWriter(outf, ['source_id', 'type', 'version', 'user_identifiers', 'event_time', 'payload'])
        outcsv.writeheader()
        for l in inf:
            jl = json.loads(l)
            if not jl['user_id']:
                continue
            outcsv.writerow({
                'source_id': 'mailchimp',
                'type': 'subscribed',
                'version': '1-0-0',
                'user_identifiers': jl['email'],
                'event_time': jl['meta']['date'],
                'payload': json.dumps(jl)
            })

Sample code in SQL

It is also possible to skip the transformation step in Python and transform the data during extraction with a SQL script. Below is an example.

Assume we have mc_subscription_status table with these data:

email	status	list_id	list_name	status_date
robin@meiro.io	subscribed	12345b	Loyal customers	2018-08-18T14:15:16Z
foo@bar.io	subscribed	12345b	Loyal customers	2018-08-18T15:16:17Z

Here is a sample code in python to do this transformation.

WITH base as (
  SELECT email, status, list_id, list_name, status_date
  FROM mc_subscription_status
  )
SELECT 
	'mailchimp' as source_id,
    'subscribed' as type,
    '1-0-0' as version,
    status_date as event_time,
	email as user_identifiers,
    to_json(base)
FROM base
WHERE status = 'subscribed'

Meiro Implementation Training

SOPs for Project Implementation

How to Request for FBM Access to Clients' Accounts

How to set up MI and ME API connection in CDP

Data Audit Checklist

Format events

Transform events (CR)

Load events to CDP

Load events to CDP (CR)

Customer Entity Retention Principles

How to Implement Meiro Events in MI and CDP and set up alerts for Meiro Events

Identity Stitching Setup

Identity Stitching rules

How to do Quality Checking on Identity Stitching

Identity Stitching edge cases and solutions

How to rerun identity stitching for all/some entities/events and recalculate attributes

Problems with large customer entities and how to investigate and solve them

Description

Attribute data types and sample values

Compound attributes

How to create context queries that require information of every customer entity

How to troubleshoot when there are no attribute values

How to delete attributes correctly

How to set up attributes in Meiro Business Explorer

How to generate simulated attributes values for demo purpose

Tutorial: How to set up RFM attributes (WIP)

Set export destinations

Facebook Custom Audience destination - upsert

The Trade Desk export destination

Google Analytics Destination

Criteo Destination

Twitter Ads Audience Destination

Google Analytics 4 Destination

Facebook Conversion API Implementation

How to find credentials for GA4 destination

Step-by-step guide for creating reports in Metabase

Set Up Business Explorer Instance & QA

Workspace documentation

Workspaces Monitoring Alert & Analyst Duty

Best Practices for Insights Creation

Component Testing and Documentation guide

Best Practices for Attributes Creation

Best Practices when setting up Destinations

Tutorial: How to implement product feed

Project Handover Procedures

Project Management Handbook

Project Management FAQ

How PMs work with Jira

Transform events (CR)

Transform events into suitable format for customer_events table

CSV structure

Sample code in Python

Sample code in SQL

No Comments