Customer Entity Retention Principles
We use two approaches to Customer Entity Retention: Entity based and Event based.
Entity based
How it works:
If a customer entity has all identifying attributes set to NULL and has had no new event for XY seconds, the entity is deleted.
How to use it:
- In the cdp_attr.attributes table there is a flag, is_identifying, marking whether the attribute is identifying or not. By default all attributes are identifying. Change the flag from 1 to 0 to mark an attribute as not identifying (it should not be used in the customer retention calculation).
- In public.settings there is a row with id customer_entity_ttl and an integer value giving the period of inactivity (in seconds) after which the entity is removed.
Example: The flag is_identifying = 0 is set for all attributes except the email and phone attributes. In the public.settings table there is a row with id customer_entity_ttl and value 7776000 (90 days in seconds).
Customer entity A has both the email and phone attributes empty (IS NULL in pivoted_customer_attributes), and now() - max(event_time) over all events belonging to this entity is greater than 7776000: the entity is removed.
Customer entity B has now() - max(event_time) over all events belonging to this entity greater than 7776000 and the phone attribute is NULL, but the email attribute has a value: the entity IS NOT REMOVED.
Both conditions (all identifying attributes are NULL, no activity for XY seconds) must be met for a customer entity to be removed.
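The two-condition rule can be sketched in Python as follows. This is only an illustration under stated assumptions, not the actual implementation: the function name should_remove_entity and its parameters are hypothetical, and the identifying attribute values and the latest event time are assumed to have already been read from pivoted_customer_attributes and from the entity's events.

```python
from datetime import datetime, timedelta, timezone
from typing import Mapping, Optional

# Value of the customer_entity_ttl setting from the example: 90 days in seconds.
CUSTOMER_ENTITY_TTL = 7776000


def should_remove_entity(
    identifying_values: Mapping[str, Optional[str]],
    last_event_time: datetime,
    now: datetime,
    ttl_seconds: int = CUSTOMER_ENTITY_TTL,
) -> bool:
    """Return True only when both retention conditions hold (hypothetical helper)."""
    # Condition 1: every identifying attribute is NULL (None in Python).
    # Assumption of this sketch: an entity with no identifying attributes
    # at all also counts as "all NULL".
    all_null = all(value is None for value in identifying_values.values())
    # Condition 2: time since the latest event is greater than the TTL.
    inactive = (now - last_event_time).total_seconds() > ttl_seconds
    return all_null and inactive


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
last_event = now - timedelta(days=91)  # older than the 90-day TTL

entity_a = {"email": None, "phone": None}             # all identifying attributes NULL
entity_b = {"email": "b@example.com", "phone": None}  # email still has a value

print(should_remove_entity(entity_a, last_event, now))  # True
print(should_remove_entity(entity_b, last_event, now))  # False
```

Entity B is kept even though it is inactive, because one identifying attribute still has a value.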
Event based
How it works:
If there are events older than XY seconds, remove them.
How to use it:
There is a table cdp_ce.events with columns id and ttl (time-to-live). Add a new row for each event ID you want to remove, with the TTL in seconds.
Example: There is a row for the event ID of page_view from ME with ttl = 7776000 (90 days in seconds). A customer entity has customer events of type page_view assigned to it that are older than 90 days; these events are removed. No other event IDs have a row in this table, so those events will NOT be removed.
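The event-based rule can be sketched in Python in the same spirit. The names CustomerEvent, events_to_remove and ttl_by_event_id are hypothetical, the event ID is simplified to a plain string, and the "purchase" event is invented purely for illustration; the dictionary stands in for the rows of cdp_ce.events.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, NamedTuple


class CustomerEvent(NamedTuple):
    event_id: str         # simplified event ID, e.g. "page_view"
    event_time: datetime  # when the event happened


def events_to_remove(
    events: List[CustomerEvent],
    ttl_by_event_id: Dict[str, int],
    now: datetime,
) -> List[CustomerEvent]:
    """Return the events whose age exceeds the TTL configured for their event ID."""
    expired = []
    for event in events:
        ttl = ttl_by_event_id.get(event.event_id)
        if ttl is None:
            # No row for this event ID: the event is never removed.
            continue
        if (now - event.event_time).total_seconds() > ttl:
            expired.append(event)
    return expired


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
ttl_by_event_id = {"page_view": 7776000}  # 90 days in seconds

events = [
    CustomerEvent("page_view", now - timedelta(days=91)),  # older than the TTL
    CustomerEvent("page_view", now - timedelta(days=10)),  # still within the TTL
    CustomerEvent("purchase", now - timedelta(days=200)),  # no TTL row configured
]

for event in events_to_remove(events, ttl_by_event_id, now):
    print(event.event_id, event.event_time.date())  # page_view 2023-10-02
```

Only the 91-day-old page_view event is selected; the old "purchase" event stays because no TTL is configured for it.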