ePrivacy and GPDR Cookie Consent by Cookie Consent Skip to main content

Profile Stitching edge cases and solutions (wip)

This article describes the profile stitching (PS) issues / edge cases Analysts have faced during implementation. We hope that it can serve as a reference point on the possible solutions to PS.

General tips

Remove trailing and leading whitespaces, and use lower case for emails

LOWER(TRIM(payload ->> 'email'))

Validate your emails and phone numbers during data processing

# email regular expression
import re
EMAIL_REGEX = r'(^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$)'
email_pattern = re.compile(EMAIL_REGEX)
if email_pattern.fullmatch(email):	
	return email.strip().lower()
# phonenumbers package can parse phone numbers with country code and split it to country code and phone number
import phonenumbers
x = phonenumbers.parse(phone_number)
event['payload']['country_code'] = x.country_code
event['payload']['phone_number_validated'] = x.national_number

Issue 1: Physical/In-store events

Some clients have a physical store or have in-store events for customers. In most cases, a tablet is used by multiple customers for browsing, logging in and making purchases. This results in many different customers having the same meiro user id because they used the same device, which causes large entities during PS

Possible fixes:

fixes and prevention:
  • Do not stitch meiro user id for these problematic events
  • Add flags/geolocation fields (like IP addresses) and update PS rules to exclude the appropriate fields
-- in this example, the offline flag is used to show if the event was in-store (offline = '1') or online (offline = '0')
CASE 
	WHEN payload->'payload'->>'offline' = '1'
    THEN NULL 
	ELSE payload->>'user_id' 
END
  • Process change - allow for browsing on incognito mode only. Close and open the incognito window for each new customer. The action of closing and re-opening incognito windows will generate new meiro user ids, ensuring that different customers will have different meiro user ids.
  • In Meiro Events, implement MeiroEvents.resetIdentity() on specific triggers, like on a log out event. When MeiroEvents.resetIdentity() method is called, the tracked meiro user id will be refreshed, so another customer using the same device will be tracked under a new meiro user id.

Issue 2: Emails with same tracking ID opened by different users

Many Email Direct Marketing (EDM) platforms now decorate their emails links with an ID to track user's browsing activity. This ID is usually unique per email campaign and per email address. However, users may forward their email or email links containing this ID to other users, which causes all such users to be stitched together because they have the same email ID.

Possible fix:

Fix:
  • Edit PS rule to only stitch the FIRST instance of email ID, ignoring all subsequent events with the same email ID. This can be done by add specific rule to value_expression_prep in cdp_ps.stitching_attributes_config for the event you would like to fix. Example of the rule:
CASE
    -- a general check on the id to ensure it has 32 characters
    WHEN
        length(payload->'url_parts'->'query_parsed'->'sid'->>0)=32
    THEN
        CASE
			-- compares the list of identifiers already stitched (m.identifiers and p.identifiers), against the identifier
            -- if it has been stitched already (count is > 0), then do not stitch this identifier
            -- here, 'stitching_se_sid' is the attribute_id and
            -- st.payload->'url_parts'->'query_parsed'->'sid'->>0 is the identifier in the payload
			WHEN 
            (
                (
                    SELECT COUNT(*)
                    FROM cdp_ps.matching m
                    INNER JOIN cdp_ce.customer_events ce ON ce.id::uuid = m.customer_event_id
                    WHERE ce.source_id = 'eshop_website' AND ce."type" = 'page_view'
                    AND m.identifiers && ARRAY [md5('stitching_se_sid' || (st.payload->'url_parts'->'query_parsed'->'sid'->>0))::uuid]
                ) > 0
            ) OR
            (
                (
                    SELECT COUNT(*)
                    FROM cdp_ps.prep p
                    INNER JOIN cdp_ce.customer_events ce ON ce.id::uuid = p.customer_event_id
                    WHERE ce.source_id = 'eshop_website' AND ce."type" = 'page_view'
                    AND p.identifiers && ARRAY [md5('stitching_se_sid' || (st.payload->'url_parts'->'query_parsed'->'sid'->>0))::uuid]
                ) > 0
            )
        THEN
            NULL
        ELSE
            payload->'url_parts'->'query_parsed'->'sid'->>0
    END
END