Data compliance refers to adherence to the regulatory requirements and guidelines that govern the lifecycle of personal data, from collection and processing through storage and sharing. With legislation such as the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA) now in force, it is crucial that businesses meet the requirements for properly handling their customers' personal data. Companies that do not comply run the risk of severe fines and legal action; GDPR, CCPA, and data compliance are not optional, a mere suggestion, or a simple to-do.
The European Union (EU) GDPR came into effect on May 25, 2018. It aims to protect the privacy of individuals in the EU by regulating the collection, processing, and storage of personal data. Regardless of where an organization is located, the regulation applies to it if it processes or stores the personal data of individuals in the EU. The GDPR requires organizations to have a lawful basis, such as the individual's consent, for collecting personal data, to give individuals the right to access, rectify, and delete their data, and to notify them of data breaches.
The California state law, CCPA, came into effect on January 1, 2020. It aims to give California residents greater control over the personal information collected about them by businesses operating in the state. It applies to businesses that meet certain thresholds, such as having annual gross revenue of over $25 million or deriving at least 50% of their annual revenue from the sale of California residents' personal information. Because of CCPA, California residents have the right to access and delete their personal data and to opt out of its sale.
The responsibility to properly handle customer data raises questions: how do companies know whether they are GDPR and CCPA compliant? And if they are not, how do they reach that standard to safeguard not only their customers but also themselves?
More simply, how can companies be data compliant and remove customers who decide to be "forgotten" permanently, instantly and completely?
This task may seem daunting, but thankfully there's a range of measures that organizations can enforce. Here are some key steps that organizations can take to comply with GDPR and CCPA:
Now imagine someone unsubscribes from an email generated by a marketing automation tool from the Services division. That tool is one of 100 applications a business uses. Most of them also have a record placeholder for a contact. Some send emails out to people as well. So a few days go by and the person gets another email from the business's Digital Products division and the individual, yet again, unsubscribes. Then the individual receives yet another email from another automated system and another and another. This is partially what the GDPR and CCPA laws sought to address. However, solving this problem isn't a casual button press. These systems all have varied versions of a customer interaction and they all act independently from each other.
Intricity has seen entire teams dedicated to trying to fix this problem with little success. Teams would centralize all the data into one place and run a series of SQL queries attempting to match records. They might match email addresses which for a single contact might show the following breakdown of 3 contacts:
Then there's a computation problem. Matching would be feasible if a business only had to check a single record against the rest. But run against 500 million records, the problem becomes a quadratic compute issue, because every record is compared to every other record.
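To put a number on that growth, here is a minimal Python sketch that counts the unordered pairs in a naive all-against-all comparison. The function name is illustrative and not taken from any product; it only assumes that each pair of records is compared once.

```python
def pairwise_comparisons(n: int) -> int:
    """Number of unordered record pairs when every record is compared to every other."""
    return n * (n - 1) // 2


# Each 10x increase in records costs roughly 100x more comparisons.
for n in (1_000, 1_000_000, 500_000_000):
    print(f"{n:>11,} records -> {pairwise_comparisons(n):,} comparisons")
```

At 500 million records this comes to roughly 1.25 x 10^17 comparisons, which is why a brute-force approach on a fixed-size database tends to stall.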
The most common solution in the market is to have an organization send its customer data over to a 3rd party. The 3rd party has a giant graph database that the records are compared against to identify duplicates. Then the 3rd party ships the records back with a consolidated identity assigned to each duplicate. This round trip typically carries a latency of roughly 24 hours. Setting up a legal contract between the two parties often takes over 6 months (corporations aren't going to casually cough up their customer data to a 3rd party).
There's also the cost, which can often be the most prohibitive part. It's not uncommon for these 3rd parties to charge a per-record fee, meaning a typical deal can easily be around $500,000 annually.
While they may not know it, Snowflake customers are sitting on a giant processing beast. Snowflake is capable of scaling up and processing a near limitless amount of data. All it needs is the commensurate code to do so.
Truelty is a code generator that resides within Snowflake deployments and auto-generates the code for processing duplicate identities, using both equality and fuzzy matches. Because it runs locally on the processing power of their Snowflake instance, organizations never have to ship their data outside the "4 walls" of their data warehouse.
As Truelty's generated queries execute, a new ID gets generated which clusters the disparate versions of a customer into a singular ID.
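The idea of clustering disparate versions of a customer under one ID can be illustrated with a small Python sketch. This is not Truelty's generated code, which runs as queries inside Snowflake; it is a generic union-find example under stated assumptions. The records, field names, the `is_match` rule, and the 0.85 similarity threshold are all hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical records from different systems; IDs and fields are illustrative only.
records = [
    {"id": "crm-1", "name": "Jonathan Smith", "email": "jon.smith@example.com"},
    {"id": "mkt-7", "name": "Jon Smith", "email": "JON.SMITH@example.com"},
    {"id": "web-3", "name": "Jonathan Smyth", "email": "jsmith@example.org"},
    {"id": "crm-2", "name": "Maria Garcia", "email": "maria@example.com"},
]


def is_match(a, b, threshold=0.85):
    """Equality match on normalized email, otherwise a fuzzy match on name."""
    if a["email"].strip().lower() == b["email"].strip().lower():
        return True
    ratio = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    return ratio >= threshold


# Union-find: every record starts as its own cluster.
parent = {r["id"]: r["id"] for r in records}


def find(x):
    """Return the representative ID of x's cluster."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving keeps lookups short
        x = parent[x]
    return x


def union(x, y):
    """Merge the clusters containing x and y."""
    rx, ry = find(x), find(y)
    if rx != ry:
        parent[max(rx, ry)] = min(rx, ry)  # deterministic choice of representative


# Naive all-pairs pass; fine for a toy list, not for hundreds of millions of rows.
for i, a in enumerate(records):
    for b in records[i + 1:]:
        if is_match(a, b):
            union(a["id"], b["id"])

cluster_id = {r["id"]: find(r["id"]) for r in records}
print(cluster_id)
```

In this toy data the three Smith records end up under one cluster ID even though "Jon Smith" and "Jonathan Smyth" do not match each other directly: both match "Jonathan Smith", and union-find carries that link through.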
The other added benefit is that the processing is so highly tuned that the clustering can be run intra-day.
With the unique records identified, when someone unsubscribes from one system, the user's ID can quickly be cross-referenced with the other IDs of that individual. The unsubscribe flags can be set manually to start, then scripted into each application system.
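As a rough sketch of that cross-reference, the Python below looks up every record that shares a cluster ID with the one that unsubscribed. The system names, record IDs, and the `cluster_of` mapping are hypothetical stand-ins for whatever the identity clustering step produces.

```python
from collections import defaultdict

# Hypothetical clustering output: (system, local record ID) -> cluster ID.
cluster_of = {
    ("services_marketing", "c-101"): "person-1",
    ("digital_products", "u-88"): "person-1",
    ("billing", "acct-5"): "person-1",
    ("digital_products", "u-91"): "person-2",
}

# Reverse index: cluster ID -> every record that belongs to that person.
members = defaultdict(list)
for key, cluster in cluster_of.items():
    members[cluster].append(key)


def records_to_unsubscribe(system, local_id):
    """Return every (system, local_id) that belongs to the same individual."""
    cluster = cluster_of[(system, local_id)]
    return sorted(members[cluster])


# One unsubscribe in the Services marketing tool fans out to all linked records.
targets = records_to_unsubscribe("services_marketing", "c-101")
for target_system, target_id in targets:
    print(f"flag unsubscribe in {target_system} for record {target_id}")
```

The resulting list can be worked by hand at first and later fed to per-application scripts that set the unsubscribe flag in each system.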
While GDPR and CCPA create a headache for organizations to manage, most people understand the purpose behind data compliance and the right to be forgotten. Having tools that can narrow down the impact of an unsubscribe and automate the identification of duplicates makes the process easy to adopt. Intricity can help organizations set up this process by automating duplicate identification and flagging application systems automatically.