Data Deduplication Factory

Jared Hillam

August 24, 2017


We recently had a discussion with a financial services company that had several concerns about moving to the cloud. Surprisingly, these concerns came not from the technical staff but from the business leaders. "I just don't trust it" was the statement from one of their executives. So, this article is for all those business executives who are being pitched a move to the cloud but feel uneasy about it.

In the magic land of Simpleville, your entire company lives under one enterprise application system. All the customer records live in one place, everybody around you uses the same systems you do, and you get to ride a unicorn past all the traffic on your way home. At least that’s what the ERP vendors would have us believe.

And then there’s the real world. Customer records live in dozens of application systems. Despite all your ERP standardization efforts, your ERP is just one of 20 mission-critical systems that run your business and contain customer data. A single customer record could live in all of those systems at the same time, with a slight variation in each. Imagine needing to send a mailer to all your customers. How would you do it? Would you mail every record, knowing that you’re going to waste millions of dollars on repeat mailers? No doubt there are ivory tower efforts to solve this problem 3-10 years from now… but how do you address this issue now?


Intricity is here to propose a simple cloud solution to this duplicate record problem: the Data Deduplication Factory (DDF). When you first connect the DDF, its guided machine learning capabilities present a best guess at which records are duplicates. As you give it feedback on matches and misses, it learns from each training loop and ultimately performs autonomously. Once you have the trained results, the DDF exposes them through a RESTful web service that you can integrate into your enterprise applications and use cases. You can also pull the results in multiple formats, such as .csv and JSON.
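To make the idea of a "best guess of duplicate records" concrete, here is a minimal Python sketch. It is not how the DDF works internally, which the article does not describe. It assumes plain string similarity from the standard library's `difflib` in place of machine learning, and it leaves out the feedback-driven training loop. The sample records, field names, function names, and the 0.8 threshold are all hypothetical, chosen only for illustration.

```python
import csv
import io
import json
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical customer records, as they might appear in three systems.
RECORDS = [
    {"id": "erp-1", "name": "Jonathan Smith", "email": "jon.smith@example.com"},
    {"id": "crm-7", "name": "Jon Smith", "email": "jon.smith@example.com"},
    {"id": "web-3", "name": "Maria Garcia", "email": "mgarcia@example.com"},
]


def normalize(record):
    """Join the compared fields into one lowercase string."""
    return f"{record['name']} {record['email']}".lower().strip()


def similarity(a, b):
    """Return a similarity ratio between 0 and 1 for two records."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def candidate_pairs(records, threshold=0.8):
    """List record pairs at or above the threshold, most similar first."""
    pairs = []
    for a, b in combinations(records, 2):
        score = similarity(a, b)
        if score >= threshold:
            pairs.append(
                {"left": a["id"], "right": b["id"], "score": round(score, 3)}
            )
    return sorted(pairs, key=lambda p: p["score"], reverse=True)


def to_json(pairs):
    """Serialize the candidate pairs as a JSON array."""
    return json.dumps(pairs, indent=2)


def to_csv(pairs):
    """Serialize the candidate pairs as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["left", "right", "score"])
    writer.writeheader()
    writer.writerows(pairs)
    return buf.getvalue()


pairs = candidate_pairs(RECORDS)
print(to_json(pairs))
print(to_csv(pairs))
```

With this sample data, the two "Smith" records are flagged as a likely duplicate pair and the third record is left alone. In a guided workflow, a person would confirm or reject each suggested pair, and a real system would use that feedback to improve its matching rather than relying on a fixed threshold.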

Intricity prices the DDF by data volume. Contact Intricity if you would like an estimate and a demo.
