About The Client

The Client, the leading off-price apparel and home fashions retailer in the U.S. and worldwide, ranked 85th on the 2019 Fortune 500 list. At the end of 2019, the Client had nearly $42 billion in annual sales, more than 4,500 stores in nine countries, four e-commerce sites, and approximately 286,000 associates.

About Intricity

Intricity is a team of specialized data management, data warehousing, and business intelligence experts. The team members at Intricity have been handpicked over the course of 20 years and represent the top talent globally in data-oriented disciplines. 

 

Migration: Challenges & Wins

Challenge

The Client had used DataStage extensively for many years. However, the move to the cloud created a landscape that DataStage was not originally designed for, and the cost of licensing the ETL environment was prohibitive for the organization. After studying the capabilities of the available ETL platforms, the Client selected Talend as the target platform. The primary barrier to adoption was the massive footprint of the existing DataStage code: replacing all of it by hand was going to be a massive effort. To determine the cost, the Client requested a quote from its trusted System Integration (SI) partner. The quote came to ~$20,000,000, with a contingency requiring a study of the source systems, which was quoted at an additional ~$300,000. The sheer cost of the endeavor led the organization’s CIO to tell the cloud providers that the project was on hold unless a more viable refactoring approach could be found.

Navigating Constraints: 

  • The Client codebase was truly enormous, with both DataStage and Netezza needing to be migrated, to Talend and Snowflake respectively.
  • The conceptual differences between DataStage and Talend were no small issue; the architect of the project compared it to turning “Jupiter into a banana”.
  • The mainframe transformations could be supported in Talend, but Talend did not have the same level of tuning for them that IBM had put into DataStage, and adding it was not on the Talend product team’s roadmap.

Win 1: Analyzer

The Snowflake representative reached out to Intricity after the CIO decided to put everything on hold. Intricity presented the BladeBridge code conversion product and showed how to size the effort with Analyzer. The Client team provided Intricity with the metadata from their DataStage environment and the SQL from Netezza, and Intricity ran Analyzer at no charge to generate a full inventory of all the jobs with counts by complexity. This analysis also allowed Intricity to tie its services quote to empirical data. The quote was ~$3,000,000, a fraction of the original SI quote. Because the quote was connected to the empirical counts and the complexity findings, both Intricity and the Client were confident in it. Intricity had provided the Analyzer results for free, whereas the competing SI had proposed a comparable study for ~$300,000.
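To make the idea of an inventory tied to a quote concrete, here is a minimal sketch of how job counts by complexity could feed an effort estimate. It is not Analyzer's method: the `Job` record, the stage-count thresholds, the job names, and the hours-per-job figures are all hypothetical values chosen for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class Job:
    name: str         # hypothetical job name
    stage_count: int  # hypothetical complexity signal: number of stages in the job


def classify(job: Job) -> str:
    # Thresholds are illustrative assumptions, not Analyzer's actual rules.
    if job.stage_count <= 5:
        return "low"
    if job.stage_count <= 15:
        return "medium"
    return "high"


def inventory(jobs: Iterable[Job]) -> Counter:
    # Count how many jobs fall into each complexity bucket.
    return Counter(classify(job) for job in jobs)


def estimate_hours(counts: Counter, hours_per_job: Dict[str, float]) -> float:
    # Tie the estimate directly to the counts, one rate per bucket.
    return sum(counts[level] * hours_per_job[level] for level in counts)


# Made-up sample data; real inputs would come from the exported job metadata.
jobs = [
    Job("load_sales", 3),
    Job("load_stores", 4),
    Job("merge_inventory", 12),
    Job("mainframe_feed", 40),
]
counts = inventory(jobs)
rates = {"low": 2.0, "medium": 8.0, "high": 24.0}  # assumed hours per job
total_hours = estimate_hours(counts, rates)
```

Because the total is a plain function of the counts, anyone reviewing the estimate can trace it back to the inventory line by line.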

Win 2: Code Migration

In collaboration with the BladeBridge and Client teams, Intricity converted the enormous quantity of code from DataStage to Talend. The Client team handled the data testing tasks, while Intricity and BladeBridge converted the code to the point of unit test readiness. The BladeBridge code converter allowed the Intricity team to migrate in pattern sets rather than converting individual code snippets by hand, which massively reduced manual effort. With each iteration, the BladeBridge configurations for converting DataStage to Talend were refined further, ultimately automating ~80% of the code migration process. The remaining ~20% was code whose patterns recurred too rarely to justify adapting the BladeBridge converter and that did not pass a unit test as converted. These jobs usually required only some manual tweaks to fully convert, since BladeBridge had already converted the core. The Analyzer results acted as an effective inventory and tracking tool throughout the code conversion effort.
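The pattern-set approach can be pictured as a rule table that grows with each iteration. The sketch below is a simplified illustration and does not reflect BladeBridge's actual configuration format: the pattern names, template names, and job names are hypothetical.

```python
from typing import Dict, List, Tuple


def convert_by_pattern(
    jobs: Dict[str, str], rules: Dict[str, str]
) -> Tuple[Dict[str, str], List[str]]:
    """Apply one rule per source pattern; jobs without a rule go to manual work."""
    converted: Dict[str, str] = {}
    manual: List[str] = []
    for name, pattern in jobs.items():
        template = rules.get(pattern)
        if template is None:
            manual.append(name)
        else:
            converted[name] = template
    return converted, manual


# Hypothetical jobs, each tagged with the source pattern it follows.
jobs = {
    "job_a": "lookup_then_load",
    "job_b": "lookup_then_load",
    "job_c": "aggregate_then_load",
    "job_d": "aggregate_then_load",
    "job_e": "one_off_custom_logic",
}

# Iteration 1: a single rule covers every job that shares its pattern.
rules = {"lookup_then_load": "target_lookup_load"}
converted_1, manual_1 = convert_by_pattern(jobs, rules)

# Iteration 2: adding one more rule converts a whole additional set of jobs.
rules["aggregate_then_load"] = "target_aggregate_load"
converted_2, manual_2 = convert_by_pattern(jobs, rules)
```

A pattern that appears only once, like the last job here, is a candidate for manual finishing rather than a new rule.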

Win 3: Adaptation to Mainframe Files

About 4 months into the conversion project, it became clear that mainframe transformations were going to pose a performance problem in Talend, which caused no small concern among all parties involved. Since the mainframes were an IBM product, IBM had made an outsized effort to tune DataStage for processing mainframe data. The Talend product team was unable to make such core changes to the Talend engine within the project timeline.

Intricity and BladeBridge came up with a workaround that leveraged the flexibility of the BladeBridge configuration platform. The jobs that carried these mainframe transformations were offloaded to a PySpark cluster, and when processing finished, the results were re-injected into the Talend workflow. This workaround provided the speed necessary to process the mainframe files, and the code for this maneuver was generated within the BladeBridge tooling. This was a testament to the range of code generation flexibility provided by BladeBridge.
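The offload-and-return handoff can be sketched as follows. This illustrates the general pattern only, not the code that was generated for the project: the `OffloadJob` fields and all paths are hypothetical, and the sketch assumes the step is launched with Spark's standard `spark-submit` command, with the script's own arguments following the script name.

```python
import subprocess
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class OffloadJob:
    """Hypothetical description of one step moved out of the ETL workflow."""
    script: str       # PySpark script for the step (hypothetical path)
    input_path: str   # where the workflow drops the mainframe extract
    output_path: str  # where the workflow expects the processed result


def build_command(job: OffloadJob) -> List[str]:
    # spark-submit takes the application script followed by its arguments.
    return ["spark-submit", job.script, job.input_path, job.output_path]


def run_offload(job: OffloadJob, runner: Callable[[List[str]], int]) -> str:
    """Run the step outside the workflow and hand back the output location."""
    exit_code = runner(build_command(job))
    if exit_code != 0:
        # Fail loudly so the calling workflow never picks up partial output.
        raise RuntimeError(f"offloaded step failed with exit code {exit_code}")
    return job.output_path


def subprocess_runner(cmd: List[str]) -> int:
    # Blocks until the submitted job finishes, then reports its exit code.
    return subprocess.run(cmd).returncode
```

A workflow step would call `run_offload(job, subprocess_runner)` and continue with the returned path. Passing the runner as a parameter keeps the handoff logic testable without a cluster.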

Win 4: Converted Code in Snowflake

The Client converted all of its DataStage and Netezza assets to Talend and Snowflake and was able to leverage the power of Snowflake’s compute layer. This gave the Client the power to roll out analytics that were no longer constrained by hard-wired compute and storage limitations.