Information
This case study showcases the expertise of ProCogia’s Senior Data Analytics Consultant, Jens Sommerfeld, in optimizing ETL processes and improving Redshift cluster efficiency for the world’s largest retailer.
The Challenge
As the retailer’s operations expanded, so did the size and complexity of its data processes. The time required to run queries and weekly scheduled ETL reports continued to increase to the point where queries for the Weekly Business Review could not be delivered on time. Additionally, ad-hoc reports took longer to execute and, most of the time, would time out without producing any results.
Technical Implementation
Identify all daily and weekly jobs and remove unnecessary queries. For other files review the queries and optimize SQL syntax.
Run a query to identify all automated daily/weekly jobs by run time, final status, who created the job, who updated the job last.
Reach out to all data job owners and have them review if the job is still needed to run every day/week.
Identify employees that are no longer in the org and reach out to the team to identify if the scheduled jobs are still needed.
De-schedule all unneeded jobs.
Review all active queries for improvement opportunities.
Check if Metric jobs can be linked to the same job and remove duplicates.
Reach out to Job owners with improvements and review best practices to avoid further cluster slowdown.
Immediately, 20% of all daily and weekly running jobs were de-scheduled, either due to employees leaving the organization or the jobs no longer being needed.
A focus on the highest runtime jobs revealed several created by SQL beginners. Obvious syntax issues were addressed, and additional questions, such as those related to reporting time frames, were clarified with the creators. These improvements reduced the average runtime for these jobs from 60 minutes to an average of 5 minutes.
For weekly business reviews, several repeating queries were consolidated into one main query, and the rest were de-scheduled. A single job can be tied to several metric jobs, the benefit the query only has to be run once and the results can be shared across metric jobs.
For other queries, Junior Business Analysts were guided on best practices to improve queries, like temp tables, review of joins and creation of static table, etc.
Results
The overall weekly run time was reduced by 80% for the cluster. Also sharing best practices improved the speed to run ad-hoc queries. This procedure was repeated every 3 months to keep run times down to a minimum.