Reactor Log Analysis Tool
Over my three-month internship at Amazon in Vancouver, I gained valuable experience in my first software development position by implementing a Reactor Log Analysis Tool as my intern project. My team’s service performs many processing steps on offer data whenever a merchant updates their listings. While the team had numerous dashboards for monitoring the performance of individual operations, there was no way to examine the performance of the system as a whole, particularly the chain of operations triggered by a single merchant update. The Reactor Log Analysis Tool was proposed to provide insight into the performance of full operation sequences.
Because the service operates at a scale of terabytes per hour, the Reactor Log Analysis Tool had to be designed to handle that load. Through research and prototyping, I used AWS services such as Amazon EMR (Elastic MapReduce), Amazon Redshift, and AWS Lambda to support and automate the proposed data processing pipeline at scale, implementing it in a combination of Scala, Python, SQL, and TypeScript. Amazon QuickSight provided the dashboard visualizations of the log analysis summaries. At the end of the internship, I showcased my work in a 20-minute presentation to software developers and managers from the organization. Using my project, I showed that the theoretical time-to-resolution for a historical high-severity ticket could be reduced from dev-hours to dev-minutes, and I identified a group of clients producing potential inefficiencies within the system.