Site Reliability Engineer (SRE) - Application Support

About
Step forward into the future of technology with ZILO™. We’re here to redefine what’s possible in technology. While we’re trusted by the global Transfer Agency sector, our technology is truly flexible and designed to transform any business at scale. We’ve created a unified platform that adapts to diverse needs, offering the scalability and reliability legacy systems simply can’t match.

At ZILO™, our DNA is built on Character, Creativity, and Craftsmanship. We face every challenge with integrity, explore new ideas with a curious mind, and set a high standard in every detail. We are a team of dedicated professionals where everyone, regardless of their role, drives our progress and creates real impact. If you’re ready to shape the future, let’s talk.

Requirements
We’re looking for a Site Reliability Engineer to join our SRE team: someone who thrives on solving complex production issues, understands how applications behave in the real world, and takes pride in keeping systems reliable and performant.

This is not a platform engineering role. You won’t just be spinning up Kubernetes clusters or building infrastructure; you’ll be deeply involved in understanding our applications (what they do and how they operate), troubleshooting real-world issues, and working directly on improvements that impact our customers every day.

What You’ll Do
Incident Response & Troubleshooting: Investigate and resolve incidents raised by clients, diving into logs, metrics, and application code to identify root causes.
Application Debugging: Work across our core stack (Java, Golang, and Python) to trace and fix issues affecting reliability or performance.
Data Fixes: Perform data investigation and fixes using Postgres.
Operational Excellence: Patch and maintain Kubernetes clusters and other production systems.
SRE Roadmap: Contribute to the continuous improvement of our observability, reliability, and automation initiatives.
This role is hybrid and will require regular weekly attendance at our London office.

Qualifications
Solid experience with application debugging in at least one of: Java, Golang, or Python.
A good grasp of PostgreSQL: enough to run queries, analyse data, and perform safe fixes.
Familiarity with Kubernetes and modern cloud platforms (AWS, GCP, or Azure).
Understanding of incident management and observability tools (Grafana, Prometheus, etc.).
A mindset focused on reliability, quality, and ownership.
Benefits
Enhanced leave - 38 days inclusive of 8 UK Public Holidays
Private Health Care including family cover
Life Assurance - 5x salary
Flexible working - work from home and/or in our London Office
Employee Assistance Program
Company Pension (Salary Sacrifice options available)
Access to training and development
Buy and Sell holiday scheme
The opportunity for “work from anywhere/global mobility”
Location:
City of London, England, United Kingdom
Job Type:
Full-time
