Site Reliability Engineer (SRE) - Application Support
About
Step forward into the future of technology with ZILO™.
We’re here to redefine what’s possible in technology. While we’re trusted by the global Transfer Agency sector, our technology is truly flexible and designed to transform any business at scale. We’ve created a unified platform that adapts to diverse needs, offering the scalability and reliability legacy systems simply can’t match.
At ZILO™, our DNA is built on Character, Creativity, and Craftsmanship. We face every challenge with integrity, explore new ideas with a curious mind, and set a high standard in every detail.
We are a team of dedicated professionals where everyone, regardless of their role, drives our progress and creates real impact. If you’re ready to shape the future, let’s talk.
Requirements
We’re looking for a Site Reliability Engineer to join our SRE team — someone who thrives on solving complex production issues, understands how applications behave in the real world, and takes pride in keeping systems reliable and performant.
This is not a platform engineering role. You won’t just be spinning up Kubernetes clusters or building infrastructure — you’ll be deeply involved in understanding our applications, what they do and how they operate, troubleshooting real-world issues, and working directly on improvements that impact our customers every day.
What You’ll Do
Incident Response & Troubleshooting: Investigate and resolve incidents raised by clients, diving into logs, metrics, and application code to identify root causes.
Application Debugging: Work across our core stack — Java, Golang, and Python — to trace and fix issues affecting reliability or performance.
Data Fixes: Investigate data issues and apply fixes in PostgreSQL.
Operational Excellence: Patch and maintain Kubernetes clusters and other production systems.
SRE Roadmap: Contribute to the continuous improvement of our observability, reliability, and automation initiatives.
This role is hybrid and will require regular weekly attendance at our London office.
Qualifications
Solid experience with application debugging in at least one of: Java, Golang, or Python.
A good grasp of PostgreSQL — enough to run queries, analyse data, and perform safe fixes.
Familiarity with Kubernetes and modern cloud platforms (AWS, GCP, or Azure).
Understanding of incident management and observability tools (Grafana, Prometheus, etc.).
A mindset focused on reliability, quality, and ownership.
Benefits
Enhanced leave - 38 days inclusive of 8 UK Public Holidays
Private Health Care including family cover
Life Assurance - 5x salary
Flexible working - work from home and/or in our London Office
Employee Assistance Program
Company Pension (Salary Sacrifice options available)
Access to training and development
Buy and Sell holiday scheme
The opportunity for “work from anywhere/global mobility”
Location: City of London, England, United Kingdom
Job Type: Full-time