Site Reliability Engineer
About
Step forward into the future of technology with ZILO.
We’re here to redefine what’s possible in technology. While we’re trusted by the global Transfer Agency sector, our technology is truly flexible and designed to transform any business at scale. We’ve created a unified platform that adapts to diverse needs, offering the scalability and reliability legacy systems simply can’t match.
At ZILO, our DNA is built on Character, Creativity, and Craftsmanship. We face every challenge with integrity, explore new ideas with a curious mind, and set a high standard in every detail.
We are a team of dedicated professionals where everyone, regardless of their role, drives our progress and creates real impact. If you’re ready to shape the future, let’s talk.
About the Role
We’re looking for a Senior Site Reliability Engineer to join our SRE team. This is a hybrid role that blends deep platform engineering with application-level troubleshooting. You’ll be responsible for the stability, performance, and resilience of our cloud-native infrastructure while also being on the front line when issues affect our users and services.
This is a high-impact role ideal for someone who thrives in a modern DevOps culture, cares about both systems uptime and customer experience, and is comfortable working across infrastructure and application layers.
Key Responsibilities
Infrastructure Reliability & Operations
- Own patching, upgrades, and maintenance of AWS and EKS infrastructure
- Define and implement resilience and failover strategies for microservices and core platforms
- Continuously monitor and improve system performance, cost-efficiency, and observability (LGTM stack / Datadog)
- Partner with security teams on compliance and vulnerability remediation
- Design and execute Chaos Engineering experiments
- Develop and track SLOs, SLIs, and error budgets for critical systems
- Conduct resilience reviews and game days to validate system behavior under failure
- Ensure Kafka clusters are optimally configured for performance and durability
- Support producers/consumers and troubleshoot event delivery and retention issues
- Monitor and tune partitioning, replication, throughput, and latency
- Respond to production incidents — from user-facing UI errors to backend service disruptions
- Investigate issues across infrastructure, Kubernetes, logs, traces, and service code
- Resolve incidents and support root-cause analysis (Java and GoLang services)
- Contribute to postmortems and reliability engineering initiatives
Requirements
- 5+ years in an SRE, DevOps, or infrastructure role
- Deep hands-on experience with AWS, EKS/Kubernetes, and Terraform
- Working knowledge of Kafka tuning, monitoring, and operational troubleshooting
- Ability to read code and trace failures in one or more of the following application languages or frameworks:
  - Java
  - GoLang
  - React
  - .NET
  - Python
Benefits
- Enhanced leave - 38 days inclusive of 8 UK Public Holidays
- Private Health Care including family cover
- Life Assurance – 5x salary
- Flexible working: work from home and/or in our London office
- Employee Assistance Program
- Company Pension (Salary Sacrifice options available)
- Access to training and development
- Buy and Sell holiday scheme
- The opportunity for “work from anywhere/global mobility”
Seniority level
Mid-Senior level
Employment type
Full-time
Job function
Engineering and Information Technology
Location: London, England, United Kingdom
Salary: £150,000 - £200,000
Job Type: Full-time
Category: Engineering