Lead Site Reliability Engineer
14 Days Old
Location: London, UK / Hybrid / Remote
Sector: Media & Streaming Technology
A leading TV streaming platform is expanding its engineering team to deliver high-performance, low-latency streaming to millions of viewers worldwide. We’re looking for a Lead Site Reliability Engineer (SRE) to drive reliability, observability, and scalability across our streaming services while mentoring a team of SREs.
What you’ll do
- Lead end-to-end reliability strategy for video streaming pipelines, playback services, and backend systems.
- Build and maintain observability frameworks (Prometheus, Grafana, Datadog, OpenTelemetry) to monitor streaming quality, latency, and uptime.
- Scale cloud-native infrastructure (AWS/GCP/Azure) and orchestrate containerised applications (Kubernetes, Docker) for global distribution.
- Guide incident management, disaster recovery, and post-mortems across multi-region streaming environments.
- Mentor junior SREs and collaborate with engineering teams to embed reliability by design into all development efforts.
What we’re looking for
- Proven experience in high-scale distributed systems, preferably in streaming, media delivery, or content platforms.
- Deep expertise with observability, monitoring, and incident response at global scale.
- Strong cloud skills (AWS, GCP, Azure) and Infrastructure as Code (Terraform, Ansible, CI/CD pipelines).
- Proficiency in Python, Go, Java, or Bash for automation and tooling.
- Leadership experience managing or mentoring an SRE or reliability engineering team.
This role offers the opportunity to shape the reliability and performance of a platform watched by millions, balancing real-time user experience with operational excellence.
Compensation: Great package (Base + Bonus)
Details
- Seniority level: Mid-Senior level
- Employment type: Full-time
- Job function: Engineering and Information Technology
Venquis is acting as an Employment Agency in relation to this vacancy.
- Location: London, England, United Kingdom
- Salary: £150,000 - £200,000
- Job type: Full-time
- Category: IT & Technology