Observability Platform Engineer (SRE Focus)
We're building a world-class Observability function, and we're looking for someone who lives for uptime, meaningful alerts, and elegant dashboards. If you've ever been on-call, silenced a noisy monitor, or traced a ghost bug across microservices outside core hours - we want to hear from you!
This isn't a generic "Platform Engineer" role. You'll be laser-focused on observability, reliability, and developer empowerment, working closely with teams to make sure we don't just know when things break - but why.
Requirements
What You'll Be Doing
- Designing and scaling on-call systems that engineers don't dread being part of.
- Building out Datadog monitoring, alerting, dashboards, and log pipelines for our Kubernetes-based environments.
- Defining and managing SLOs, SLIs, and error budgets - and helping teams stick to them.
- Creating scorecards and software catalogs so engineers know what's healthy, what's broken, and who owns what.
- Training and enabling dev teams to own their own observability, alerts, and incident response.
- Introducing chaos engineering practices (yes, we want to break things... on purpose).
- Driving a culture of reliability, with incident reviews, shared learnings, and transparency.
What We're Looking For
You:
- Have production experience with observability tools (especially Datadog) in cloud-native environments.
- Have set up monitoring and alerting across Kubernetes services.
- Have built or scaled on-call systems in startups or large-scale environments.
- Know how to reduce alert fatigue and love a good MTTR chart.
- Have experience with infrastructure as code (Terraform preferred).
- Believe that great developer experience includes clear visibility and ownership.
- Are curious about - or already practicing - chaos engineering.
Nice to Have
- Experience with OpenTelemetry, Fluent Bit, or similar.
- Familiarity with service catalog tooling (e.g., Backstage).
- Comfortable running or facilitating game days or failure drills.
- Prior involvement in setting up scorecards for service health.
What This Role Is Not
- This is not a traditional platform or infra role.
- You won't be spending your days tweaking CI/CD pipelines or setting up VPCs.
- We're looking for someone obsessed with how systems behave in production - not just how they're deployed.
Tech Stack
- Cloud: AWS (EKS, Lambda, etc.)
- Observability: Datadog, OpenTelemetry
- Infra as Code: Terraform
- Orchestration: Kubernetes (EKS)
- Logging: Fluent Bit, FireLens
- Catalogs/Scorecards: Backstage (or custom)
If this sounds like your kind of role, we'd love to hear from you.
Drop us a message with your CV and a note about the coolest monitoring setup or incident resolution you've ever worked on.
Benefits
Why join YouLend?
- Award-Winning Workplace: YouLend has been recognised as one of the "Best Places to Work 2024 & 2025" by the Sunday Times for being a supportive, diverse, and rewarding workplace.
- Award-Winning Fintech: YouLend has been recognised as a "Top 250 Fintech Worldwide" company by CNBC.
- Stock Options
- Private Medical insurance via Vitality
- EAP with Health Assured
- Enhanced Maternity and Paternity Leave
- Modern and sophisticated office space in Central London
- Free Gym in office building in Holborn
- Subsidised Lunch via Feedr
- Deliveroo Allowance if working late in the office
- Monthly in-office masseuse
- Team and Company Socials
- Football Power League / Squash Club
- Location: London, England, United Kingdom
- Salary: £125,000 - £150,000
- Job Type: Full-time
- Category: IT & Technology