Site Reliability Engineer

Day Rate: £500 - £600

Location: Hybrid, 3-4 days on site per week in Herefordshire (occasional travel to other UK sites)

Contract: 3-month rolling contract; hybrid, 3-4 days on site per week (4 days preferred)

Availability: On-call rota (24/7 when required)

Security Clearance: SC required; MOD DV preferred. Candidates must be eligible for DV clearance.

Start Date: ASAP

Overview

We're looking for a Site Reliability Engineer (SRE) to join our client's growing cross-domain services team, supporting critical systems used by major UK government organisations. You'll play a key role in keeping these platforms highly available, performant, and cost-efficient.

You’ll collaborate closely with software development, support, and operations teams to improve cloud and on-prem infrastructure, optimise CI/CD pipelines, enhance system observability, and proactively manage reliability risks across complex environments.

Key Responsibilities

Partner with Software Engineers to enhance system reliability, scalability, and performance.
Collaborate with System Administrators to automate repetitive tasks and streamline alerts.
Advance monitoring and observability practices to identify and resolve issues before they affect users.
Support development and testing environments to help meet delivery and quality objectives.
Research, evaluate, and recommend tools and technologies to improve operational efficiency.
Develop a deep understanding of the technical ecosystem, contributing to both cloud and on-prem solutions.

Essential Skills & Experience

Strong background with configuration management tools (e.g. Ansible, Chef, Puppet).
Hands‑on experience with Terraform for infrastructure as code.
Expertise with containerisation and orchestration (Docker, Kubernetes, OpenShift, or Swarm).
Skilled in CI/CD pipeline tools (e.g. Jenkins, GitLab CI).
Proficient with monitoring and observability tools (Grafana, Prometheus, InfluxDB).
Experience integrating event‑driven systems using MQ solutions (RabbitMQ or similar).
Strong knowledge of SQL and relational databases.
Advanced Linux administration and shell scripting skills.
Familiarity with network security protocols.
Experience deploying and maintaining systems on AWS (EC2, RDS, S3, Lambda).

Desirable Skills

Programming experience in Java, Go, or Python.
Understanding of cross‑domain technologies and security models.
Background in service management environments and ITIL practices.
Proven application of observability patterns and system health metrics.
Experience with Microsoft Azure cloud services.

For more information, send your CV to Ryan at rmitchell@itecopeople.co.uk

Location: England, United Kingdom
Salary: £100,000 - £125,000
Job Type: Full-time
Category: Engineering