Machine Learning Ops Engineer
Overview
Senior Machine Learning Operations Engineer, London (2 days a week onsite). £500 per day (Outside IR35), 6-month contract.
Responsibilities
- Evolve and scale the machine learning platform to support high-throughput model inference and fast iteration cycles.
- Work closely with ML engineers and product teams to align infrastructure with evolving project needs, research and implement cutting-edge MLOps practices, and mentor colleagues by sharing expertise in cloud operations and ML engineering best practices.
- Build and manage GPU-powered Kubernetes clusters from scratch, configuring them manually with tools like kubeadm and deploying applications with Helm. Improve automation pipelines and ensure system reliability.
Qualifications
- MLOps & Kubernetes: Experience managing GPU-enabled clusters built from scratch with kubeadm and Helm.
- Programming: Python or Go for ML automation workflows.
- Containerization: Docker and containerized application deployment.
- Cloud: AWS experience supporting ML workloads.
- CI/CD & Automation: ArgoCD, GitHub Actions, Infrastructure-as-Code (Terraform).
- Monitoring & Observability: Prometheus, Grafana, cloud-native stacks.
- ML Lifecycle: Production experience with experimentation, training, deployment, versioning, and monitoring.
- Reliability & Support: On-call participation, incident response, and system optimization.
Details
- Location: London (2x a week onsite)
- Day rate: £500 p/d (Outside IR35)
- Duration: 6-month contract
- Listed location: Little London, England, United Kingdom
- Salary: £125,000 - £150,000
- Job Type: Full-time
- Category: Engineering