Site Reliability Engineering Lead

**Programming Knowledge:** Java, .NET/C#, SQL, React (for integration with supported products).

**Specialized Knowledge:** Databricks, FinOps cost management, disaster recovery planning.

**Core Competencies:** Incident management, troubleshooting, IT service management frameworks, and GitOps/DevOps practices.

* Solid understanding of Site Reliability Engineering (SRE) principles and practices.
* Strong understanding of incident management, monitoring tools, IT service management frameworks, and automation processes.
* Previous experience in customer-facing roles or managing customer support escalations.
* Excellent technical problem-solving and troubleshooting abilities.
* Strong communication and interpersonal skills, with the ability to collaborate across teams.
* Leadership skills with a track record of mentoring and guiding technical teams.
* Strong collaboration and advanced communication skills at peer and senior management level.
* Strong skills in setting, communicating, implementing, and achieving business objectives and goals through indirect leadership of and collaboration with others.
* Strong organization/project planning, time management, and change management skills across multiple functional groups and departments, and strong delegation skills involving prioritizing and reprioritizing projects and managing projects of varying size and complexity.
* Advanced problem-solving experience involving leading teams in identifying, researching, and coordinating the resources necessary to effectively troubleshoot/diagnose complex project issues; prior success extracting/translating findings into alternatives/solutions; and identifying risks/impacts and schedule adjustments to facilitate management decision-making.
* Ability to manage multiple priorities and work effectively in a fast-paced environment.
* Passion for continuous learning and staying up to date with industry trends and best practices.

**Responsibilities:**

The SRE and Platform/Cloud Engineering Lead will be accountable for the following areas:

* Support product development teams with infrastructure, non-functional requirements, and environment stability.
* Manage Kubernetes deployments, Databricks environments, and other critical platforms.
* Collaborate with cross-functional teams to deliver secure, reliable, and cost-effective platform and cloud solutions.
* Ensure all systems comply with security patching and vulnerability management tools.
* In collaboration with architects, provide support for FinOps practices to monitor, optimize, and control cloud costs.
* Provide clear direction, performance evaluations, and career growth for team members.
* Ensure proper documentation, reporting, and compliance with security and regulatory standards.
* Promote continuous learning, knowledge sharing, and operational excellence.
* Write and review documentation for the management, improvement, and support of platforms/assets.
* Complete complex bug fixes and root-cause investigations.
* Work closely with development and platform teams to understand requirements and translate them into high-quality solutions.
* Implement infrastructure management and deployment best practices, including code/solution reviews.
* Operate in various development environments (Agile, Waterfall, etc.) while collaborating with key stakeholders.
Location: City of London
Job Type: Full-time
