Sr. Network Site Reliability Engineer (SRE)

  • Technopride Ltd
  • City, London
  • Dec 16, 2025
  • Full time
  • I.T. & Communications

Job Description

Overview

We are seeking a highly experienced Senior Network SRE with deep expertise across multi-vendor network infrastructure, automation, and reliability engineering. The ideal candidate will possess strong technical leadership, hands-on engineering capabilities, and a passion for building resilient, scalable, and observable network environments.

Key Responsibilities
  • Design, implement, and maintain highly available network solutions across routing, switching, firewalling, and wireless technologies.
  • Apply SRE principles to improve network reliability, scalability, and performance.
  • Develop and maintain automation workflows using Ansible, Salt, and related frameworks to reduce operational toil.
  • Build and operate monitoring, alerting, and observability dashboards using tools such as Grafana and Splunk.
  • Proactively identify network bottlenecks, performance issues, and reliability risks, implementing long-term fixes rather than reactive solutions.
  • Support incident response, root cause analysis, and post-incident reviews with a focus on continuous improvement.
  • Collaborate with cross-functional engineering, security, and operations teams to ensure network solutions meet business and technical requirements.
  • Contribute to documentation, runbooks, design artifacts, and operational standards.
  • Participate in capacity planning, network modernization initiatives, and automation-first strategies.
Required Skills & Experience
  • 10+ years of hands-on experience in enterprise or service provider network engineering.
  • Expertise in multi-vendor routing, switching, firewalling, and wireless technologies.
  • Deep understanding of network protocols (BGP, OSPF, EIGRP, STP, VXLAN, VPNs, QoS, MPLS, etc.).
  • Strong experience with infrastructure automation using Ansible and Salt.
  • Proficiency with observability tooling such as Grafana, Splunk, or equivalent.
  • Solid understanding of SRE practices including SLIs, SLOs, error budgets, and proactive reliability engineering.
  • Strong troubleshooting, analytical, and performance optimization skills.
  • Excellent communication and collaboration skills, with the ability to influence and guide technical stakeholders.
Nice to Have
  • Experience with network programmability (Python, API-driven networking, NETCONF/RESTCONF).
  • Exposure to cloud networking (AWS, Azure, GCP).
  • Knowledge of zero-trust, SD-WAN, and network security best practices.
  • Experience creating self-healing or fully automated network workflows.