The Boots Cloud Centre of Excellence team is seeking an experienced Cloud Site Reliability Engineering (SRE) Lead to join our growing International IT and Advanced Analytics (IT2A) function. The role can be based at our London, Weybridge or Nottingham offices, with a flexible office/home hybrid working pattern.
We are looking for people who want to challenge the ‘norm’. We want to change how we use the Cloud and help our companies, who have some awesome ideas on how we help our patients and customers. The Cloud Centre of Excellence is pivotal in helping our colleagues across IT and the wider business make the most of the Cloud.
So, if you’re a great individual with the desire to make things happen and want to join a team of equally great people, please get in touch.
About the role:
As our new Cloud Site Reliability Engineering (SRE) Lead, you will lead a team that engineers and enables enterprise Cloud DevOps platforms, helping teams increase productivity and reduce the time to deliver solutions.
You will design and implement the practices and tooling for development collaboration, code, infrastructure, source control, security, compliance, continuous integration, testing, delivery, monitoring, and feedback.
- Operate, monitor, and maintain high availability of WBA applications running in the Azure Cloud environment
- Triage major Cloud platform incidents and propose stabilization approaches, based on problem management practice, to prevent future failures.
- Lead the design, writing and delivery of software and automation to dramatically improve the availability, scalability, latency, and efficiency of cloud services
- Continually improve cloud operations automation and tooling to monitor and maintain enterprise cloud-based applications
- Troubleshoot infrastructure and application issues, and work with the relevant teams (Public cloud vendors, application teams, Cloud Engineering) to resolve issues
- Identify and address possible points of failure in the infrastructure and applications by working with engineering, business and operations technical teams to establish business and technical monitoring strategies.
- Facilitate blame-free root cause analysis meetings after a production system incident so that the team can learn from mistakes and improve our systems and runbooks
What you’ll need to have:
- Knowledge of Azure technical capabilities, resource provisioning and resource capacity management.
- Experience in software engineering, the site reliability engineering (SRE) model and DevOps paradigms
- Knowledge of Azure security, including an understanding of how security works, NSGs, roles and data encryption
- Experience with DevOps tools, Jira, Remedy, and MS Azure-based monitoring tools.
- Experience with performance testing tools and SRE best practices
- Preferably, experience with Azure Boards and ServiceNow (we’re moving to these platforms in lieu of Jira and Remedy)
- Experience with the MS Azure tech stack, e.g. ADO, Event Hub, Event Grid, Xplat CLI, kubectl, Power BI, Log Analytics, Kusto, AKS, Cosmos DB, APIM
- Experience with monitoring tools, e.g. Dynatrace, App Insights, Tivoli
- Experience with Azure DevOps and CI/CD pipeline tools
- Experience with system high-availability engineering; Disaster Recovery experience will be a plus
- Candidates with experience of other cloud technology platforms (e.g. AWS, Google Cloud) are welcome to apply
- Proficiency in statistical analysis and machine learning tools used for automation of tickets.
It would be great if you also had:
- 5 years’ experience leading a Cloud SRE team or 3 years as an SRE engineer
About our team – Cloud Centre of Excellence
WBA is the world’s leading pharmacy-led health and beauty retailer. With over 2,500 stores in the UK, approaching 10,000 stores in the US and thousands more internationally, our purpose is to help our customers look and feel better than they ever thought possible.
WBA is continuing the journey we started a couple of years ago with the Cloud. We are now embarking on establishing our International (Boots UK, No 7 Beauty Company, Boots Opticians and International Retail) Cloud Centre of Excellence which will support and develop the services we look after today.
You will have the opportunity to help set up the Cloud Centre of Excellence from the start, so you can influence how we set ourselves up for success. This is an exciting time for us as we look to build a super team of brilliant people who can help our companies maximise the opportunities the Cloud brings to organisations. The work ranges from setting up the FinOps function, through working in Architecture to develop our strategies and developing our Engineering capability around automation and AI, all the way to supporting the services we have and making these even better.
Generous staff discount including enhanced discounts on Boots brands and Boots Opticians
Excellent onsite facilities including staff shop, opticians (including free eye tests for team members), gym, cafeteria, outdoor seating spaces and dry cleaning service
Travel links including on-campus bus stops, parking, and nearby train and tram links
We have a great range of benefits in addition to the above that go beyond salary and offer flexibility to suit you. Click here to view our full list of company benefits (all rewards and benefits are subject to change and eligibility).
If your application is successful, our recruitment team will be in touch to arrange an interview and answer any initial questions you have. If you have not been successful on this occasion you will be notified by email.
We are always open to discussing possible flexible working options and what this may look like for you, including job share and reduced hours. If you require additional support as part of the application and interview process, we are happy to provide reasonable adjustments to enable you to be at your best.