About Me
With over a decade in the technology sector, I've built a solid career in DevOps, Site Reliability Engineering (SRE), Platform Engineering, and Production Operations. I focus on leading engineering teams and managing complex projects, including major cloud migrations.
My work involves designing and improving digital infrastructure for resilience. I believe that effective technology leadership means creating environments where teams can innovate, supported by stable systems and easy-to-understand processes. It's about setting up the right foundation so we can build and iterate quickly.
As a technologist, I have a particular interest in Artificial Intelligence (AI). I'm convinced that, when used thoughtfully, AI can be a powerful tool for good, simplifying complex tasks and making technology more intuitive and helpful in our daily lives.
Core Expertise
Experience
Sr Manager, Software Development Engineering - Platform Engineering
May 2024 - Present, Workday
Leading platform engineering initiatives for 18 service teams, across 3 timezones, while balancing company-wide strategic initiatives and customer experience improvements.
- Managed the platform engineering backlog for 18 service teams and their applications
- Built global relationships across service teams and business leaders through regular formal and informal touchpoints to anticipate and prepare for emerging requirements
- Prioritised work using impact and customer experience metrics to balance operational maintenance with strategic projects
- Spearheaded an automation initiative that reduced regional build-out time from 3 months to 5 days by automating the creation of 9,000+ lines of code
- Leveraged automation tools to accelerate deployment processes and reduce manual engineering effort
Manager, Software Development Engineering - DevOps/SRE
Jan 2022 - May 2024, Workday
Orchestrated platform migration from managed AWS services to Kubernetes while managing team transformation and maintaining service availability.
- Directed platform migration from managed AWS services to Kubernetes while maintaining service availability
- Oversaw the successful integration of managed services into our Kubernetes environment by leading the team responsible for building custom operators
- Established a dedicated global business-hours support role to handle reactive work, with proper escalation paths for on-call engineers
- Allocated dedicated sprint time for continuous learning and skill development during platform modernization
- Maintained existing pipelines for a smooth transition while managing regional and federal build-out projects
SRE Manager
Apr 2019 - Jan 2022, Hostelworld Group
Championed Site Reliability Engineering initiatives focused on system reliability and incident response, and managed the migration from an on-prem data centre to Google Cloud Platform.
- Managed distributed Site Reliability Engineering team across 3 European locations
- Implemented comprehensive reliability metrics for latency, traffic, and errors with iterative SLO improvements based on system maturity
- Established incident response capabilities through game days and internal training programs
- Facilitated root cause analysis meetings with all stakeholders and operations teams to prevent future failures
SRE Tooling and Automation Team Lead
Sep 2018 - Apr 2019, Hostelworld Group
Built a distributed tooling team across 3 European locations, designing and implementing automation tools and CI/CD pipelines.
- Designed trend-based alerting systems using SLIs to detect reliability issues before impact
- Built CI/CD pipelines with Jenkins, reducing lead time from 4 weeks to 3 days
- Increased deployment frequency from once a month to several times a day
Test Automation Lead
Apr 2017 - Sep 2018, Hostelworld Group
Developed comprehensive test strategies and automated testing frameworks to ensure system reliability and quality.
- Developed test strategies focused on critical system components with 10% test coverage in core functional areas
- Implemented automated pipeline coverage
- Developed the foundational automated testing framework that later evolved into comprehensive reliability practices
Senior Test Automation Engineer
Aug 2015 - Apr 2017, Hostelworld Group
Advanced test automation engineering role focusing on building robust testing infrastructure and processes.
- Completed full regression testing of the B2C and B2B websites
- Built foundational API regression testing framework
Test Automation Engineer
Mar 2014 - Aug 2015, Hostelworld Group
Entry into test automation engineering, building foundational skills in automated testing and quality assurance.
- Built foundational automated Selenium testing framework
- Collaborated with developers to improve test coverage and reliability
Barman/Bar Manager
2005 - 2014
Gained foundational management experience and developed operations skills:
- Coordinated staff scheduling, inventory, and deliveries while maintaining quality standards and minimizing disruptions
- Identified and resolved operational issues quickly during high-pressure periods to ensure consistent performance
- Established procedures, tracked performance metrics, and optimized processes based on operational data
Featured Projects
AI-Powered Infrastructure Optimization
Developed machine learning models to predict infrastructure scaling needs, reducing costs by 30% while maintaining performance SLAs.
Multi-Cloud Platform Migration
Orchestrated the migration of critical services across multiple cloud providers, implementing zero-downtime deployment strategies.
Developer Experience Platform
Built internal platform tools that reduced developer onboarding time from 2 weeks to 2 days.
Thoughts & Insights
Occasionally, I share thoughts on technology leadership, AI ethics, and the future of platform engineering. Here are some topics I'm currently exploring:
AI in Infrastructure Management
Exploring how artificial intelligence can revolutionize infrastructure monitoring and predictive maintenance.
Building Resilient Teams
Strategies for creating engineering teams that thrive under pressure while maintaining work-life balance.
The Future of Platform Engineering
How platform engineering is evolving and what it means for developer productivity and system reliability.
Give me a shout!
Interested in discussing technology, AI, or potential collaborations? I'm always open to meaningful conversations about the future of engineering.