Job Description
We are on the lookout for an Engineer II, Golang (SRE) to join the Tech Foundations vertical on our journey to always deliver amazing experiences.
As a member of the Tech Foundations team, you’ll support Delivery Hero’s rapid innovation and growth. You’ll work on foundational systems that enable faster feature development, security, and reliability across our global engineering community. Every enhancement you make will contribute to our teams' ability to build, scale, and deliver quality features, ultimately impacting millions of users worldwide.
You will join our Site Reliability Engineering (SRE) organization and work on Incident Detection and Response. You will help ensure the reliability and stability of our systems by building innovative tools and platforms that empower our engineering organization to manage critical operational events effectively and prevent alert fatigue. Your work will directly contribute to a healthier on-call experience, faster resolution times for critical issues, and enhanced overall system stability across Delivery Hero.
In this team you will:
1. Design and develop robust, scalable, and high-performance software solutions for critical SRE tooling using Go (Golang).
2. Collaborate with product managers, designers, and other engineers to understand requirements, define technical specifications, and deliver impactful features.
3. Maintain coding standards and actively participate in code reviews, ensuring a high-quality, maintainable, and efficient codebase.
4. Implement automated testing frameworks to proactively catch bugs and ensure software reliability.
5. Identify, troubleshoot, and resolve complex technical issues in both software and infrastructure, continuously seeking opportunities to optimize performance and scalability.
6. Work closely with engineering teams across Delivery Hero to ensure alignment on project goals, timelines, and implementation strategies.
Qualifications
7. Proven hands-on experience working on high-scale projects.
8. Strong proficiency in Go (Golang), with the ability to write clean, efficient, and maintainable code.
9. Ability to understand and reason about large existing projects, plus a working knowledge of Python.
10. Experience with cloud services (AWS, GCP) and an understanding of how to leverage them for scalable and resilient solutions.
11. Familiarity with containerization technologies (Docker) and orchestration systems (Kubernetes) for deploying and managing applications.
12. Strong communication skills with the ability to work effectively in a team environment and collaborate with stakeholders across different disciplines.
Nice to Have
13. Proficiency in using observability tools such as Grafana and Prometheus to monitor application performance and troubleshoot issues efficiently.
14. Experience contributing to SRE-focused tooling or platforms.
15. Understanding of incident management best practices or related processes.
16. Previous experience in on-call rotations.