Staff/Senior AWS Cloud Platform Engineer (d/f/m)

Berlin

emnify

Engineer

Posted online since: 14 April

Description

Your Role:

Are you passionate about observability and resiliency? Is ensuring we know about issues before our customers do second nature to you? Does being at the front and orchestrating processes sound fun to you? emnify is seeking a talented Reliability Engineer & Incident Management Operator to drive the company's incident management routines, be the authority on everything observability and resiliency, and guide internal stakeholders with best practices.

As part of the larger Engineering department, our Platform team plays a crucial role in enhancing our competitive edge by improving the developer experience to increase development efficiency and scale productivity. You will join a team of 3 engineers, fostering empathy and a collaborative mindset to ensure continuous improvement of the development experience at emnify. The ideal candidate will have extensive experience with AWS cloud infrastructure, microservices, and modern observability practices, as well as strong communication and organizational skills.

The position is 35% incident management operations, 35% observability and monitoring work, and 30% platform engineering and developer support.

The position is based in emnify’s office in Berlin.

Your Impact:

* Incident management operations:

Lead and optimize the incident management process end-to-end, ensuring timely detection, resolution, and documentation of incidents. This includes coordinating cross-functional teams, conducting post-mortems and root cause analyses, and driving continuous improvements to workflows.

* Observability and monitoring:

Design, implement, and continuously improve observability frameworks by developing dashboards, alerts, metrics, and logging strategies to monitor service health, detect anomalies proactively, support issue resolution, and ensure cost-optimized performance across the platform.

* Collaboration and Support:

Partner with cross-functional teams to implement observability best practices, providing training and guidance on tools while leveraging metrics data to drive engineering priorities.

* Platform engineering:

Leverage AWS to design, build, and maintain a resilient cloud infrastructure, implementing best practices for security, scalability, and cost optimization while ensuring high availability, disaster recovery, and robust platform components such as pipelines, shared infrastructure, and application services.

Your Skills:

• Proven experience as a (Site) Reliability Engineer or similar role in a SaaS and/or telecom company.

• Hands-on experience with observability tools (e.g., Prometheus, Mimir, Grafana, Loki, CloudWatch, Grafana IRM, Rootly), including setup and optimization of metrics and alerts.

• Experience in establishing and managing incident management processes.

• Understanding of incident management frameworks and best practices.

• Extensive experience with AWS cloud services (e.g., EC2, S3, RDS, Lambda, CloudWatch).

• Expert skills with modern infrastructure tooling and principles (Kubernetes, IaC with Terraform, CI/CD with GitHub Actions and Jenkins).

• Good understanding of modern development tooling and principles (e.g., microservices architecture, 12-factor applications, Docker).

• Advanced documentation skills for effective knowledge sharing and collaboration.

• Exceptional problem-solving and critical thinking with a passion for enhancing development experiences in fast-paced tech environments.

• Ability to work independently and as part of a team.

Nice to have:

• Knowledge of networking protocols and telecom systems.

• Knowledge of secure software development.

• Familiarity with programming languages such as Python, Go, or Java.

• AWS certification (e.g., AWS Certified DevOps Engineer, AWS Certified Solutions Architect).

About emnify

emnify is a global IoT connectivity platform provider, enabling enterprises and OEMs to build, deploy, and operate connected products at scale across 180+ countries. Our cloud-native SuperNetwork powers mission-critical use cases in industries such as mobility, logistics, energy, and industrial IoT.
We are a Series-B scale-up with a strong growth trajectory and a clear focus on upmarket and enterprise customers. emnify’s technology and innovation have been recognized repeatedly by the industry, including:

* eSIM Provider of the Year at the MVNOs World Awards 2025
* Platinum Award for eSIM Management Platform Innovation by Juniper Research (2025)
* IoT Innovator Award by Compass Intelligence for aviation connectivity
* Deloitte Technology Fast 50 Germany, recognizing emnify as one of the country’s fastest-growing tech companies

As we move into the next phase of growth, AI plays a central role in our strategy, both in how we build our platform and in how our customers interact with it.
