About your job
* As a Site Reliability Engineer, you apply software engineering principles to run large-scale systems.
* You are responsible for the reliability and performance of large-scale, distributed systems running in the cloud. You set up the test and production environments together with our engineering teams using infrastructure as code. You automate manual tasks and take full ownership of monitoring your clusters.
* You can handle administrative tasks, but you automate any that recur.
* You’re interested in automation, keeping things simple, and helping engineering teams continuously ship great software.
* The job profile is new and a great opportunity for system administrators or software engineers interested in running large cloud systems.
* This job requires software engineering, programming and systems administration skills.
* We have a flexible hybrid work model in which most people work two to three days per week from home.
About you
* Bachelor’s degree in Computer Science or related fields like Mathematics, Physics or Electrical Engineering
* Experience with Docker and Kubernetes
* Experience in UNIX systems administration
* Programming experience – ideally in Go
Bonus skills and interests:
* Master’s degree or PhD in Computer Science or related fields like Mathematics or Physics
* Experience with Google Cloud Platform (GCP), Amazon Web Services (AWS) or Microsoft Azure
* Experience with observability (logging, monitoring, alerting, tracing)
* Experience with networking
* Experience with security
* Experience working with distributed systems in the cloud (Terraform)
* Experience in setting up CI/CD pipelines