Job description
Senior MLOps Engineer (ML Workflows Engineering)
As part of our team, you will:
* Build tools, automation, and workflows to simplify infrastructure-heavy tasks, empowering AI teams to focus on experimentation and solving core challenges.
* Develop robust monitoring, logging, and tracing systems to ensure the performance and reproducibility of ML workflows in production.
* Design, implement, and maintain end-to-end machine learning pipelines to enable the seamless development, training, and deployment of ML models and intelligent agents.
* Work with large-scale distributed systems, including GPU clusters, to support training, fine-tuning, and evaluation of ML models.
* Collaborate with product and development teams to transform high-level goals into concrete, scalable, and maintainable systems.
* Optimize workflows for reproducibility, scalability, and cost-efficiency while keeping ML teams productive and focused on innovation.
Requirements
We’ll be happy to have you on our team if you have:
* Hands-on experience with modern MLOps tooling, including Kubernetes, cloud providers (GCP and AWS), and ML orchestration frameworks.
* A solid understanding of the ML lifecycle, from idea to customer-facing application.
* The ability to own projects end to end, starting from a high-level problem or product pain point and overseeing it through the design, experimentation, implementation, and iteration phases.
* A customer-centric mindset – you care about how ML engineers are actually working and can translate their needs into actionable, scalable, and maintainable architectural decisions.
* Experience with modern CI/CD systems, like GitHub Actions or JetBrains TeamCity.
* At least three years of experience writing clean, maintainable Python code in modern ML codebases.
Our ideal candidate would have experience with:
* ML orchestrators and workflow tools such as ZenML, Dagster, and Airflow.
* Developing infrastructure components and services using cluster solutions like Kubernetes.
* Developing Python-based backend services.
* Creating and maintaining ML pipelines, including legacy ones.
* Experiment tracking and observability using tools like Weights & Biases, MLflow, Langfuse, or similar.
We’d be especially thrilled if you have experience with:
* LLM inference frameworks such as vLLM, DeepSpeed, and TensorRT.
* Writing and maintaining Python libraries used by internal (or external) ML engineers.
* NLP and transformer-based approaches, backed by a strong theoretical background.
* Writing code in Java and/or Kotlin.
We offer