Senior AI Platform Engineer

Berlin
eMFusion Global
Engineer
Listing online since: 17 March
Description

About the Role

We are working with a leading international consultancy that is building scalable, production-grade AI SaaS products within its dedicated AI Lab. This is a greenfield opportunity: you will combine deep technical expertise with strategic vision to design and build AI-powered platforms that transform enterprise clients' business models.

The AI Lab is developing cutting-edge, large-scale AI products delivering sustained commercial impact. The team operates with a startup mindset: agile, flat hierarchies, and a genuine bias for experimentation and ownership.

The Opportunity

This is a rare full-stack platform engineering role that spans infrastructure architecture through to LLM operationalisation. You will own the platform layer end-to-end, from Kubernetes cluster operations and IaC through to model serving, RAG pipelines, and LLMOps.

Key themes of the role:

  • Design and evolve a multi-tenant SaaS architecture with tenant isolation, per-tenant controls, and enterprise security
  • Build automated tenant provisioning, safe rollouts (canary/feature flags), and noisy-neighbor protection
  • Operationalise LLMs end-to-end: fine-tuning, evaluation, high-performance serving, monitoring, and embeddings workflows
  • Drive MLOps foundations: automated training pipelines, experiment tracking, and scalable model deployment
  • Manage Kubernetes clusters, GPU-heavy workloads, and autoscaling on AWS
  • Build unified CI/CD pipelines shipping ML and application code seamlessly
  • Implement comprehensive observability: logs, metrics, traces, model/data drift detection
  • Embed enterprise security and compliance (IAM, RBAC, VPC design, secrets management, encryption) at every layer
  • Design well-architected ETL/ELT pipelines, streaming systems, feature store integration, and workflow orchestration

Technical Requirements

Platform & Multi-Tenancy

  • Proven patterns for tenant isolation (DB-per-tenant, schema-per-tenant, row-level security), tenant-aware caching, noisy-neighbor protection
  • OIDC/OAuth2, tenant-aware RBAC/ABAC, SCIM provisioning, and audit logging for B2B SaaS

Kubernetes & Infrastructure

  • Deep Kubernetes: cluster ops, HPA/VPA, node pools, GPU scheduling, Karpenter, PDBs, network policies, multi-AZ design
  • Service mesh (Istio/Linkerd), ingress patterns (ALB/Nginx), secure egress, mTLS
  • Infrastructure as Code beyond basics: Terraform modules, Terragrunt, policy-as-code (OPA/Conftest), secrets automation
  • GitOps (ArgoCD/Flux), progressive delivery (Argo Rollouts/Flagger), feature flags, canary and blue/green deployments

MLOps & Model Lifecycle

  • Model lifecycle tooling: MLflow/W&B, model registry, experiment tracking, reproducible training, dataset versioning (DVC/lakeFS)
  • Pipeline orchestration: Airflow, Prefect, or Dagster; artifact stores
  • Model serving: KServe, Seldon, BentoML, or Ray Serve, covering online and async/batch inference, autoscaling, rollback

LLMOps

  • Prompt and version management, offline and online evaluation harnesses, RAG evaluation (retrieval metrics, groundedness), guardrails, red-teaming basics
  • Streaming inference (SSE/WebSockets), caching, routing, fallback models
  • Vector DB experience: pgvector, Pinecone, Weaviate, or Milvus, including embedding lifecycle, backfills, re-embedding, indexing strategies

Observability & Security

  • OpenTelemetry, tracing, SLOs; Prometheus/Grafana, Loki/ELK, Datadog/New Relic
  • Incident management: postmortems, runbooks, error budgets
  • GDPR, encryption at rest/in transit, secrets management (AWS Secrets Manager/Vault), KMS, key rotation
  • SOC 2 / ISO 27001 familiarity, vulnerability scanning (Trivy/Grype), SBOMs, SAST/DAST

About You

  • You have shipped and operated customer-facing SaaS products at scale with real users
  • You have owned end-to-end ML/AI infrastructure, from data ingestion through to production monitoring
  • You enable engineers and data scientists to move faster through self-service platforms and automated workflows
  • You have a track record of designing systems that scale globally across regions and traffic patterns
  • You are comfortable with incident response, on-call rotations, and stabilising critical production systems
  • You think with a product mindset: customer value, reliability, and speed-to-market over technology for its own sake
  • You have a strong bias for automation and eliminating manual operational toil
  • You have excellent communication skills: async collaboration, documentation, and explaining technical decisions to non-technical audiences

What's on Offer

  • Genuine greenfield platform engineering ownership: build it from scratch
  • Startup atmosphere with flat hierarchies within a globally established firm
  • Hybrid working, international mobility across a wide office network
  • Extensive learning and development programmes
  • Competitive package including bonus

© 2026 Jobijoba - All rights reserved
