Are you an expert at navigating the complex architecture of Large Language Models? Welo Data is seeking a highly technical Senior Prompt Engineer based in Germany to lead the end-to-end migration of template workflows into high-performance LLM autoraters.
This is a specialized role for a technical architect who understands that "perfecting a prompt" is a rigorous engineering discipline. You will leverage advanced APG/APO tools and manual refinement to ensure our automated systems meet, and exceed, human accuracy baselines in both German and English contexts.
The Mission: Automated Quality at Scale
* Architectural Migration: Take full ownership of the end-to-end technical migration of templates to LLM autoraters.
* Optimization Leadership: Utilize Automatic Prompt Generation (APG) and supervise Automated Prompt Optimization (APO) tools to push model performance past plateaus and logic deadlocks.
* Metrics-Driven Excellence: Continuously measure quality against "gold data" baselines, tracking precision, recall, and F1 scores to justify launch readiness.
* Edge-Case Engineering: Manually draft and refine complex prompts to overcome anti-patterns and architecture gaps that automated tools cannot solve.
Project Details
* Schedule: Part-Time (Set your own hours within project milestones).
* Location: 100% Remote (must be currently based in Germany).
* Language: Native fluency in German and professional fluency in English.
* Employment Type: Freelance / Independent Contractor.
Candidate Profile
* Educational Foundation: Bachelor’s, Master’s, or PhD in Computer Science, Data Science, Computational Linguistics, or a related analytical field.
* Prompt Engineering Mastery: 4+ years of experience tuning LLMs for strict, structured outputs, complex classification, and few-shot learning.
* Analytical Power: High proficiency in identifying error patterns and using SQL or data analytics tools to monitor performance.
* Technical Agility: Fast learner capable of mastering proprietary internal tools and "Goose API"-style interfaces with minimal oversight.
Preferred Technical Skills
* Familiarity with shadowbot monitoring and disagreement tracking between human and LLM ratings.
* Hands-on experience with Chain-of-Thought (CoT) prompting and APO systems.
* Deep linguistic expertise, including a strong understanding of semantics and formal logic.
* Proven ability to draft high-level Launch Certification Documentation.