Job Description
We are on the lookout for a hands-on Senior Data Scientist, Generative AI & Agents, to join the Vendor Data Team on our mission to always deliver amazing experiences. The role goes beyond prompt engineering into autonomous orchestration: designing agents that generate their own prompts, tools that empower AI with real-world actions, and judge models that validate outputs. Your work won't sit in research notebooks; it will ship.
The Agentic AI team is building the next generation of AI-native products — intelligent systems that reason, act, and adapt. We combine the power of large language models (LLMs), autonomous agents, and retrieval-augmented generation (RAG) to move beyond static prompts and chatbots into dynamic AI systems that solve real problems at scale. We don’t treat LLMs as tools to micromanage — we treat them as collaborators. Our products are designed to build and evolve themselves, leveraging AI to orchestrate workflows, make decisions, and continuously improve. If you’re ready to stop building brittle bots and start building AI that builds AI — this is your playground.
Responsibilities
1. Architect and deploy agent-based systems using LangChain, CrewAI, AutoGen, or custom frameworks
2. Build and optimize RAG pipelines, from smart chunking to embeddings and vector search
3. Fine-tune and evaluate LLMs for reliability, relevance, and factuality
4. Design evolving prompt and memory systems driven by LLMs and user feedback
5. Implement simulation and feedback loops using judge models or agent-testing harnesses
6. Partner cross-functionally with product, research, and engineering teams
7. Stay current with the latest research in LLMs, generative modeling, and agent systems
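The simulation-and-feedback-loop responsibility above can be sketched as a minimal judge loop. This is a toy sketch, not our production system: `generate_draft` and `judge_score` are hypothetical stand-ins for an LLM call and a judge-model call.

```python
# Hypothetical stand-ins for real model calls: generate_draft would call an
# LLM, and judge_score would call a separate "judge" model that rates outputs.
def generate_draft(prompt: str, feedback: list[str]) -> str:
    # Toy generator: folds accumulated feedback in, so each retry differs.
    return prompt + (" | revised: " + "; ".join(feedback) if feedback else "")

def judge_score(draft: str) -> float:
    # Toy judge: rewards drafts that incorporated feedback.
    return 0.9 if "revised" in draft else 0.4

def generate_with_judge(prompt: str, threshold: float = 0.8, max_rounds: int = 3) -> str:
    """Retry generation until the judge scores the draft above threshold."""
    feedback: list[str] = []
    draft = generate_draft(prompt, feedback)
    for _ in range(max_rounds):
        if judge_score(draft) >= threshold:
            return draft
        feedback.append("be more specific")  # in practice, feedback comes from the judge
        draft = generate_draft(prompt, feedback)
    return draft  # best effort after max_rounds
```

The same loop generalizes to agent-testing harnesses by swapping the judge for a battery of simulated users or task-completion checks.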
Qualifications
1. Machine learning fundamentals (supervised, unsupervised, and reinforcement learning)
2. Data wrangling and preprocessing (Pandas, NumPy, etc.)
3. LLM and NLP expertise (Transformers, BERT, GPT, etc.)
4. Prompt engineering and fine-tuning
5. Knowledge of tokenization, embeddings, and attention mechanisms
6. Using and fine-tuning Hugging Face models
7. Evaluation metrics for generative models (BLEU, ROUGE, faithfulness, hallucination detection)
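As a flavor of the evaluation work: ROUGE-1 recall, one of the metrics listed above, fits in a few lines of plain Python (libraries like Hugging Face `evaluate` provide production implementations).

```python
from collections import Counter

def rouge1_recall(reference: str, candidate: str) -> float:
    """ROUGE-1 recall: fraction of reference unigrams recovered by the candidate.

    Uses clipped overlap, as in the standard definition: a candidate token
    only counts as many times as it appears in the reference.
    """
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    overlap = sum(min(count, cand_counts[token]) for token, count in ref_counts.items())
    return overlap / max(sum(ref_counts.values()), 1)

# 3 of the 4 reference tokens appear in the candidate -> 0.75
score = rouge1_recall("the cat sat down", "the cat sat quietly")
```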
Nice to have:
1. Open-source model experience (LLaMA3, Mistral, Mixtral) or custom inference stacks
2. Judge models, feedback loops, or synthetic user-testing architectures
3. Familiarity with prompt-injection defenses, output validation, or agent safety
4. Experience deploying generative AI in web or mobile products
5. Academic or industry background in statistics, cognitive science, or human-AI interaction
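To illustrate the output-validation item above: a minimal sketch, assuming a hypothetical JSON tool-call schema, of validating a model's output against a strict allow-list before an agent acts on it. Rejecting anything outside the allow-list is one simple defense against prompt injection steering an agent toward unintended actions.

```python
import json

REQUIRED_KEYS = {"action", "arguments"}   # hypothetical tool-call schema
ALLOWED_ACTIONS = {"search", "summarize"}  # allow-list: reject unknown actions

def validate_tool_call(raw_output: str) -> dict:
    """Validate an LLM's JSON tool call before executing it."""
    try:
        parsed = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        raise ValueError(f"output is not valid JSON: {exc}") from exc
    if not isinstance(parsed, dict) or not REQUIRED_KEYS <= parsed.keys():
        raise ValueError(f"missing required keys {REQUIRED_KEYS}")
    if parsed["action"] not in ALLOWED_ACTIONS:
        raise ValueError(f"action {parsed['action']!r} is not allowed")
    return parsed
```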