AI Evaluation Specialist
Job summary
We are looking for a sharp, detail-oriented senior specialist to architect the systems that measure and improve our generative AI models. You will work at the intersection of data science, product, and research to ensure our AI systems are not only accurate but also safe, unbiased, and aligned with human preferences.
Job descriptions & requirements
Responsibilities:
- Design and implement robust automated evaluation frameworks (using Python) to test LLMs on tasks like reasoning, coding, and summarization.
- Lead the development of annotation rubrics and manage workflows for human evaluators to generate high-context preference data and golden datasets.
- Design and execute adversarial testing (red-teaming) to identify vulnerabilities, hallucinations, and biases in model outputs before deployment.
- Develop and calibrate reliable LLM-based evaluators to replace human raters at scale for specific metrics, validating their correlation with human judgment.
- Analyze evaluation results to pinpoint specific model weaknesses (e.g., the model fails at multi-step reasoning in finance contexts) and present actionable insights to modeling and product teams.
- Build and maintain the internal evaluation platform and dashboards to track model performance across different versions and use cases.
Requirements:
- 4+ years of experience in machine learning, data science, or AI evaluation.
- A degree in Computer Science, Information Technology, Data Science, or a related field.
- Proven track record of designing evaluation strategies for NLP or Generative AI products.
- Expert-level proficiency in Python for scripting evaluations and analyzing results (pandas, NumPy).
- Strong ability to query data (SQL) and perform statistical analysis to validate evaluation confidence intervals and inter-annotator agreement.
- Advanced ability to craft prompts for both model testing and steering LLM-based evaluators.
Remuneration: NGN 500,000