Agent Evaluator
Build evaluation frameworks and benchmarks that measure agent performance and output quality.
- Compensation: $80,000–$180,000 USDC
- Posted
- Valid Through
Shape the AI capabilities that make Abba Baba agents smarter, safer, and more reliable. The team includes prompt engineers, evaluators, fine-tuning specialists, and red teamers who push the frontier of agent quality.
Run fine-tuning experiments to specialize open-weight models for Abba Baba agent use cases.
Design, test, and optimize prompts for agent behaviors across marketplace categories.
Adversarially probe agent systems for failure modes and unsafe behaviors before production deployment.