Expert Professors - Coding & STEM
Remote | Contractor | $70-$95 per hour (USD)
Join a leading AI lab's GenAI team to contribute to frontier-model evaluation. The position focuses on designing benchmark tasks for coding and agentic workflows, with responsibility for creating challenges that expose reasoning gaps in advanced language models.
This is a W2 employment position with Cincinnatus LLC.
Location: Remote (US-based)
Compensation: $70-$95/hour
Employment Type: Part-time
Hours Required: Minimum 30 hours weekly on weekdays (6+ hours daily)
Core Responsibilities:
- Task Design and Development: Create challenging, real-world domain-specific problems targeting capability failures in frontier AI models.
- Specification & Solution Development: Integrate the problems into an agentic development environment, preparing all necessary components in Python, including detailed instructions and working solutions.
- Performance Evaluation: Assess model performance across tasks and identify logical reasoning failures.
- Analysis: Review the agent's step-by-step record (the agent trajectory) to identify recurring patterns of capability failure.
Required Qualifications:
- Current or retired STEM professor (ML, coding, data science fields)
- Degree in computer science, data science, or related STEM discipline
- Reliable weekday availability (30+ hours weekly)
- Ability to work independently and manage time effectively
- Strong communication and problem-solving abilities
Preferred Experience:
- AI training background
- Model evaluation expertise
- Data annotation experience