Expert Professors - Coding & STEM

Remote | Contractor | $70-$95 per hour (USD)

Join a leading AI lab's GenAI team to contribute to frontier-model evaluation. The position focuses on designing benchmark tasks for coding and agentic workflows, with responsibility for creating challenges that expose reasoning gaps in advanced language models.

This is a W2 employment position with Cincinnatus LLC.

Location: Remote (US-based)

Compensation: $70-$95/hour

Employment Type: Part-time

Hours Required: Minimum 30 hours weekly on weekdays (6+ hours daily)

Core Responsibilities:

  • Task Design and Development: Create challenging, real-world domain-specific problems targeting capability failures in frontier AI models.
  • Specification & Solution Development: Integrate the problems into an agentic development environment, preparing all necessary components in Python, including detailed instructions and working solutions.
  • Performance Evaluation: Assess model performance across tasks and identify logical reasoning failures.
  • Analysis: Review the agent's step-by-step trajectory (Agent Trajectory) to identify recurring patterns of capability failure.

Required Qualifications:

  • Current or retired STEM professor (ML, coding, data science fields)
  • Degree in computer science, data science, or related STEM discipline
  • Reliable weekday availability (30+ hours weekly)
  • Independent work capability and time management skills
  • Strong communication and problem-solving abilities

Preferred Experience:

  • AI training background
  • Model evaluation expertise
  • Data annotation experience