Software Output Evaluator
// ROLE SUMMARY
You evaluate AI-generated code: each task pairs a programming prompt with one or more candidate solutions for you to assess, rank, and justify.
// DESCRIPTION
You will evaluate code generated by AI models. Each task shows you a programming prompt and one or more candidate solutions. Your job is to assess correctness, efficiency, readability, and adherence to best practices, then rank the solutions and write a brief justification. Languages vary by project but commonly include Python, JavaScript/TypeScript, Java, C++, and SQL. Some tasks also ask you to identify bugs, suggest fixes, or rate the quality of inline comments and documentation.
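To give a concrete feel for the work, here is a minimal sketch of the kind of task you might see. The prompt, function name, and flagged issues are hypothetical illustrations, not drawn from a real batch.

```python
# Hypothetical prompt: "Return the n most frequent words in a text."
from collections import Counter

def top_words(text: str, n: int) -> list[str]:
    # Candidate solution under review. It runs, but an evaluator would
    # flag two correctness gaps in the justification:
    #   1. no lowercasing, so "The" and "the" are counted separately
    #   2. punctuation is not stripped, so "end." and "end" differ
    counts = Counter(text.split())
    return [word for word, _ in counts.most_common(n)]

print(top_words("the cat and the hat and the bat", 2))  # ['the', 'and']
```

A strong evaluation would rank this below a solution that normalizes case and punctuation, and explain why in a sentence or two.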
// SKILLS & REQUIREMENTS
We look for developers who care about code quality: not just whether the code runs, but whether it is maintainable, efficient, and clear. Experience with code review tools (GitHub PRs, Gerrit, Crucible) is a plus. Strong knowledge of software testing principles also helps, since some evaluations ask you to reason about edge cases and test coverage, as in the sketch below.
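As a hedged sketch of that edge-case reasoning (the function and test cases below are hypothetical, chosen only to illustrate coverage thinking):

```python
# Hypothetical candidate solution: binary search over a sorted list.
def binary_search(arr: list[int], target: int) -> int:
    """Return the index of target in sorted arr, or -1 if absent."""
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2  # overflow-safe in Python, unlike C or Java
        if arr[mid] == target:
            return mid
        if arr[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

# Edge cases a thorough review would verify:
assert binary_search([], 1) == -1        # empty input
assert binary_search([5], 5) == 0        # single element
assert binary_search([1, 3, 5], 1) == 0  # first element
assert binary_search([1, 3, 5], 5) == 2  # last element
assert binary_search([1, 3, 5], 4) == -1 # absent, between elements
```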
// FREQUENTLY ASKED QUESTIONS
How does onboarding work? You attend a one-hour session covering the evaluation rubric and the annotation tool, then complete a calibration set of 10 tasks. Once you pass calibration, live tasks become available immediately.
What are the turnaround expectations? Deadlines are set per batch, typically 48-72 hours.
// READY TO GET STARTED?
Apply in minutes
Create your profile, select your areas of expertise, and start working on frontier AI projects.
Apply Now