Software Output Evaluator
// ROLE SUMMARY
You evaluate AI-generated code: each task pairs a programming prompt with one or more candidate solutions for you to assess, rank, and justify.
// DESCRIPTION
You will evaluate code generated by AI models. Each task shows you a programming prompt and one or more candidate solutions. Your job is to assess correctness, efficiency, readability, and adherence to best practices, then rank the solutions and write a brief justification. Languages vary by project but commonly include Python, JavaScript/TypeScript, Java, C++, and SQL. Some tasks also ask you to identify bugs, suggest fixes, or rate the quality of inline comments and documentation.
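To give a concrete feel for the work, here is a minimal sketch of the kind of task you might see. The prompt, function name, and flagged issues are hypothetical illustrations, not drawn from a real batch.

```python
# Hypothetical prompt: "Return the n most frequent words in a text."
from collections import Counter

def top_words(text: str, n: int) -> list[str]:
    # Candidate solution under review. It runs, but an evaluator would
    # flag two correctness gaps in the justification:
    #   1. no lowercasing, so "The" and "the" are counted separately
    #   2. punctuation is not stripped, so "end." and "end" differ
    counts = Counter(text.split())
    return [word for word, _ in counts.most_common(n)]

print(top_words("the cat and the hat and the bat", 2))  # ['the', 'and']
```

A strong evaluation would rank this below a solution that normalizes case and punctuation, and explain why in a sentence or two.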
// SKILLS & REQUIREMENTS
We look for developers who care about code quality: not just whether the code runs, but whether it is maintainable, efficient, and clear. Experience with code review tools (GitHub PRs, Gerrit, Crucible) is a plus. Strong knowledge of software testing principles also helps, since some evaluations ask you to reason about edge cases and test coverage, as in the sketch below.
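As a hedged sketch of that edge-case reasoning (the function and test cases below are hypothetical, chosen only to illustrate coverage thinking):

```python
# Hypothetical candidate solution: binary search over a sorted list.
def binary_search(arr: list[int], target: int) -> int:
    """Return the index of target in sorted arr, or -1 if absent."""
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2  # overflow-safe in Python, unlike C or Java
        if arr[mid] == target:
            return mid
        if arr[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

# Edge cases a thorough review would verify:
assert binary_search([], 1) == -1        # empty input
assert binary_search([5], 5) == 0        # single element
assert binary_search([1, 3, 5], 1) == 0  # first element
assert binary_search([1, 3, 5], 5) == 2  # last element
assert binary_search([1, 3, 5], 4) == -1 # absent, between elements
```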
// FREQUENTLY ASKED QUESTIONS
How does onboarding work? You attend a one-hour session covering the evaluation rubric and the annotation tool, then complete a calibration set of 10 tasks. Once you pass calibration, live tasks become available immediately.
What are the turnaround expectations? Deadlines are set per batch, typically 48-72 hours.
// READY TO GET STARTED?
Apply in minutes
Create your profile, select your areas of expertise, and start working on frontier AI projects.
Apply Now