Conversational AI Rater

RLHF | $40-45/hr | Remote | Posted January 31, 2026

// DESCRIPTION

We need evaluators who can read AI-generated text critically and make consistent quality judgments under detailed rubrics. On a given day you might compare two explanations of quantum mechanics, two pieces of marketing copy, and two responses to a sensitive personal question. The common thread is careful reading, rubric application, and clear written justifications. Speed matters, but not at the expense of thoughtfulness.

We look for people with sharp critical reading skills and the intellectual range to evaluate responses on topics they may not be experts in. You do not need to know everything -- but you do need to know how to spot when an AI is confidently wrong, subtly misleading, or superficially helpful without actually addressing the question. Prior experience with RLHF annotation pipelines (e.g., at Scale, Surge, or Invisible) is a strong plus.

This is intellectually demanding work. We recommend working in focused blocks with breaks rather than marathon sessions. Most annotators settle into a rhythm of 4-5 hour focused sessions, 4-5 days per week. Compensation is hourly, with accuracy bonuses.

// SKILLS & REQUIREMENTS

- Excellent written English communication
- Familiarity with LLM capabilities and failure modes
- Background in linguistics, philosophy, law, or STEM
- Graduate-level education or equivalent professional experience
- Comfort evaluating content across diverse subject areas
- Experience with RLHF or preference labeling pipelines

// READY TO GET STARTED?

Apply in minutes

Create your profile, select your areas of expertise, and start working on frontier AI projects.
