// ROLE SUMMARY

We need people who can think like attackers. You will methodically test AI models against a taxonomy of failure modes: harmful content generation, jailbreak susceptibility, PII leakage, bias amplification, and more.

Safety Evaluation Analyst

Red Teaming · $65-75/hr · Remote (EU) · Posted February 10, 2026

// DESCRIPTION

We need people who can think like attackers. You will methodically test AI models against a taxonomy of failure modes: harmful content generation, jailbreak susceptibility, PII leakage, bias amplification, and more. Each test is logged in a structured format that feeds into the safety team's tracking system. Successful exploits are prioritized for mitigation; unsuccessful attempts still provide valuable negative evidence.
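To give a concrete sense of what "logged in a structured format" can mean in practice, here is a minimal, hypothetical sketch of a single test record in Python. The field names and values are illustrative assumptions only, not the actual reporting template, which is covered during onboarding.

```python
# Illustrative sketch of a structured red-team test record.
# Field names (category, outcome, severity, etc.) are assumptions,
# not the safety team's real reporting template.
import json
from dataclasses import dataclass, asdict, field
from datetime import datetime, timezone

@dataclass
class RedTeamFinding:
    model_id: str                 # model under test
    category: str                 # failure mode from the taxonomy, e.g. "jailbreak"
    prompt: str                   # exact input used, so the exploit is reproducible
    outcome: str                  # "exploit_succeeded" or "exploit_failed"
    severity: str = "unrated"     # triage hint for the mitigation queue
    notes: str = ""               # analyst commentary
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

finding = RedTeamFinding(
    model_id="example-model-v1",
    category="pii_leakage",
    prompt="[redacted test prompt]",
    outcome="exploit_failed",
    notes="Model refused; logged as negative evidence.",
)

# Serialize to JSON so the record could feed a tracking system.
print(json.dumps(asdict(finding), indent=2))
```

Note that failed attempts are recorded with the same care as successful exploits; the negative evidence matters for coverage.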

A background in cybersecurity, penetration testing, or adversarial ML is ideal, but we have also had strong hires from journalism, law, and creative writing -- anyone who is good at finding holes in systems and articulating what they find. You need to be comfortable working with sensitive content categories (violence, hate speech, self-harm) in a clinical, analytical context. Emotional resilience is not optional.

Onboarding includes a detailed walkthrough of our taxonomy of failure modes, the reporting template, and the specific model you will be testing. After onboarding, you work independently but can raise urgent findings through a priority Slack channel.

// SKILLS & REQUIREMENTS

- Emotional resilience when encountering disturbing content
- Creative and lateral thinking about system vulnerabilities
- Comfort working with sensitive content categories
- Ability to reproduce and clearly document exploits
- Familiarity with prompt engineering techniques
- Experience in cybersecurity, pen testing, or adversarial ML
- Background in security research, journalism, or law

// READY TO GET STARTED?

Apply in minutes

Create your profile, select your areas of expertise, and start working on frontier AI projects.

Apply Now