yatin-superintelligence | other

Adversarial Agent Intent Safety Analysis 240K

A dataset of 242,454 adversarial prompts with safety evaluations, designed to train AI models to identify dual-use threats and malicious intent hidden within legitimate-sounding requests. It features deep intent analysis across 126 risk vectors, separating a request's surface interpretation from its true capability impact.

Downloads: 87
Episodes: 242,454
Likes: 8

Why This Matters for Physical AI

Although this is primarily a language-based safety dataset, it is relevant to physical AI. Autonomous robotic systems must likewise detect adversarial intent in natural language commands and recognize when a request masks dangerous dual-use capabilities that could cause physical harm.

Technical Profile

Modalities: language
Task Types: red-teaming, safety-alignment, intent-classification
Episodes: 242,454
Data Format: parquet
Annotation Types: language_instructions, intent_analysis, risk_labels, clarifying_questions
License: other
Part of the Adversarial Agent Intent Safety Analysis 240K family
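
As a rough illustration of how the four annotation types listed above might fit together in one record, here is a minimal Python sketch. The field names, their types, and the placeholder rows are assumptions made for illustration; the actual parquet column names and value formats are not specified on this page and may differ.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class IntentRecord:
    """Hypothetical row layout; the real parquet columns may be named differently."""
    language_instruction: str   # the adversarial prompt text
    intent_analysis: str        # analysis of the intent behind the prompt
    risk_labels: List[str] = field(default_factory=list)           # assumed: zero or more risk tags
    clarifying_questions: List[str] = field(default_factory=list)  # assumed: follow-up questions


def count_risk_labels(records: Iterable[IntentRecord]) -> Counter:
    """Tally how often each risk label appears across the given records."""
    counts: Counter = Counter()
    for record in records:
        counts.update(record.risk_labels)
    return counts


# Invented placeholder rows, not taken from the dataset.
sample = [
    IntentRecord("placeholder prompt A", "placeholder analysis",
                 ["label_x"], ["placeholder question"]),
    IntentRecord("placeholder prompt B", "placeholder analysis",
                 ["label_x", "label_y"]),
]
label_counts = count_risk_labels(sample)
```

To work with the real files, a parquet reader such as `pandas.read_parquet` would be needed, and the record fields would have to be mapped to whatever column names the files actually use.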
