yatin-superintelligence | other
Adversarial Agent Intent Safety Analysis 240K
A dataset of 242,454 adversarial prompts with safety evaluations designed to train AI models to identify dual-use threats and malicious intent hidden within legitimate-sounding requests. Features deep intent analysis across 126 risk vectors to decouple surface interpretation from true capability impact.
Downloads: 87
Episodes: 242454
Likes: 8
Why This Matters for Physical AI
While primarily a language-based safety dataset, this work is relevant to physical AI because autonomous robotic systems must similarly detect adversarial intent in natural language commands and recognize when requests mask dangerous dual-use capabilities that could cause physical harm.
Technical Profile
- Modalities: language
- Task Types: red-teaming, safety-alignment, intent-classification
- Episodes: 242454
- Data Format: parquet
- Annotation Types: language_instructions, intent_analysis, risk_labels, clarifying_questions
- License: other
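The annotation types listed above suggest a per-record shape along the following lines. This is a minimal sketch, not the dataset's actual schema: the class name `SafetyRecord`, the helper `flagged`, the field names (borrowed from the annotation type tags), the label `"dual_use"`, and the sample rows are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SafetyRecord:
    # Hypothetical fields mirroring the listed annotation types;
    # the real parquet column names may differ.
    language_instruction: str
    intent_analysis: str
    risk_labels: List[str] = field(default_factory=list)
    clarifying_questions: List[str] = field(default_factory=list)


def flagged(records: List[SafetyRecord], label: str) -> List[SafetyRecord]:
    """Return the records whose risk_labels contain the given label."""
    return [r for r in records if label in r.risk_labels]


# Placeholder rows, not taken from the dataset.
records = [
    SafetyRecord("example prompt A", "benign request"),
    SafetyRecord(
        "example prompt B",
        "possible dual-use request",
        risk_labels=["dual_use"],
        clarifying_questions=["What is the intended use?"],
    ),
]

dual_use = flagged(records, "dual_use")
```

Since the data ships as parquet, the real files could presumably be loaded with a reader such as `pandas.read_parquet` and the actual column names inspected before relying on any of the field names assumed here.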