Lennittusmit
DESPITE: Deterministic Evaluation of Safe Planning In embodied Task Execution
A benchmark for evaluating large language models on embodied safe task planning, derived from multiple sources including ALFRED, BDDL, VirtualHome, NormBank, and NEISS.
Downloads: 128
Likes: 4
Technical Profile
- Modalities: language
- Environment: simulation
- Task Types: task-planning, manipulation
- License: MIT