Lennittusmit

DESPITE: Deterministic Evaluation of Safe Planning In embodied Task Execution

A benchmark for evaluating large language models on embodied safe task planning, derived from multiple sources including ALFRED, BDDL, VirtualHome, NormBank, and NEISS.

Downloads: 128
Likes: 4

Technical Profile

Modalities
language
Environment
simulation
Task Types
task-planning, manipulation
License
mit
Part of the DESPITE: Deterministic Evaluation of Safe Planning In embodied Task Execution family
