An extensible benchmark for evaluating large language models on planning
Created 2022-05-28
34 commits to main branch, last one 9 days ago