[AAAI 2025] ORQA is a new QA benchmark designed to assess the reasoning capabilities of LLMs in the specialized technical domain of Operations Research. The benchmark evaluates whether LLMs can emulate ...
Created: 2024-12-21
18 commits to main branch, last one 12 days ago