2 results found

A curated collection of resources for LLM mathematical reasoning, most of which are screened by @tongyx361 to ensure high quality and accompanied by carefully written, concise descriptions to help readers get ...
Created 2024-04-18
26 commits to main branch, last one 8 months ago
[AAAI 2025] ORQA is a new QA benchmark designed to assess the reasoning capabilities of LLMs in the specialized technical domain of Operations Research. The benchmark evaluates whether LLMs can emulate ...
Created 2024-12-21
18 commits to main branch, last one 6 days ago