2 results found

Code, datasets, models for the paper "Automatic Evaluation of Attribution by Large Language Models"
Created 2023-05-08
13 commits to main branch, last one about a year ago
CompBench evaluates the comparative reasoning of multimodal large language models (MLLMs) with 40K image pairs and questions across 8 dimensions of relative comparison: visual attribute, existence, st...
Created 2024-07-23
4 commits to main branch, last one 5 months ago