Search Results - RepositoryStats

2 results found

AttrScore OSU-NLP-Group

License: MIT

Code, datasets, and models for the paper "Automatic Evaluation of Attribution by Large Language Models"

llms gpt-4 chatgpt attribution evaluation-llms large-language-model large-language-models natural-language-processing

Created 2023-05-08

13 commits to main branch, last one about a year ago

CompBench RaptorMai

License: other

CompBench evaluates the comparative reasoning of multimodal large language models (MLLMs) with 40K image pairs and questions across 8 dimensions of relative comparison: visual attribute, existence, st...

llms benchmark reasoning evaluation-llms human-annotation foundation-models llms-benchmarking vision-and-language large-language-models vision-language-model multimodal-deep-learning multimodal-large-language-models

Created 2024-07-23

4 commits to main branch, last one 5 months ago