5 results found
Deliver safe & effective language models
Created: 2022-11-18
5,611 commits to main branch, last one 11 days ago
Python SDK for running evaluations on LLM-generated responses
Created: 2023-11-22
686 commits to main branch, last one 17 hours ago
A simple GPT-based evaluation tool for multi-aspect, interpretable assessment of LLMs.
Created: 2023-11-19
40 commits to main branch, last one 11 months ago
Python SDK for experimenting, testing, evaluating & monitoring LLM-powered applications - Parea AI (YC S23)
Created: 2023-07-24
1,067 commits to main branch, last one 3 months ago
[ACL'24] A Knowledge-grounded Interactive Evaluation Framework for Large Language Models
Created: 2024-02-23
19 commits to master branch, last one 5 months ago