5 results found
Deliver safe & effective language models
Created: 2022-11-18
5,461 commits to main branch, last one 2 months ago
Python SDK for running evaluations on LLM generated responses
Created: 2023-11-22
606 commits to main branch, last one 8 hours ago
A simple GPT-based evaluation tool for multi-aspect, interpretable assessment of LLMs.
Created: 2023-11-19
40 commits to main branch, last one 10 months ago
Python SDK for experimenting, testing, evaluating & monitoring LLM-powered applications - Parea AI (YC S23)
Created: 2023-07-24
1,067 commits to main branch, last one 2 months ago
[ACL'24] A Knowledge-grounded Interactive Evaluation Framework for Large Language Models
Created: 2024-02-23
19 commits to master branch, last one 4 months ago