4 results found
- Filter by Primary Language:
- Python (3)
- TypeScript (1)
Test your prompts, agents, and RAGs. Use LLM evals to improve your app's quality and catch problems. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with comman...
Created: 2023-04-28
1,361 commits to main branch, last one 20 hours ago
The LLM Evaluation Framework
Created: 2023-08-10
3,019 commits to main branch, last one a day ago
The official evaluation suite and dynamic data release for MixEval.
Created: 2024-06-01
35 commits to main branch, last one 7 days ago
Python SDK for experimenting, testing, evaluating & monitoring LLM-powered applications - Parea AI (YC S23)
Created: 2023-07-24
925 commits to main branch, last one 6 days ago