6 results found
Evaluate your LLM's response with Prometheus and GPT4 💯
Created
2024-04-18
203 commits to main branch, last one about a month ago
🤠 Agent-as-a-Judge and DevAI dataset
Created
2024-10-16
20 commits to main branch, last one 5 days ago
xFinder: Robust and Pinpoint Answer Extraction for Large Language Models
Created
2024-05-19
33 commits to main branch, last one 9 days ago
CodeUltraFeedback: aligning large language models to coding preferences
Created
2024-01-25
51 commits to main branch, last one 4 months ago
This is the repo for the survey of Bias and Fairness in IR with LLMs.
Created
2024-03-18
49 commits to main branch, last one 8 days ago
Official implementation for "MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?"
Created
2024-06-11
25 commits to main branch, last one 3 months ago