4 results found
⚗️ distilabel is a framework for synthetic data and AI feedback for AI engineers who require high-quality outputs, full data ownership, and overall efficiency.
Created 2023-10-16
618 commits to main branch, last one 5 days ago
A continually updated list of literature on Reinforcement Learning from AI Feedback (RLAIF)
Created 2023-09-17
36 commits to main branch, last one 2 days ago
ZYN: Zero-Shot Reward Models with Yes-No Questions
Created 2023-03-03
21 commits to main branch, last one 10 months ago
Code for the paper "Query-Dependent Prompt Evaluation and Optimization with Offline Inverse Reinforcement Learning"
Created 2023-09-10
58 commits to main branch, last one 3 months ago