4 results found

70
1.1k
apache-2.0
12
⚗️ distilabel is a framework for synthetic data and AI feedback, built for AI engineers who require high-quality outputs, full data ownership, and overall efficiency.
Created 2023-10-16
618 commits to main branch, last one 5 days ago
A continually updated list of literature on Reinforcement Learning from AI Feedback (RLAIF)
Created 2023-09-17
36 commits to main branch, last one 2 days ago
ZYN: Zero-Shot Reward Models with Yes-No Questions
Created 2023-03-03
21 commits to main branch, last one 10 months ago
Code for the paper "Query-Dependent Prompt Evaluation and Optimization with Offline Inverse Reinforcement Learning"
Created 2023-09-10
58 commits to main branch, last one 3 months ago