5 results found

143
1.8k
apache-2.0
17
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
Created 2023-10-16
779 commits to main branch, last one 3 days ago
4
143
apache-2.0
6
A continually updated list of literature on Reinforcement Learning from AI Feedback (RLAIF)
Created 2023-09-17
39 commits to main branch, last one 2 months ago
Code for the paper "Query-Dependent Prompt Evaluation and Optimization with Offline Inverse Reinforcement Learning"
Created 2023-09-10
58 commits to main branch, last one 9 months ago
ZYN: Zero-Shot Reward Models with Yes-No Questions
Created 2023-03-03
21 commits to main branch, last one about a year ago
0
25
apache-2.0
3
Framework for building synthetic datasets with AI feedback
Created 2024-10-28
231 commits to master branch, last one a day ago