5 results found
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
Created: 2023-10-16
779 commits to main branch, last one 3 days ago
A continually updated list of literature on Reinforcement Learning from AI Feedback (RLAIF)
Created: 2023-09-17
39 commits to main branch, last one 2 months ago
Code for the paper "Query-Dependent Prompt Evaluation and Optimization with Offline Inverse Reinforcement Learning"
Created: 2023-09-10
58 commits to main branch, last one 9 months ago
ZYN: Zero-Shot Reward Models with Yes-No Questions
Created: 2023-03-03
21 commits to main branch, last one about a year ago
Framework for building synthetic datasets with AI feedback
Created: 2024-10-28
231 commits to master branch, last one a day ago