6 results found

179
2.5k
apache-2.0
22
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
Created 2023-10-16
840 commits to main branch, last one about a month ago
4
157
apache-2.0
6
A continually updated list of literature on Reinforcement Learning from AI Feedback (RLAIF)
Created 2023-09-17
41 commits to main branch, last one about a month ago
Official Implementation of VideoDPO
Created 2024-12-19
91 commits to main branch, last one about a month ago
Code for the paper "Query-Dependent Prompt Evaluation and Optimization with Offline Inverse Reinforcement Learning"
Created 2023-09-10
58 commits to main branch, last one 11 months ago
ZYN: Zero-Shot Reward Models with Yes-No Questions
Created 2023-03-03
21 commits to main branch, last one about a year ago
0
27
apache-2.0
2
This repository has no description...
Created 2024-10-28
260 commits to master branch, last one 2 months ago