6 results found

160
2.2k
apache-2.0
19
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
Created 2023-10-16
840 commits to main branch, last one 4 days ago
4
151
apache-2.0
6
A continually updated list of literature on Reinforcement Learning from AI Feedback (RLAIF)
Created 2023-09-17
41 commits to main branch, last one 9 days ago
Official Implementation of VideoDPO
Created 2024-12-19
91 commits to main branch, last one 20 days ago
Code for the paper "Query-Dependent Prompt Evaluation and Optimization with Offline Inverse Reinforcement Learning"
Created 2023-09-10
58 commits to main branch, last one 10 months ago
ZYN: Zero-Shot Reward Models with Yes-No Questions
Created 2023-03-03
21 commits to main branch, last one about a year ago
0
27
apache-2.0
3
This repository has no description...
Created 2024-10-28
260 commits to master branch, last one about a month ago