6 results found
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
Created
2023-10-16
840 commits to main branch, last one 4 days ago
A continually updated list of literature on Reinforcement Learning from AI Feedback (RLAIF)
Created
2023-09-17
41 commits to main branch, last one 9 days ago
Official Implementation of VideoDPO
Created
2024-12-19
91 commits to main branch, last one 20 days ago
Code for the paper "Query-Dependent Prompt Evaluation and Optimization with Offline Inverse Reinforcement Learning"
Created
2023-09-10
58 commits to main branch, last one 10 months ago
ZYN: Zero-Shot Reward Models with Yes-No Questions
Created
2023-03-03
21 commits to main branch, last one about a year ago
This repository has no description...
Created
2024-10-28
260 commits to master branch, last one about a month ago