7 results found

337
3.6k
apache-2.0
28
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
Created 2023-07-30
1,042 commits to main branch, last one 19 hours ago
119
1.4k
apache-2.0
18
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
Created 2023-05-15
111 commits to main branch, last one 6 months ago
59
788
apache-2.0
9
A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data.
Created 2023-05-03
72 commits to main branch, last one 10 months ago
10
175
apache-2.0
5
Super-Efficient RLHF Training of LLMs with Parameter Reallocation
Created 2024-06-18
1,077 commits to main branch, last one 7 days ago
2
92
apache-2.0
6
Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback
Created 2023-07-28
35 commits to main branch, last one about a year ago
A repo for RLHF training and best-of-N (BoN) sampling over LLMs, with support for reward model ensembles.
Created 2023-12-02
2 commits to main branch, last one 10 months ago
Official code for the ICML 2024 Spotlight paper "RIME: Robust Preference-based Reinforcement Learning with Noisy Preferences"
Created 2024-04-04
17 commits to main branch, last one 2 months ago