7 results found

266
2.8k
apache-2.0
24
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention)
Created 2023-07-30
979 commits to main branch, last one 2 days ago
120
1.4k
apache-2.0
18
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
Created 2023-05-15
111 commits to main branch, last one 5 months ago
59
784
apache-2.0
9
A simulation framework for RLHF and alternatives. Develop your RLHF method without collecting human data.
Created 2023-05-03
72 commits to main branch, last one 9 months ago
6
132
apache-2.0
3
Super-Efficient RLHF Training of LLMs with Parameter Reallocation
Created 2024-06-18
1,073 commits to main branch, last one 6 days ago
2
91
apache-2.0
5
Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback
Created 2023-07-28
35 commits to main branch, last one about a year ago
A repo for RLHF training and BoN over LLMs, with support for reward model ensembles.
Created 2023-12-02
2 commits to main branch, last one 8 months ago
Official code for the ICML 2024 Spotlight paper "RIME: Robust Preference-based Reinforcement Learning with Noisy Preferences"
Created 2024-04-04
17 commits to main branch, last one about a month ago