2 results found

12
182
apache-2.0
5
🌾 OAT: A research-friendly framework for LLM online alignment, including preference learning, reinforcement learning, etc.
Created 2024-10-15
28 commits to main branch, last one 7 days ago
Implementation of distributed reinforcement learning with distributed TensorFlow
Created 2020-04-07
44 commits to master branch, last one 4 years ago