3 results found

12
186
apache-2.0
5
🌾 OAT: A research-friendly framework for LLM online alignment, including preference learning, reinforcement learning, etc.
Created 2024-10-15
28 commits to main branch, last one 11 days ago
26
131
apache-2.0
4
👤 Multi-Armed Bandit Algorithms Library (MAB) 👮
Created 2019-01-24
69 commits to master branch, last one 2 years ago
7
53
apache-2.0
8
Library for multi-armed bandit selection strategies, including efficient deterministic implementations of Thompson sampling and epsilon-greedy.
Created 2021-02-18
20 commits to main branch, last one 3 years ago