3 results found

13
232
apache-2.0
5
🌾 OAT: A research-friendly framework for LLM online alignment, including preference learning, reinforcement learning, etc.
Created 2024-10-15
31 commits to main branch, last one 2 days ago
26
132
apache-2.0
3
👤 Multi-Armed Bandit Algorithms Library (MAB) 👮
Created 2019-01-24
69 commits to master branch, last one 2 years ago
7
54
apache-2.0
7
Library for multi-armed bandit selection strategies, including efficient deterministic implementations of Thompson sampling and epsilon-greedy.
Created 2021-02-18
20 commits to main branch, last one 4 years ago