9 results found
- Filter by Primary Language:
- Python (8)
- Jupyter Notebook (1)
RewardBench: the first evaluation tool for reward models.
Created
2023-12-23
210 commits to main branch, last one about a month ago
Free and open source code of the https://tournesol.app platform. Meet the community on Discord https://discord.gg/WvcSG55Bf3
Created
2021-03-23
1,595 commits to main branch, last one 5 days ago
Explore concepts like Self-Correct, Self-Refine, Self-Improve, Self-Contradict, Self-Play, and Self-Knowledge, alongside o1-like reasoning elevation🍓 and hallucination alleviation🍄.
Created
2024-06-01
35 commits to master branch, last one 18 days ago
The MAGICAL benchmark suite for robust imitation learning (NeurIPS 2020)
Created
2019-11-12
144 commits to pyglet1.5 branch, last one 3 years ago
This repository contains the source code for our paper: "NaviSTAR: Socially Aware Robot Navigation with Hybrid Spatio-Temporal Graph Transformer and Preference Learning". For more details, please refe...
Created
2023-04-20
36 commits to master branch, last one 24 days ago
Python-based GUI for collecting chemists' feedback on molecules
Created
2024-04-29
31 commits to main branch, last one 3 months ago
Official implementation of Bootstrapping Language Models via DPO Implicit Rewards
Created
2024-06-12
6 commits to main branch, last one 5 months ago
Code for the paper "Aligning LLM Agents by Learning Latent Preference from User Edits".
Created
2024-04-25
15 commits to main branch, last one 10 days ago
Official code for ICML 2024 paper, "RIME: Robust Preference-based Reinforcement Learning with Noisy Preferences" (ICML 2024 Spotlight)
Created
2024-04-04
17 commits to main branch, last one about a month ago