9 results found

56
475
apache-2.0
5
RewardBench: the first evaluation tool for reward models.
Created 2023-12-23
211 commits to main branch, last one 24 days ago
Free and open source code of the https://tournesol.app platform. Meet the community on Discord https://discord.gg/WvcSG55Bf3
Created 2021-03-23
1,602 commits to main branch, last one 14 days ago
Explore concepts like Self-Correct, Self-Refine, Self-Improve, Self-Contradict, Self-Play, and Self-Knowledge, alongside o1-like reasoning elevation🍓 and hallucination alleviation🍄.
Created 2024-06-01
36 commits to master branch, last one 28 days ago
11
76
isc
7
The MAGICAL benchmark suite for robust imitation learning (NeurIPS 2020)
Created 2019-11-12
144 commits to pyglet1.5 branch, last one 3 years ago
This repository contains the source code for our paper: "NaviSTAR: Socially Aware Robot Navigation with Hybrid Spatio-Temporal Graph Transformer and Preference Learning". For more details, please refe...
Created 2023-04-20
36 commits to master branch, last one about a month ago
Python-based GUI for collecting chemists' feedback on molecules
Created 2024-04-29
31 commits to main branch, last one 4 months ago
2
41
mit
3
Official implementation of Bootstrapping Language Models via DPO Implicit Rewards
Created 2024-06-12
6 commits to main branch, last one 6 months ago
Code for the paper "Aligning LLM Agents by Learning Latent Preference from User Edits".
Created 2024-04-25
15 commits to main branch, last one about a month ago
Official code for the ICML 2024 Spotlight paper "RIME: Robust Preference-based Reinforcement Learning with Noisy Preferences"
Created 2024-04-04
17 commits to main branch, last one 2 months ago