9 results found

53
446
apache-2.0
5
RewardBench: the first evaluation tool for reward models.
Created 2023-12-23
210 commits to main branch, last one about a month ago
Free and open source code of the https://tournesol.app platform. Meet the community on Discord https://discord.gg/WvcSG55Bf3
Created 2021-03-23
1,595 commits to main branch, last one 5 days ago
Explore concepts like Self-Correct, Self-Refine, Self-Improve, Self-Contradict, Self-Play, and Self-Knowledge, alongside o1-like reasoning elevation🍓 and hallucination alleviation🍄.
Created 2024-06-01
35 commits to master branch, last one 18 days ago
11
75
isc
7
The MAGICAL benchmark suite for robust imitation learning (NeurIPS 2020)
Created 2019-11-12
144 commits to pyglet1.5 branch, last one 3 years ago
This repository contains the source code for our paper: "NaviSTAR: Socially Aware Robot Navigation with Hybrid Spatio-Temporal Graph Transformer and Preference Learning". For more details, please refe...
Created 2023-04-20
36 commits to master branch, last one 24 days ago
Python-based GUI for collecting chemists' feedback on molecules
Created 2024-04-29
31 commits to main branch, last one 3 months ago
2
39
mit
3
Official implementation of Bootstrapping Language Models via DPO Implicit Rewards
Created 2024-06-12
6 commits to main branch, last one 5 months ago
Code for the paper "Aligning LLM Agents by Learning Latent Preference from User Edits".
Created 2024-04-25
15 commits to main branch, last one 10 days ago
Official code for the ICML 2024 Spotlight paper "RIME: Robust Preference-based Reinforcement Learning with Noisy Preferences"
Created 2024-04-04
17 commits to main branch, last one about a month ago