Search Results - RepositoryStats

reward-bench allenai

65

557

apache-2.0

6

RewardBench: the first evaluation tool for reward models.

rlhf preference-learning

Created 2023-12-23

219 commits to main branch, last one about a month ago

tournesol tournesol-app

50

348

other

10

Free and open-source code of the https://tournesol.app platform. Meet the community on Discord: https://discord.gg/WvcSG55Bf3

django python dataset reactjs youtube ai-ethics social-choice bradley-terry-model preference-learning django-rest-framework recommendation-engine preference-aggregation golden-ratio-optimization

Created 2021-03-23

1,654 commits to main branch, last one 5 days ago

ICSFSurvey IAAR-Shanghai

5

166

unknown

4

Explore concepts like Self-Correct, Self-Refine, Self-Improve, Self-Contradict, Self-Play, and Self-Knowledge, alongside o1-like reasoning elevation🍓 and hallucination alleviation🍄.

decoding reasoning self-refine self-correct hallucination self-feedback attention-head self-correction chain-of-thought self-consistency self-improvement data-augmentation preference-learning internal-consistency large-language-model large-language-models knowledge-distillation

Created 2024-06-01

36 commits to master branch, last one 4 months ago

magical qxcv

11

77

isc

6

The MAGICAL benchmark suite for robust imitation learning (NeurIPS 2020)

imitation-learning preference-learning reinforcement-learning reinforcement-learning-environments

Created 2019-11-12

144 commits to pyglet1.5 branch, last one 3 years ago

SAN-NaviSTAR SMARTlab-Purdue

5

55

mit

4

This repository contains the source code for our paper: "NaviSTAR: Socially Aware Robot Navigation with Hybrid Spatio-Temporal Graph Transformer and Preference Learning". For more details, please refe...

transformer machine-learning robot-navigation preference-learning reinforcement-learning socially-aware-navigation

Created 2023-04-20

38 commits to master branch, last one about a month ago