snu-mllab / DPPO

Official implementation of "Direct Preference-based Policy Optimization without Reward Modeling" (NeurIPS 2023)

Date Created 2023-10-08 (about a year ago)

Commits 13 (last one 7 months ago)

Stargazers 41 (0 this week)

Watchers 2 (0 this week)

Forks 1

License MIT

Ranking

RepositoryStats indexes 623,832 repositories. Of these, snu-mllab/DPPO is ranked #534,421 (14th percentile) for total stargazers and #478,496 for total watchers. GitHub reports the primary language for this repository as Python; among repositories using this language, it is ranked #105,394/126,749.

snu-mllab/DPPO is also tagged with popular topics, for which it is ranked: reinforcement-learning (#1,232/1,420)

All Topics

rlhf reinforcement-learning offline-reinforcement-learning preference-based-reinforcement-learning

Star History

GitHub stargazers over time

Watcher History

GitHub watchers over time, collection started in '23

Recent Commit History

13 commits on the default branch (main) since Jan '22

Yearly Commits

Commits to the default branch (main) per year

Issue History

Languages

The primary language is Python, but there are also others...

updated: 2025-01-06 @ 06:05pm, id: 702096899 / R_kgDOKdkmAw