TianduoWang / DPO-ST

[ACL 2024] Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning

Date Created 2024-06-04 (6 months ago)

Commits 9 (last one 4 months ago)

Stargazers 30 (0 this week)

Watchers 3 (0 this week)

Forks 5

License MIT

Ranking

RepositoryStats indexes 595,856 repositories. Of these, TianduoWang/DPO-ST is ranked #569,455 (4th percentile) for total stargazers and #427,587 for total watchers. GitHub reports the primary language for this repository as Python; among repositories using this language, it is ranked #113,204 of 119,431.
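As a rough illustration of how a rank relates to a percentile, the sketch below assumes the percentile is the share of indexed repositories ranked below the given one. That definition is an assumption, not something RepositoryStats documents here, and the function name `percentile_from_rank` is hypothetical. Under this assumption the stargazer figures quoted above give a value between 4 and 5, which is consistent with the "4th percentile" label.

```python
def percentile_from_rank(rank: int, total: int) -> float:
    """Return the percentage of repositories ranked below `rank` (rank 1 is best).

    Assumed definition for illustration; the site's exact formula is not stated.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return 100.0 * (total - rank) / total


# Stargazer rank and index size as quoted in the ranking paragraph.
stargazer_pct = percentile_from_rank(569_455, 595_856)
print(f"stargazers: {stargazer_pct:.2f}th percentile")
```

The same function can be applied to the watcher rank or to the Python-only rank by passing the corresponding totals.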

Other Information

Homepage URL: https://arxiv.org/abs/2407.18248

All Topics

dpo chain-of-thought math-word-problem

Star History

GitHub stargazers over time

Watcher History

GitHub watchers over time (collection started in '23)

Recent Commit History

9 commits on the default branch (main) since Jan '22

Yearly Commits

Commits to the default branch (main) per year

Issue History

No issues have been posted

Languages

The only known language in this repository is Python

updated: 2024-12-18 @ 11:33pm, id: 810386399 / R_kgDOME2D3w