3 results found

22
163
apache-2.0
2
LLaMA-TRL: Fine-tuning LLaMA with PPO and LoRA
Created 2023-03-30
66 commits to main branch, last one about a year ago
Notus is a collection of fine-tuned LLMs using SFT, DPO, SFT+DPO, and/or any other RLHF techniques, while always keeping a data-first approach
Created 2023-11-16
52 commits to main branch, last one 6 months ago
1
25
unknown
2
Implements reinforcement learning training for GPT-2, LLaMA, BLOOM, and other LLMs
Created 2023-04-19
123 commits to main branch, last one 9 months ago