3 results found
LLaMA-TRL: Fine-tuning LLaMA with PPO and LoRA
Created: 2023-03-30
66 commits to main branch, last one about a year ago
Notus is a collection of fine-tuned LLMs using SFT, DPO, SFT+DPO, and/or any other RLHF techniques, while always keeping a data-first approach
Created: 2023-11-16
52 commits to main branch, last one 6 months ago