5 results found

23
197
apache-2.0
2
LLaMA-TRL: Fine-tuning LLaMA with PPO and LoRA
Created 2023-03-30
66 commits to main branch, last one about a year ago
Notus is a collection of fine-tuned LLMs using SFT, DPO, SFT+DPO, and/or any other RLHF techniques, while always keeping a data-first approach
Created 2023-11-16
52 commits to main branch, last one about a year ago
Fine-tuning large language models with the DPO algorithm; simple and easy to get started with.
Created 2024-03-27
16 commits to master branch, last one 7 months ago
Various training, inference, and validation code and results related to open LLMs that were pretrained (fully or partially) on the Dutch language.
Created 2023-07-02
29 commits to main branch, last one 9 months ago
2
26
unknown
2
Implements reinforcement learning training for LLMs such as GPT-2, LLaMA, BLOOM, and others.
Created 2023-04-19
123 commits to main branch, last one about a year ago