23 results found

412
4.7k
apache-2.0
23
Use PEFT or full-parameter training to finetune 400+ LLMs (Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, ...) or 100+ MLLMs (Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, InternVL...
Created 2023-08-01
1,273 commits to main branch, last one 6 hours ago
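Several of the repositories in these results fine-tune with PEFT adapters such as LoRA. As a rough, generic illustration (not code from any repository listed here), a LoRA layer adds a trainable low-rank update to a frozen weight matrix; the function name and shapes below are hypothetical:

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=16):
    """LoRA forward pass: y = x @ (W + (alpha / r) * A @ B).

    W (d_in x d_out) is the frozen pretrained weight; only the
    low-rank factors A (d_in x r) and B (r x d_out) are trained.
    """
    r = A.shape[1]  # adapter rank
    # Equivalent to x @ (W + (alpha / r) * A @ B), but computed
    # without materializing the full-rank update.
    return x @ W + (alpha / r) * (x @ A) @ B
```

With B initialized to zeros (the usual LoRA convention), the adapted layer starts out identical to the frozen base layer, so training begins from the pretrained behavior.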
513
3.4k
apache-2.0
38
MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. Trains a medical LLM, implementing incremental pretraining (PT), supervised fine-tuning (SFT), RLHF, DPO, and ORPO.
Created 2023-06-02
540 commits to main branch, last one 4 days ago
46
768
apache-2.0
8
A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs).
Created 2023-12-03
212 commits to main branch, last one 9 days ago
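The DPO loss that many of these repositories implement scores a preference pair by the policy's implicit reward margin over a frozen reference model. A minimal sketch of the standard DPO objective for a single pair, generic and not taken from the library above:

```python
import math

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one preference pair.

    Inputs are summed log-probabilities of the chosen and rejected
    responses under the trained policy (pi_*) and the frozen
    reference model (ref_*).
    """
    # Implicit rewards: beta * (log pi - log pi_ref)
    chosen_reward = beta * (pi_chosen - ref_chosen)
    rejected_reward = beta * (pi_rejected - ref_rejected)
    # Loss = -log sigmoid(reward margin)
    margin = chosen_reward - rejected_reward
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

When the policy matches the reference on both responses, the margin is zero and the loss is log 2; making the chosen response relatively more likely than the rejected one drives the loss down.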
63
583
apache-2.0
9
Easy and efficient finetuning of LLMs (supports LLaMA, LLaMA2, LLaMA3, Qwen, Baichuan, GLM, Falcon). Efficient quantized training and deployment of large models.
Created 2023-05-25
537 commits to main branch, last one 5 months ago
A Deep Learning NLP repository using TensorFlow, covering everything from text preprocessing to downstream tasks with recent models such as Topic Models, BERT, GPT, and LLMs.
Created 2021-12-30
259 commits to main branch, last one 3 months ago
Implementation for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs"
Created 2024-06-24
19 commits to main branch, last one 5 months ago
54
284
apache-2.0
3
An Efficient "Factory" to Build Multiple LoRA Adapters
Created 2023-08-24
306 commits to main branch, last one 12 days ago
Align Anything: Training All-modality Model with Feedback
Created 2024-07-14
66 commits to main branch, last one 2 days ago
23
234
mit
8
SiLLM simplifies the process of training and running Large Language Models (LLMs) on Apple Silicon by leveraging the MLX framework.
Created 2024-01-11
485 commits to main branch, last one 2 days ago
Aesthetic Post-Training Diffusion Models from Generic Preferences with Step-by-step Preference Optimization
Created 2024-05-26
1 commit to main branch, last one 4 days ago
Notus is a collection of fine-tuned LLMs using SFT, DPO, SFT+DPO, and/or any other RLHF techniques, while always keeping a data-first approach
Created 2023-11-16
52 commits to main branch, last one about a year ago
IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation
Created 2024-10-09
10 commits to main branch, last one about a month ago
7
140
apache-2.0
4
A RLHF Infrastructure for Vision-Language Models
Created 2023-12-27
7 commits to main branch, last one about a month ago
Technical analysis library for .NET
Created 2016-06-30
64 commits to master branch, last one 3 years ago
This repository contains the code for SFT, RLHF, and DPO, designed for vision-based LLMs, including the LLaVA models and the LLaMA-3.2-vision models.
Created 2024-06-29
40 commits to master branch, last one 2 months ago
6
72
apache-2.0
5
🌾 OAT: Online AlignmenT for LLMs
Created 2024-10-15
22 commits to main branch, last one 13 hours ago
CodeUltraFeedback: aligning large language models to coding preferences
Created 2024-01-25
51 commits to main branch, last one 5 months ago
SuperCorrect: Supervising and Correcting Language Models with Error-Driven Insights
Created 2024-10-11
9 commits to main branch, last one 2 months ago
0
36
unknown
2
[NeurIPS 2024] Official code of $\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$
Created 2024-05-22
9 commits to main branch, last one about a month ago
[ACL 2024] Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning
Created 2024-06-04
9 commits to main branch, last one 4 months ago
Fine-tuning large language models with the DPO algorithm; simple and easy to get started with.
Created 2024-03-27
16 commits to master branch, last one 5 months ago
Various training, inference and validation code and results related to Open LLM's that were pretrained (full or partially) on the Dutch language.
Created 2023-07-02
29 commits to main branch, last one 8 months ago
0
25
apache-2.0
3
Framework for building synthetic datasets with AI feedback
Created 2024-10-28
231 commits to master branch, last one 18 hours ago