22 results found
- Filter by Primary Language:
- Python (19)
- Jupyter Notebook (2)
- C# (1)
Use PEFT or Full-parameter to finetune 400+ LLMs or 100+ MLLMs. (LLM: Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, Gemma2, ...; MLLM: Qwen2-VL, Qwen2-Audio, Llama3.2-Vis...
Created
2023-08-01
1,169 commits to main branch, last one a day ago
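MedicalGPT: Training Your Own Medical GPT Model with ChatGPT Training Pipeline. Trains medical large language models, implementing stages including incremental pre-training (PT), supervised fine-tuning (SFT), RLHF, DPO, and ORPO.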
Created
2023-06-02
538 commits to main branch, last one 21 days ago
A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs).
Created
2023-12-03
171 commits to main branch, last one a day ago
Easy and Efficient Finetuning of LLMs. (Supports LLama, LLama2, LLama3, Qwen, Baichuan, GLM, Falcon.) Efficient quantized training and deployment of large models.
Created
2023-05-25
537 commits to main branch, last one 4 months ago
A deep learning NLP repository that uses TensorFlow to cover everything from text preprocessing to downstream tasks for recent models such as topic models, BERT, GPT, and LLMs.
Created
2021-12-30
259 commits to main branch, last one 2 months ago
Implementation for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs"
Created
2024-06-24
19 commits to main branch, last one 4 months ago
An Efficient "Factory" to Build Multiple LoRA Adapters
Created
2023-08-24
303 commits to main branch, last one 25 days ago
Align Anything: Training All-modality Model with Feedback
Created
2024-07-14
57 commits to main branch, last one 10 days ago
SiLLM simplifies the process of training and running Large Language Models (LLMs) on Apple Silicon by leveraging the MLX framework.
Created
2024-01-11
459 commits to main branch, last one 19 hours ago
Notus is a collection of fine-tuned LLMs using SFT, DPO, SFT+DPO, and/or any other RLHF techniques, while always keeping a data-first approach
Created
2023-11-16
52 commits to main branch, last one 11 months ago
Step-aware Preference Optimization: Aligning Preference with Denoising Performance at Each Step
Created
2024-05-26
2 commits to main branch, last one 4 months ago
Technical analysis library for .NET
Created
2016-06-30
64 commits to master branch, last one 3 years ago
IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation
Created
2024-10-09
10 commits to main branch, last one 19 days ago
An RLHF Infrastructure for Vision-Language Models
Created
2023-12-27
7 commits to main branch, last one 6 days ago
This repository contains the code for SFT, RLHF, and DPO, designed for vision-based LLMs, including the LLaVA models and the LLaMA-3.2-vision models.
Created
2024-06-29
40 commits to master branch, last one about a month ago
CodeUltraFeedback: aligning large language models to coding preferences
Created
2024-01-25
51 commits to main branch, last one 4 months ago
SuperCorrect: Supervising and Correcting Language Models with Error-Driven Insights
Created
2024-10-11
9 commits to main branch, last one about a month ago
🌾 OAT: Online AlignmenT for LLMs
Created
2024-10-15
13 commits to main branch, last one 9 days ago
[NeurIPS 2024] Official code of $\beta$-DPO: Direct Preference Optimization with Dynamic $\beta$
Created
2024-05-22
9 commits to main branch, last one 29 days ago
[ACL 2024] Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning
Created
2024-06-04
9 commits to main branch, last one 3 months ago
Fine-tune large language models with the DPO algorithm; simple and easy to get started.
Created
2024-03-27
16 commits to master branch, last one 4 months ago
Various training, inference, and validation code and results related to open LLMs that were pretrained (fully or partially) on the Dutch language.
Created
2023-07-02
29 commits to main branch, last one 7 months ago