84 results found

4.6k
37.3k
apache-2.0
219
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
Created 2023-05-28
2,533 commits to main branch, last one 22 hours ago
3.3k
37.2k
apache-2.0
433
OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and can retrieve information dynamically to do so.
Created 2022-12-13
3,477 commits to main branch, last one 12 months ago
833
10.7k
unknown
159
The official GitHub page for the survey paper "A Survey of Large Language Models".
Created 2023-03-14
139 commits to main branch, last one 4 months ago
580
7.1k
apache-2.0
78
Chinese LLaMA-2 & Alpaca-2 LLMs (phase two of the project), with 64K long-context models
Created 2023-07-18
264 commits to main branch, last one 3 months ago
464
6.6k
apache-2.0
59
Official release of InternLM2.5 base and chat models. 1M context support
Created 2023-07-06
235 commits to main branch, last one about a month ago
420
4.9k
apache-2.0
111
Robust recipes to align language models with human and AI preferences
Created 2023-08-25
106 commits to main branch, last one about a month ago
389
4.2k
apache-2.0
32
Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets
Created 2021-04-28
3,603 commits to develop branch, last one 15 days ago
Efficient fine-tuning of ChatGLM-6B with PEFT
This repository has been archived
Created 2023-04-08
287 commits to main branch, last one about a year ago
219
3.6k
apache-2.0
61
A curated list of reinforcement learning with human feedback resources (continually updated)
Created 2023-02-13
70 commits to main branch, last one about a month ago
209
3.3k
other
125
A Doctor for your data
Created 2023-05-02
32 commits to master branch, last one 11 months ago
147
1.8k
apache-2.0
17
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
Created 2023-10-16
780 commits to main branch, last one 12 days ago
249
1.6k
apache-2.0
8
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
Created 2023-05-25
595 commits to main branch, last one 8 days ago
137
1.6k
apache-2.0
25
WebGLM: An Efficient Web-enhanced Question Answering System (KDD 2023)
Created 2023-05-28
29 commits to main branch, last one 22 days ago
119
1.4k
apache-2.0
18
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
Created 2023-05-15
111 commits to main branch, last one 6 months ago
101
1.3k
apache-2.0
35
Secrets of RLHF in Large Language Models Part I: PPO
Created 2023-07-05
47 commits to main branch, last one 10 months ago
65
1.2k
apache-2.0
15
[NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences for Text-to-image Generation
Created 2023-04-01
42 commits to main branch, last one 3 months ago
76
1.1k
apache-2.0
21
Recipes to train reward models for RLHF.
Created 2024-03-21
128 commits to main branch, last one 24 days ago
151
929
apache-2.0
31
Xtreme1 is an all-in-one data labeling and annotation platform for multimodal data training and supports 3D LiDAR point cloud, image, and LLM.
Created 2022-08-08
1,033 commits to main branch, last one about a month ago
[NeurIPS 2024] SimPO: Simple Preference Optimization with a Reference-Free Reward
Created 2024-05-21
78 commits to main branch, last one 2 months ago
48
772
apache-2.0
7
A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs).
Created 2023-12-03
227 commits to main branch, last one 3 days ago
Aligning Large Language Models with Human: A Survey
Created 2023-07-23
50 commits to main branch, last one about a year ago
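Cornucopia (聚宝盆): a series of open-source, commercially usable Chinese financial LLMs, together with an efficient, lightweight training framework for vertical-domain LLMs (pretraining, SFT, RLHF, quantization, etc.)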
Created 2023-04-30
9 commits to main branch, last one about a year ago
63
586
apache-2.0
9
Easy and efficient fine-tuning of LLMs (supports LLama, LLama2, LLama3, Qwen, Baichuan, GLM, Falcon). Efficient quantized training and deployment of large models.
Created 2023-05-25
537 commits to main branch, last one 5 months ago
59
547
mit
11
Implementation of ChatGPT RLHF (Reinforcement Learning with Human Feedback) on any generation model in Hugging Face Transformers (bloomz-176B/bloom/gpt/bart/T5/MetaICL)
Created 2021-03-18
70 commits to main branch, last one 8 months ago
62
512
apache-2.0
28
The official implementation of Self-Play Preference Optimization (SPPO)
Created 2024-06-13
27 commits to main branch, last one about a month ago
56
474
apache-2.0
5
RewardBench: the first evaluation tool for reward models.
Created 2023-12-23
211 commits to main branch, last one 24 days ago
51
471
unknown
19
A recipe for online RLHF and online iterative DPO.
Created 2024-05-10
33 commits to main branch, last one 8 days ago
MindSpore online courses: Step into LLM
Created 2023-03-21
126 commits to master branch, last one 2 months ago
Open Source Application for Advanced LLM Engineering: interact, train, fine-tune, and evaluate large language models on your own computer.
Created 2023-12-24
983 commits to main branch, last one a day ago
pykoi: Active learning in one unified interface
Created 2023-07-14
595 commits to main branch, last one about a year ago