58 results found

3.2k
36.8k
apache-2.0
422
OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.
Created 2022-12-13
3,477 commits to main branch, last one 4 months ago
2.9k
23.4k
apache-2.0
160
Unify Efficient Fine-Tuning of 100+ LLMs
Created 2023-05-28
1,568 commits to main branch, last one a day ago
710
9.2k
unknown
139
The official GitHub page for the survey paper "A Survey of Large Language Models".
Created 2023-03-14
138 commits to main branch, last one 13 days ago
563
6.9k
apache-2.0
75
Chinese LLaMA-2 & Alpaca-2 LLMs (phase two of the project), with 64K long-context models
Created 2023-07-18
263 commits to main branch, last one about a month ago
383
5.4k
apache-2.0
49
Official release of InternLM2 7B and 20B base and chat models. 200K context support
Created 2023-07-06
218 commits to main branch, last one about a month ago
337
4.0k
apache-2.0
111
Robust recipes to align language models with human and AI preferences
Created 2023-08-25
95 commits to main branch, last one 23 days ago
Fine-tuning ChatGLM-6B with PEFT | Efficient ChatGLM fine-tuning based on PEFT
This repository has been archived
Created 2023-04-08
287 commits to main branch, last one 7 months ago
317
3.2k
apache-2.0
25
Argilla is a collaboration platform for AI engineers and domain experts that require high-quality outputs, full data ownership, and overall efficiency.
Created 2021-04-28
3,039 commits to develop branch, last one a day ago
173
3.0k
other
109
A Doctor for your data
Created 2023-05-02
32 commits to master branch, last one 4 months ago
186
2.9k
apache-2.0
55
A curated list of reinforcement learning with human feedback resources (continually updated)
Created 2023-02-13
57 commits to main branch, last one 5 days ago
133
1.5k
apache-2.0
25
WebGLM: An Efficient Web-enhanced Question Answering System (KDD 2023)
Created 2023-05-28
28 commits to main branch, last one 10 months ago
174
1.2k
apache-2.0
9
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
Created 2023-05-25
498 commits to main branch, last one 24 hours ago
104
1.2k
apache-2.0
17
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
Created 2023-05-15
110 commits to main branch, last one about a month ago
65
1.0k
apache-2.0
12
⚗️ distilabel is a framework for synthetic data and AI feedback for AI engineers that require high-quality outputs, full data ownership, and overall efficiency.
Created 2023-10-16
573 commits to main branch, last one 3 days ago
52
986
apache-2.0
14
[NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences for Text-to-image Generation
Created 2023-04-01
31 commits to main branch, last one 8 months ago
118
750
apache-2.0
29
Xtreme1 is an all-in-one data labeling and annotation platform for multimodal data training and supports 3D LiDAR point cloud, image, and LLM.
Created 2022-08-08
1,002 commits to main branch, last one 2 days ago
Aligning Large Language Models with Human: A Survey
Created 2023-07-23
50 commits to main branch, last one 8 months ago
30
595
apache-2.0
6
A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs).
Created 2023-12-03
111 commits to main branch, last one a day ago
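Cornucopia (聚宝盆): a series of open-source, commercially usable Chinese financial LLMs, along with an efficient, lightweight training framework for vertical-domain LLMs (pretraining, SFT, RLHF, quantization, etc.)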
Created 2023-04-30
9 commits to main branch, last one 11 months ago
59
535
apache-2.0
9
Easy and efficient fine-tuning of LLMs (supports LLama, LLama2, LLama3, Qwen, Baichuan, GLM, Falcon). Efficient quantized training and deployment of large models.
Created 2023-05-25
455 commits to main branch, last one a day ago
65
523
mit
11
Implementation of ChatGPT RLHF (Reinforcement Learning with Human Feedback) on any generation model in Hugging Face Transformers (bloomz-176B/bloom/gpt/bart/T5/MetaICL)
Created 2021-03-18
70 commits to main branch, last one 22 days ago
pykoi: Active learning in one unified interface
Created 2023-07-14
595 commits to main branch, last one 5 months ago
MindSpore online courses: Step into LLM
Created 2023-03-21
91 commits to master branch, last one 2 months ago
LLM Tuning with PEFT (SFT+RM+PPO+DPO with LoRA)
Created 2023-06-12
28 commits to main branch, last one 7 months ago
20
307
unknown
7
SimPO: Simple Preference Optimization with a Reference-Free Reward
Created 2024-05-21
12 commits to main branch, last one 17 hours ago
🛰️ Fine-tuning ChatGLM on real medical dialogue data with LoRA, P-Tuning V2, Freeze, RLHF, and other methods; our vision goes beyond medical Q&A
This repository has been archived
Created 2023-05-04
76 commits to main branch, last one 9 months ago
Recipes to train reward model for RLHF.
Created 2024-03-21
55 commits to main branch, last one 2 days ago
A curated list of Human Preference Datasets for LLM fine-tuning, RLHF, and eval.
Created 2023-05-03
8 commits to main branch, last one 8 months ago
23
228
apache-2.0
4
RewardBench: the first evaluation tool for reward models.
Created 2023-12-23
167 commits to main branch, last one 7 days ago
Chain-of-Hindsight, A Scalable RLHF Method
Created 2023-02-20
17 commits to main branch, last one 8 months ago