74 results found

3.2k
37.0k
apache-2.0
431
OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieves information dynamically to do so.
Created 2022-12-13
3,477 commits to main branch, last one 10 months ago
4.2k
33.8k
apache-2.0
209
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
Created 2023-05-28
2,349 commits to main branch, last one 2 days ago
811
10.3k
unknown
158
The official GitHub page for the survey paper "A Survey of Large Language Models".
Created 2023-03-14
139 commits to main branch, last one 2 months ago
578
7.1k
apache-2.0
79
Chinese LLaMA-2 & Alpaca-2 LLMs (phase two of the project) + 64K long-context models
Created 2023-07-18
264 commits to main branch, last one about a month ago
454
6.4k
apache-2.0
58
Official release of InternLM2.5 base and chat models. 1M context support
Created 2023-07-06
233 commits to main branch, last one 27 days ago
406
4.7k
apache-2.0
111
Robust recipes to align language models with human and AI preferences
Created 2023-08-25
104 commits to main branch, last one about a month ago
375
3.9k
apache-2.0
31
Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets
Created 2021-04-28
3,514 commits to develop branch, last one 13 hours ago
Efficient fine-tuning of ChatGLM-6B with PEFT
This repository has been archived
Created 2023-04-08
287 commits to main branch, last one about a year ago
211
3.4k
apache-2.0
62
A curated list of reinforcement learning with human feedback resources (continually updated)
Created 2023-02-13
63 commits to main branch, last one 9 days ago
205
3.2k
other
120
A Doctor for your data
Created 2023-05-02
32 commits to master branch, last one 9 months ago
127
1.6k
apache-2.0
16
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
Created 2023-10-16
778 commits to main branch, last one 21 days ago
135
1.6k
apache-2.0
25
WebGLM: An Efficient Web-enhanced Question Answering System (KDD 2023)
Created 2023-05-28
28 commits to main branch, last one about a year ago
241
1.5k
apache-2.0
8
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
Created 2023-05-25
587 commits to main branch, last one 14 days ago
119
1.3k
apache-2.0
17
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
Created 2023-05-15
111 commits to main branch, last one 4 months ago
65
1.2k
apache-2.0
14
[NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences for Text-to-image Generation
Created 2023-04-01
42 commits to main branch, last one about a month ago
140
885
apache-2.0
30
Xtreme1 is an all-in-one data labeling and annotation platform for multimodal data training and supports 3D LiDAR point cloud, image, and LLM.
Created 2022-08-08
1,009 commits to main branch, last one about a month ago
Recipes to train reward models for RLHF.
Created 2024-03-21
90 commits to main branch, last one 3 days ago
45
735
apache-2.0
8
A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs).
Created 2023-12-03
164 commits to main branch, last one 4 days ago
[NeurIPS 2024] SimPO: Simple Preference Optimization with a Reference-Free Reward
Created 2024-05-21
78 commits to main branch, last one 2 days ago
Aligning Large Language Models with Human: A Survey
Created 2023-07-23
50 commits to main branch, last one about a year ago
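Cornucopia (聚宝盆): a series of open-source, commercially usable Chinese financial LLMs, together with an efficient, lightweight training framework for vertical-domain LLMs (pretraining, SFT, RLHF, quantization, etc.)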
Created 2023-04-30
9 commits to main branch, last one about a year ago
63
575
apache-2.0
9
Easy and efficient fine-tuning of LLMs (supports Llama, Llama2, Llama3, Qwen, Baichuan, GLM, Falcon). Efficient quantized training and deployment of large models.
Created 2023-05-25
537 commits to main branch, last one 4 months ago
60
542
mit
11
Implementation of ChatGPT RLHF (Reinforcement Learning from Human Feedback) on any generation model in Hugging Face Transformers (bloomz-176B/bloom/gpt/bart/T5/MetaICL)
Created 2021-03-18
70 commits to main branch, last one 6 months ago
62
490
apache-2.0
28
The official implementation of Self-Play Preference Optimization (SPPO)
Created 2024-06-13
26 commits to main branch, last one 3 months ago
MindSpore online courses: Step into LLM
Created 2023-03-21
126 commits to master branch, last one 13 days ago
50
423
apache-2.0
5
RewardBench: the first evaluation tool for reward models.
Created 2023-12-23
210 commits to main branch, last one 14 days ago
46
410
unknown
18
A recipe for online RLHF and online iterative DPO.
Created 2024-05-10
31 commits to main branch, last one a day ago
pykoi: Active learning in one unified interface
Created 2023-07-14
595 commits to main branch, last one 10 months ago
Open Source Application for Advanced LLM Engineering: interact, train, fine-tune, and evaluate large language models on your own computer.
Created 2023-12-24
904 commits to main branch, last one a day ago
LLM Tuning with PEFT (SFT+RM+PPO+DPO with LoRA)
Created 2023-06-12
28 commits to main branch, last one about a year ago