84 results found
- Filter by Primary Language:
- Python (66)
- Jupyter Notebook (6)
- HTML (2)
- TypeScript (2)
- Makefile (1)
- TeX (1)
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
Created
2023-05-28
2,533 commits to main branch, last one 22 hours ago
OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and can retrieve information dynamically to do so.
Created
2022-12-13
3,477 commits to main branch, last one 12 months ago
The official GitHub page for the survey paper "A Survey of Large Language Models".
Created
2023-03-14
139 commits to main branch, last one 4 months ago
Chinese LLaMA-2 & Alpaca-2 LLMs (phase two of the project) with 64K long-context models
Created
2023-07-18
264 commits to main branch, last one 3 months ago
Official release of InternLM2.5 base and chat models. 1M context support
Created
2023-07-06
235 commits to main branch, last one about a month ago
Robust recipes to align language models with human and AI preferences
Created
2023-08-25
106 commits to main branch, last one about a month ago
Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets
Created
2021-04-28
3,603 commits to develop branch, last one 15 days ago
Fine-tuning ChatGLM-6B with PEFT | Efficient ChatGLM fine-tuning based on PEFT
This repository has been archived
Created
2023-04-08
287 commits to main branch, last one about a year ago
A curated list of reinforcement learning from human feedback (RLHF) resources (continually updated)
Created
2023-02-13
70 commits to main branch, last one about a month ago
A Doctor for your data
Created
2023-05-02
32 commits to master branch, last one 11 months ago
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
Created
2023-10-16
780 commits to main branch, last one 12 days ago
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
Created
2023-05-25
595 commits to main branch, last one 8 days ago
WebGLM: An Efficient Web-enhanced Question Answering System (KDD 2023)
Created
2023-05-28
29 commits to main branch, last one 22 days ago
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
Created
2023-05-15
111 commits to main branch, last one 6 months ago
Secrets of RLHF in Large Language Models Part I: PPO
Created
2023-07-05
47 commits to main branch, last one 10 months ago
[NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences for Text-to-image Generation
Created
2023-04-01
42 commits to main branch, last one 3 months ago
Recipes to train reward models for RLHF.
Created
2024-03-21
128 commits to main branch, last one 24 days ago
Xtreme1 is an all-in-one data labeling and annotation platform for multimodal data training that supports 3D LiDAR point clouds, images, and LLMs.
Created
2022-08-08
1,033 commits to main branch, last one about a month ago
[NeurIPS 2024] SimPO: Simple Preference Optimization with a Reference-Free Reward
Created
2024-05-21
78 commits to main branch, last one 2 months ago
A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs).
Created
2023-12-03
227 commits to main branch, last one 3 days ago
Aligning Large Language Models with Human: A Survey
Created
2023-07-23
50 commits to main branch, last one about a year ago
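Cornucopia (聚宝盆): a series of open-source, commercially usable Chinese financial LLMs, together with an efficient, lightweight training framework for vertical-domain LLMs (pretraining, SFT, RLHF, quantization, etc.)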
Created
2023-04-30
9 commits to main branch, last one about a year ago
Easy and efficient fine-tuning of LLMs (supports Llama, Llama 2, Llama 3, Qwen, Baichuan, GLM, Falcon). Efficient quantized training and deployment of large models.
Created
2023-05-25
537 commits to main branch, last one 5 months ago
Implementation of ChatGPT RLHF (Reinforcement Learning from Human Feedback) on any generation model in Hugging Face Transformers (bloomz-176B/bloom/gpt/bart/T5/MetaICL)
Created
2021-03-18
70 commits to main branch, last one 8 months ago
The official implementation of Self-Play Preference Optimization (SPPO)
Created
2024-06-13
27 commits to main branch, last one about a month ago
RewardBench: the first evaluation tool for reward models.
Created
2023-12-23
211 commits to main branch, last one 24 days ago
A recipe for online RLHF and online iterative DPO.
Created
2024-05-10
33 commits to main branch, last one 8 days ago
MindSpore online courses: Step into LLM
Created
2023-03-21
126 commits to master branch, last one 2 months ago
Open Source Application for Advanced LLM Engineering: interact, train, fine-tune, and evaluate large language models on your own computer.
Created
2023-12-24
983 commits to main branch, last one a day ago
pykoi: Active learning in one unified interface
Created
2023-07-14
595 commits to main branch, last one about a year ago