78 results found
- Filter by Primary Language:
- Python (60)
- Jupyter Notebook (6)
- HTML (2)
- TypeScript (2)
- Makefile (1)
- TeX (1)
OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.
Created
2022-12-13
3,477 commits to main branch, last one 11 months ago
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
Created
2023-05-28
2,428 commits to main branch, last one 4 hours ago
The official GitHub page for the survey paper "A Survey of Large Language Models".
Created
2023-03-14
139 commits to main branch, last one 3 months ago
Chinese LLaMA-2 & Alpaca-2 LLMs (phase two of the project), with 64K long-context models
Created
2023-07-18
264 commits to main branch, last one 2 months ago
Official release of InternLM2.5 base and chat models. 1M context support
Created
2023-07-06
235 commits to main branch, last one 12 days ago
Robust recipes to align language models with human and AI preferences
Created
2023-08-25
106 commits to main branch, last one 12 days ago
Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets
Created
2021-04-28
3,579 commits to develop branch, last one 9 hours ago
Efficient fine-tuning of ChatGLM-6B with PEFT
This repository has been archived
Created
2023-04-08
287 commits to main branch, last one about a year ago
A curated list of reinforcement learning with human feedback resources (continually updated)
Created
2023-02-13
69 commits to main branch, last one 2 days ago
A Doctor for your data
Created
2023-05-02
32 commits to master branch, last one 10 months ago
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
Created
2023-10-16
778 commits to main branch, last one about a month ago
WebGLM: An Efficient Web-enhanced Question Answering System (KDD 2023)
Created
2023-05-28
28 commits to main branch, last one about a year ago
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
Created
2023-05-25
589 commits to main branch, last one 22 days ago
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
Created
2023-05-15
111 commits to main branch, last one 5 months ago
[NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences for Text-to-image Generation
Created
2023-04-01
42 commits to main branch, last one 2 months ago
Recipes to train reward models for RLHF.
Created
2024-03-21
124 commits to main branch, last one 14 days ago
Xtreme1 is an all-in-one data labeling and annotation platform for multimodal data training and supports 3D LiDAR point cloud, image, and LLM.
Created
2022-08-08
1,033 commits to main branch, last one 15 days ago
A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs).
Created
2023-12-03
171 commits to main branch, last one 13 days ago
[NeurIPS 2024] SimPO: Simple Preference Optimization with a Reference-Free Reward
Created
2024-05-21
78 commits to main branch, last one 28 days ago
Aligning Large Language Models with Human: A Survey
Created
2023-07-23
50 commits to main branch, last one about a year ago
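Cornucopia (聚宝盆): a series of open-source, commercially usable Chinese financial LLMs, together with an efficient, lightweight training framework for vertical-domain LLMs (pretraining, SFT, RLHF, quantization, etc.)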
Created
2023-04-30
9 commits to main branch, last one about a year ago
Easy and efficient fine-tuning of LLMs (supports Llama, Llama 2, Llama 3, Qwen, Baichuan, GLM, Falcon). Efficient quantized training and deployment of large models.
Created
2023-05-25
537 commits to main branch, last one 4 months ago
Implementation of ChatGPT RLHF (Reinforcement Learning with Human Feedback) on any generation model in Hugging Face Transformers (bloomz-176B/bloom/gpt/bart/T5/MetaICL)
Created
2021-03-18
70 commits to main branch, last one 6 months ago
The official implementation of Self-Play Preference Optimization (SPPO)
Created
2024-06-13
27 commits to main branch, last one 10 days ago
RewardBench: the first evaluation tool for reward models.
Created
2023-12-23
210 commits to main branch, last one about a month ago
A recipe for online RLHF and online iterative DPO.
Created
2024-05-10
32 commits to main branch, last one 25 days ago
MindSpore online courses: Step into LLM
Created
2023-03-21
126 commits to master branch, last one about a month ago
Open Source Application for Advanced LLM Engineering: interact, train, fine-tune, and evaluate large language models on your own computer.
Created
2023-12-24
939 commits to main branch, last one a day ago
pykoi: Active learning in one unified interface
Created
2023-07-14
595 commits to main branch, last one 11 months ago
LLM Tuning with PEFT (SFT+RM+PPO+DPO with LoRA)
Created
2023-06-12
28 commits to main branch, last one about a year ago