Search Results - RepositoryStats

LLaMA-Factory hiyouga

4.6k

37.3k

apache-2.0

219

Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)

Created 2023-05-28

2,533 commits to main branch, last one 22 hours ago

Open-Assistant LAION-AI

3.3k

37.2k

apache-2.0

433

OpenAssistant is a chat-based assistant that understands tasks, can interact with third-party systems, and retrieve information dynamically to do so.

ai rlhf nextjs python chatgpt assistant discord-bot language-model machine-learning

Created 2022-12-13

3,477 commits to main branch, last one 12 months ago

LLMSurvey RUCAIBox

833

10.7k

unknown

159

The official GitHub page for the survey paper "A Survey of Large Language Models".

llm llms rlhf chatgpt pre-training chain-of-thought instruction-tuning in-context-learning large-language-models natural-language-processing pre-trained-language-models

Created 2023-03-14

139 commits to main branch, last one 4 months ago

Chinese-LLaMA-Alpaca-2 ymcui

580

7.1k

apache-2.0

78

Phase two of the Chinese LLaMA-2 & Alpaca-2 LLM project, with 64K long-context models

64k llm nlp rlhf yarn llama alpaca llama2 alpaca2 llama-2 alpaca-2 flash-attention large-language-models

Created 2023-07-18

264 commits to main branch, last one 3 months ago

InternLM InternLM

464

6.6k

apache-2.0

59

Official release of InternLM2.5 base and chat models. 1M context support

gpt llm rlhf chatbot chinese long-context fine-tuning-llm flash-attention pretrained-models large-language-model

Created 2023-07-06

235 commits to main branch, last one about a month ago

alignment-handbook huggingface

420

4.9k

apache-2.0

111

Robust recipes to align language models with human and AI preferences

llm rlhf transformers

Created 2023-08-25

106 commits to main branch, last one about a month ago

argilla argilla-io

389

4.2k

apache-2.0

32

Argilla is a collaboration tool for AI engineers and domain experts to build high-quality datasets

ai llm nlp rlhf gpt-4 mlops langchain text-labeling active-learning annotation-tool developer-tools text-annotation machine-learning weak-supervision human-in-the-loop weakly-supervised-learning natural-language-processing

Created 2021-04-28

3,603 commits to develop branch, last one 15 days ago

ChatGLM-Efficient-Tuning hiyouga

473

3.7k

apache-2.0

32

Fine-tuning ChatGLM-6B with PEFT | Efficient ChatGLM fine-tuning based on PEFT

lora peft rlhf qlora alpaca chatglm chatgpt pytorch chatglm2 fine-tuning huggingface transformers language-model

This repository has been archived

Created 2023-04-08

287 commits to main branch, last one about a year ago

awesome-RLHF opendilab

219

3.6k

apache-2.0

61

A curated list of reinforcement learning with human feedback resources (continually updated)

rlhf deep-learning human-feedback large-language-models reinforcement-learning deep-reinforcement-learning

Created 2023-02-13

70 commits to main branch, last one about a month ago

docta Docta-ai

209

3.3k

other

125

A Doctor for your data

data rlhf data-curation data-diagnosis language-model data-centric-ai data-centric-machine-learning

Created 2023-05-02

32 commits to master branch, last one 11 months ago

distilabel argilla-io

147

1.8k

apache-2.0

17

Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.

ai llms rlhf rlaif openai python huggingface synthetic-data synthetic-dataset-generation

Created 2023-10-16

780 commits to main branch, last one 12 days ago

alpaca_eval tatsu-lab

249

1.6k

apache-2.0

8

An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.

nlp rlhf evaluation leaderboard deep-learning foundation-models instruction-following large-language-models

Created 2023-05-25

595 commits to main branch, last one 8 days ago

WebGLM THUDM

137

1.6k

apache-2.0

25

WebGLM: An Efficient Web-enhanced Question Answering System (KDD 2023)

llm rlhf webglm chatgpt

Created 2023-05-28

29 commits to main branch, last one 22 days ago

safe-rlhf PKU-Alignment

119

1.4k

apache-2.0

18

Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback

Created 2023-05-15

111 commits to main branch, last one 6 months ago

MOSS-RLHF OpenLMLab

101

1.3k

apache-2.0

35

Secrets of RLHF in Large Language Models Part I: PPO

rlhf ai-safety alignment

Created 2023-07-05

47 commits to main branch, last one 10 months ago

ImageReward THUDM

65

1.2k

apache-2.0

15

[NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences for Text-to-image Generation

rlhf diffusion-models generative-model human-preferences

Created 2023-04-01

42 commits to main branch, last one 3 months ago

RLHF-Reward-Modeling RLHFlow

76

1.1k

apache-2.0

21

Recipes for training reward models for RLHF.

llm rlhf llama3 reward-models

Created 2024-03-21

128 commits to main branch, last one 24 days ago

xtreme1 xtreme1-io

151

929

apache-2.0

31

Xtreme1 is an all-in-one data labeling and annotation platform for multimodal data training and supports 3D LiDAR point cloud, image, and LLM.

rlhf annotation multimodal point-cloud 3d-annotation labeling-tool annotation-tool computer-vision image-annotation lidar-annotation lidar-camera-fusion image-classification image-labelling-tool lidar-object-tracking lidar-object-detection

Created 2022-08-08

1,033 commits to main branch, last one about a month ago

SimPO princeton-nlp

52

781

mit

9

[NeurIPS 2024] SimPO: Simple Preference Optimization with a Reference-Free Reward

rlhf alignment preference-alignment large-language-models

Created 2024-05-21

78 commits to main branch, last one 2 months ago

HALOs ContextualAI

48

772

apache-2.0

7

A library with extensible implementations of DPO, KTO, PPO, ORPO, and other human-aware loss functions (HALOs).

dpo kto ppo rlhf halos alignment

Created 2023-12-03

227 commits to main branch, last one 3 days ago

AlignLLMHumanSurvey GaryYufei

31

709

unknown

31

Aligning Large Language Models with Human: A Survey

llms rlhf gpt-4 llama llama2 survey awesome chatgpt chinese-llama large-language-models supervised-finetuning

Created 2023-07-23

50 commits to main branch, last one about a year ago

Cornucopia-LLaMA-Fin-Chinese jerry1993-tech

63

602

apache-2.0

5

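Cornucopia (聚宝盆): a series of open-source, commercially usable Chinese financial LLMs, together with an efficient, lightweight training framework for vertical-domain LLMs (pretraining, SFT, RLHF, quantization, etc.)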

qa nlp sft rlhf llama chinese finance transformers text-generation large-language-models

Created 2023-04-30

9 commits to main branch, last one about a year ago

LLamaTuner jianzhnie

63

586

apache-2.0

9

Easy and efficient fine-tuning of LLMs (supports LLama, LLama2, LLama3, Qwen, Baichuan, GLM, Falcon). Efficient quantized training and deployment of large models.

dpo ppo qwen rlhf llama qlora llama3 chatgpt mixtral

Created 2023-05-25

537 commits to main branch, last one 5 months ago

TextRL voidful

59

547

mit

11

Implementation of ChatGPT RLHF (Reinforcement Learning with Human Feedback) on any generation model in huggingface's transformers (bloomz-176B/bloom/gpt/bart/T5/MetaICL)

nlg nlp rlhf gpt-2 gpt-3 chatgpt pytorch controlled-nlg language-model reinforcement-learning

Created 2021-03-18

70 commits to main branch, last one 8 months ago