14 results found
Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Basically ChatGPT but with PaLM
Created 2022-12-09
128 commits to main branch, last one 9 months ago
A curated list of reinforcement learning with human feedback resources (continually updated)
Created 2023-02-13
63 commits to main branch, last one 9 days ago
Open-source pre-training implementation of Google's LaMDA in PyTorch. Adding RLHF similar to ChatGPT.
This repository has been archived
Created 2022-06-21
88 commits to main branch, last one 8 months ago
Let's build better datasets, together!
Created 2024-03-11
113 commits to main branch, last one 3 months ago
Implementation of Reinforcement Learning from Human Feedback (RLHF)
Created 2022-12-28
76 commits to main branch, last one about a year ago
[CVPR 2024] Code for the paper "Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model"
Created 2023-11-23
37 commits to main branch, last one 7 months ago
The ParroT framework to enhance and regulate the Translation Abilities during Chat based on open-sourced LLMs (e.g., LLaMA-7b, Bloomz-7b1-mt) and human-written translation and evaluation data.
Created 2023-03-22
177 commits to master branch, last one about a year ago
Product analytics for AI Assistants
Created 2022-01-19
939 commits to main branch, last one 5 months ago
BeaverTails is a collection of datasets designed to facilitate research on safety alignment in large language models (LLMs).
Created 2023-06-14
3 commits to main branch, last one about a year ago
Dataset Viber is your chill repo for data collection, annotation and vibe checks.
Created 2024-08-07
181 commits to main branch, last one 2 months ago
The Prism Alignment Project
Created 2024-03-06
12 commits to main branch, last one 6 months ago
[ECCV2024] Towards Reliable Advertising Image Generation Using Human Feedback
Created 2024-07-04
41 commits to main branch, last one 2 months ago
[ICML 2024] Code for the paper "Confronting Reward Overoptimization for Diffusion Models: A Perspective of Inductive and Primacy Biases"
Created 2024-05-19
6 commits to main branch, last one 3 months ago
Code for the paper "Aligning LLM Agents by Learning Latent Preference from User Edits".
Created 2024-04-25
14 commits to main branch, last one 16 days ago